The Influence of Reference Corpus Size on Wordsmith Tools Keywords Extraction
Keywords:
WordSmith Tools, KeyWords, Corpus Linguistics, reference corpus size.
Abstract
A KeyWords analysis (using WordSmith Tools) enables the discovery of lexical items that reveal the main lexical sets in a text or corpus. Such an analysis requires that the corpus the researcher intends to describe (the study corpus) be compared with a reference corpus. This paper presents a mathematical method for determining the influence of reference corpus size on the number of key words extracted by the program. The results reveal that a reference corpus at least five times as large as the study corpus yields a number of key words that is statistically equivalent to the number obtained with larger reference corpora, which suggests five times the size of the study corpus as the minimum size for a reference corpus.
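To make the comparison between study and reference corpus concrete, the following is a minimal sketch of a keyness calculation in Python. It is not WordSmith Tools code and not the method of the paper: it assumes Dunning's log-likelihood as the keyness statistic (one of the statistics commonly used for this purpose), a hypothetical cut-off of 6.63 (roughly p < 0.01 with one degree of freedom), and pre-tokenized, non-empty input lists. The names `log_likelihood` and `key_words` are illustrative only.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) for a word that occurs a times in a study corpus
    of c tokens and b times in a reference corpus of d tokens."""
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def key_words(study_tokens, reference_tokens, threshold=6.63):
    """Return the words that are unusually frequent in the study corpus.

    The threshold is an assumption for illustration, not a value taken
    from the paper or from WordSmith Tools defaults.
    """
    study = Counter(study_tokens)
    reference = Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    keys = []
    for word, a in study.items():
        b = reference.get(word, 0)
        # Keep only positive key words: relatively more frequent in the study corpus.
        if a / c > b / d and log_likelihood(a, b, c, d) >= threshold:
            keys.append(word)
    return sorted(keys)


# Toy data: the reference corpus is ten times the size of the study corpus.
study = ["corpus"] * 50 + ["the"] * 50
reference = ["the"] * 500 + ["of"] * 500
print(key_words(study, reference))
```

Under these assumptions, the effect of reference corpus size could be explored by calling `key_words` with reference samples at increasing multiples of the study corpus size and comparing how many key words each run returns.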
Section
Papers
License
The authors grant the journal all copyrights relating to the published works. The concepts issued in signed articles are the absolute and exclusive responsibility of their authors.
This work is licensed under a License