The Compilation of an English Corpus of Biology: some remarks on scientific vocabulary

Authors

  • Purificación Sánchez Hernández

Keywords:

corpus, biology, scientific vocabulary

Abstract

This paper describes the compilation of a corpus of Biology texts designed as a whole but composed of different sub-disciplines. The corpus was compiled taking into account the credits assigned to the different sub-areas in university Biology degrees, as well as the scientific and social impact of each subject. We gathered a corpus of 2,500,000 words, with 84% of the texts taken from scientific journals and 16% from books. Likewise, 70% of the texts came from American sources and 30% from British ones. Initially we focused on the lexical aspects of the corpus. First, we show the utility of our specialised corpus as compared with a general English one. After compiling the texts, we extracted the 150 most frequent words of each file. As expected, the highest frequencies correspond to grammatical forms. A selection of the technical and sub-technical terms in these files revealed that the highest lexical density is found in books. We also show that the proposed design serves its purpose as far as lexical terms are concerned.

Section

Papers