Most common words in English

Studies that estimate and rank the most common words in English examine texts written in English. Perhaps the most comprehensive such analysis is one that was conducted against the Oxford English Corpus (OEC), a very large collection of texts from around the world that are written in the English language. A text corpus is a large collection of written works that are organised in a way that makes such analysis easier.

In total, the texts in the Oxford English Corpus contain more than 2 billion words.[1] The OEC includes a wide variety of writing samples, such as literary works, novels, academic journals, newspapers, magazines, Hansard's Parliamentary Debates, blogs, chat logs, and emails.[2]

Another English corpus that has been used to study word frequency is the Brown Corpus, which was compiled by researchers at Brown University in the 1960s. The researchers published their analysis of the Brown Corpus in 1967. Their findings were similar, but not identical, to the findings of the OEC analysis.

According to The Reading Teacher's Book of Lists, the first 25 words in the OEC make up about one-third of all printed material in English, and the first 100 words make up about half of all written English.[3] According to a study cited by Robert McCrum in The Story of English, all of the first hundred of the most common words in English are of Anglo-Saxon origin,[4] except for "people", ultimately from Latin "populus", and "because", in part from Latin "causa".
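The coverage figures cited above (about one-third for the first 25 words, about half for the first 100) are cumulative shares of all word tokens in a body of text. The following minimal Python sketch shows how such a share could be computed. The `coverage` function, the simple regular-expression tokeniser, and the toy sentence are illustrative assumptions, not the method used in the cited studies.

```python
from collections import Counter
import re


def coverage(text: str, top_n: int) -> float:
    """Share of all word tokens accounted for by the top_n most frequent words."""
    # Assumption: a crude tokeniser that keeps runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


# Toy text for illustration only; a real study would use a large corpus.
sample = "the cat sat on the mat and the dog sat on the rug"
print(f"top 1 word covers {coverage(sample, 1):.0%} of tokens")
print(f"top 3 words cover {coverage(sample, 3):.0%} of tokens")
```

On a real corpus the same calculation would be run over billions of tokens, and the result would depend on how the text is tokenised and whether word forms are grouped together.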

Some lists of common words distinguish between word forms, while others rank all forms of a word as a single lexeme, represented by its lemma (the form of the word as it would appear in a dictionary). For example, the lexeme be (as in to be) comprises all its conjugations (is, was, am, are, were, etc.) and contractions of those conjugations.[5] The top 100 lemmas listed below account for 50% of all the words in the Oxford English Corpus.[1]
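The difference between counting word forms and counting lemmas can be illustrated with a short Python sketch. The `LEMMA_OF` lookup table below is a hypothetical, hand-written mapping that covers only a few forms of "be". A real analysis would rely on a full lemmatiser, and the cited corpora may group forms differently.

```python
from collections import Counter

# Hypothetical lookup covering only some forms of "be" (illustration only).
LEMMA_OF = {
    "be": "be", "is": "be", "was": "be",
    "am": "be", "are": "be", "were": "be",
}


def count_forms(tokens):
    """Count each distinct word form separately."""
    return Counter(tokens)


def count_lemmas(tokens):
    """Count tokens after mapping known forms to their lemma."""
    return Counter(LEMMA_OF.get(token, token) for token in tokens)


tokens = "i am here you are there she is here they were there".split()

print(count_forms(tokens).most_common(3))
print(count_lemmas(tokens).most_common(3))
```

Counted by form, no conjugation of "be" appears more than once in this toy sentence. Counted by lemma, "be" becomes the most frequent item, which is why lemma-based and form-based lists can rank the same words differently.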

100 most common words

A list of the 100 words that occur most frequently in written English is given below, based on an analysis of the Oxford English Corpus (a collection of texts in the English language, comprising over 2 billion words).[1] A part of speech is provided for most of the words, but part-of-speech categories vary between analyses, and not all possibilities are listed. For example, "I" may be a pronoun or a Roman numeral; "to" may be a preposition or an infinitive marker; "time" may be a noun or a verb. Also, a single spelling can represent more than one root word. For example, "singer" may be a form of either "sing" or "singe". Different corpora may treat such cases differently.

The number of distinct senses that are listed in Wiktionary is shown in the Polysemy column. For example, "out" can refer to an escape, a removal from play in baseball, or any of 36 other concepts. On average, each word in the list has 15.38 senses. The sense count does not include the use of terms in phrasal verbs such as "eat out" (chastise) and other multiword expressions such as the interjection "get out!", where the word "out" does not have an individual meaning.[6] As an example, "out" occurs in at least 560 phrasal verbs[7] and appears in nearly 1700 multiword expressions.

The table also includes rankings from other corpora. Note that, in addition to differences in usage, lemmatisation may differ from corpus to corpus, for example by splitting the prepositional use of "to" from its use as a particle. The COCA list also uses dispersion as well as frequency to calculate rank.

Word | Parts of speech | OEC rank | COCA rank[8] | Dolch level | Polysemy
the | Article | 1 | 1 | Pre-primer | 12
be | Verb | 2 | 2 | Primer | 21
to | Preposition | 3 | 7, 9 | Pre-primer | 17
of | Preposition | 4 | 4 | Grade 1 | 12
and | Conjunction | 5 | 3 | Pre-primer | 16
a | Article | 6 | 5 | Pre-primer | 20
in | Preposition | 7 | 6, 128, 3038 | Pre-primer | 23
that | Conjunction et al. | 8 | 12, 27, 903 | Primer | 17
have | Verb | 9 | 8 | Primer | 25
I | Pronoun | 10 | 11 | Pre-primer | 7
it | Pronoun | 11 | 10 | Pre-primer | 18
for | Preposition | 12 | 13, 2339 | Pre-primer | 19
not | Adverb et al. | 13 | 28, 2929 | Pre-primer | 5
on | Preposition | 14 | 17, 155 | Primer | 43
with | Preposition | 15 | 16 | Primer | 11
he | Pronoun | 16 | 15 | Primer | 7
as | Adverb, conjunction, et al. | 17 | 33, 49, 129 | Grade 1 | 17
you | Pronoun | 18 | 14 | Pre-primer | 9
do | Verb, noun | 19 | 18 | Primer | 38
at | Preposition | 20 | 22 | Primer | 14
this | Determiner, adverb, noun | 21 | 20, 4665 | Primer | 9
but | Preposition, adverb, conjunction | 22 | 23, 1715 | Primer | 17
his | Possessive pronoun | 23 | 25, 1887 | Grade 1 | 6
by | Preposition | 24 | 30, 1190 | Grade 1 | 19
from | Preposition | 25 | 26 | Grade 1 | 4
they | Pronoun | 26 | 21 | Primer | 6
we | Pronoun | 27 | 24 | Pre-primer | 6
say | Verb et al. | 28 | 19 | Primer | 17
her | Possessive pronoun | 29, 106 | 42 | Grade 1 | 3
she | Pronoun | 30 | 31 | Primer | 7
or | Conjunction | 31 | 32 | Grade 2 | 11
an | Article | 32 | (a) | Grade 1 | 6
will | Verb, noun | 33 | 48, 1506 | Primer | 16
my | Possessive pronoun | 34 | 44 | Pre-primer | 5
one | Noun, adjective, et al. | 35 | 51, 104, 839 | Pre-primer | 24
all | Adjective | 36 | 43, 222 | Primer | 15
would | Verb | 37 | 41 | Grade 2 | 13
there | Adverb, pronoun, et al. | 38 | 53, 116 | Primer | 14
their | Possessive pronoun | 39 | 36 | Grade 2 | 2
what | Pronoun, adverb, et al. | 40 | 34 | Primer | 19
so | Conjunction, adverb, et al. | 41 | 55, 196 | Primer | 18
up | Adverb, preposition, et al. | 42 | 50, 456 | Pre-primer | 50
out | Preposition | 43 | 64, 149 | Primer | 38
if | Conjunction | 44 | 40 | Grade 3 | 9
about | Preposition, adverb, et al. | 45 | 46, 179 | Grade 3 | 18
who | Pronoun, noun | 46 | 38 | Primer | 5
get | Verb | 47 | 39 | Primer | 37
which | Pronoun | 48 | 58 | Grade 2 | 7
go | Verb, noun | 49 | 35 | Pre-primer | 54
me | Pronoun | 50 | 61 | Pre-primer | 10
when | Adverb | 51 | 57, 136 | Grade 1 | 11
make | Verb, noun | 52 | 45 | Grade 2 [as "made"] | 48
can | Verb, noun | 53 | 37, 2973 | Pre-primer | 18
like | Preposition, verb | 54 | 74, 208, 1123, 1684, 2702 | Primer | 26
time | Noun | 55 | 52 | Dolch list of 95 nouns | 14
no | Determiner, adverb | 56 | 93, 699, 916, 1111, 4555 | Primer | 10
just | Adjective | 57 | 66, 1823 | | 14
him | Pronoun | 58 | 68 | | 5
know | Verb, noun | 59 | 47 | | 13
take | Verb, noun | 60 | 63 | | 66
people | Noun | 61 | 62 | | 9
into | Preposition | 62 | 65 | | 10
year | Noun | 63 | 54 | | 7
your | Possessive pronoun | 64 | 69 | | 4
good | Adjective | 65 | 110, 2280 | | 32
some | Determiner, pronoun | 66 | 60 | | 10
could | Verb | 67 | 71 | | 6
them | Pronoun | 68 | 59 | | 3
see | Verb | 69 | 67 | | 25
other | Adjective, pronoun | 70 | 75, 715, 2355 | | 12
than | Conjunction, preposition | 71 | 73, 712 | | 4
then | Adverb | 72 | 77 | | 10
now | Preposition | 73 | 72, 1906 | | 13
look | Verb | 74 | 85, 604 | | 17
only | Adverb | 75 | 101, 329 | | 11
come | Verb | 76 | 70 | | 20
its | Possessive pronoun | 77 | 78 | | 2
over | Preposition | 78 | 124, 182 | | 19
think | Verb | 79 | 56 | | 10
also | Adverb | 80 | 87 | | 2
back | Noun, adverb | 81 | 108, 323, 1877 | | 36
after | Preposition | 82 | 120, 260 | | 14
use | Verb, noun | 83 | 92, 429 | | 17
two | Noun | 84 | 80 | | 6
how | Adverb | 85 | 76 | | 11
our | Possessive pronoun | 86 | 79 | | 3
work | Verb, noun | 87 | 117, 199 | | 28
first | Adjective | 88 | 86, 2064 | | 10
well | Adverb | 89 | 100, 644 | | 30
way | Noun, adverb | 90 | 84, 4090 | | 16
even | Adjective | 91 | 107, 484 | | 23
new | Adjective et al. | 92 | 88 | | 18
want | Verb | 93 | 83 | | 10
because | Conjunction | 94 | 89, 509 | | 7
any | Pronoun | 95 | 109, 4720 | | 4
these | Pronoun | 96 | 82 | | 2
give | Verb | 97 | 98 | | 19
day | Noun | 98 | 90 | | 9
most | Adverb | 99 | 144, 187 | | 12
us | Pronoun | 100 | 113 | | 6
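
Because COCA splits some words into several entries, a word in the table above can carry more than one COCA rank. One possible way to compare the two rankings is to take the best (lowest) COCA rank for each word. The Python sketch below does this for a few rows copied from the table. The `best_rank` helper and the choice of the lowest rank are assumptions made for illustration, not a convention of either corpus.

```python
def best_rank(cell: str) -> int:
    """Return the lowest (best) rank from a cell such as "7, 9"."""
    return min(int(part) for part in cell.split(","))


# (word, OEC rank, COCA rank cell) copied from a few rows of the table.
rows = [
    ("the", 1, "1"),
    ("to", 3, "7, 9"),
    ("and", 5, "3"),
    ("say", 28, "19"),
    ("like", 54, "74, 208, 1123, 1684, 2702"),
]

for word, oec, coca_cell in rows:
    coca = best_rank(coca_cell)
    # A positive difference means the word ranks lower in COCA than in the OEC.
    print(f"{word}: OEC {oec}, COCA {coca}, difference {coca - oec:+d}")
```

Taking the lowest rank ignores the frequency contributed by the other COCA entries for the same spelling, so the comparison is only approximate.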

Parts of speech

The following is a very similar list, subdivided by part of speech.[1] The list labelled "Others" includes pronouns, possessives, articles, modal verbs, adverbs, and conjunctions.

Rank | Nouns | Verbs | Adjectives | Prepositions | Others
1 | time | be | good | to | the
2 | person | have | new | of | and
3 | year | do | first | in | a
4 | way | say | last | for | that
5 | day | get | long | on | I
6 | thing | make | great | with | it
7 | man | go | little | at | not
8 | world | know | own | by | he
9 | life | take | other | from | as
10 | hand | see | old | up | you
11 | part | come | right | about | this
12 | child | think | big | into | but
13 | eye | look | high | over | his
14 | woman | want | different | after | they
15 | place | give | small | | her
16 | work | use | large | | she
17 | week | find | next | | or
18 | case | tell | early | | an
19 | point | ask | young | | will
20 | government | work | important | | my
21 | company | seem | few | | one
22 | number | feel | public | | all
23 | group | try | bad | | would
24 | problem | leave | same | | there
25 | fact | call | able | | their

See also

Word lists

References

  1. "The Oxford English Corpus: Facts about the language". OxfordDictionaries.com. Oxford University Press. What is the commonest word?. Archived from the original on December 26, 2011. Retrieved June 22, 2011.
  2. "The Oxford English Corpus". AskOxford.com. Retrieved June 22, 2006.
  3. The First 100 Most Commonly Used English Words Archived 2013-06-16 at the Wayback Machine.
  4. Bill Bryson, The Mother Tongue: English and How It Got That Way, Harper Perennial, 2001, page 58
  5. Benjamin Zimmer. June 22, 2006. Time after time after time.... Language Log. Retrieved June 22, 2006.
  6. Benjamin, Martin (2019). "Polysemy in top 100 Oxford English Corpus words within Wiktionary". Teach You Backwards. Retrieved December 28, 2019.
  7. Garcia-Vega, M (2010). "Teasing out the meaning of "out"". 29th International Conference on Lexis and Grammar.
  8. "Word frequency: based on 450 million word COCA corpus". www.wordfrequency.info. Retrieved 11 April 2018.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.