they are the most frequently used letters - as most of the code-breakers below have appreciated


Now, hold it here a second, folks. It just occurs to me that I've always blithely accepted statements about "the most frequently used letters" and, very closely related, "the most commonly used words" in the English language.

How are these "facts" known??

OK, in a particular context for a certain period of time you can make a broad-sweep judgement. Also it's a lot easier these days to produce accurate mathematical counts for a number of documents, and then extrapolate from there.
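
For what it's worth, the counting part is easy enough to sketch. Below is a minimal Python example, assuming plain English text and counting only the letters a to z. The function name letter_frequencies and the sample message are made up for illustration; this is not how any published frequency table was actually compiled.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, percentage) pairs, most frequent first."""
    # Fold to lower case and keep only the 26 basic letters;
    # digits, spaces, punctuation and accented characters are ignored.
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    total = len(letters)
    if total == 0:
        return []
    counts = Counter(letters)
    return [(letter, 100 * n / total) for letter, n in counts.most_common()]

# Hypothetical sample message, far too short to say anything about English as a whole.
sample = "Attack at dawn. Retreat at dusk."
for letter, pct in letter_frequencies(sample)[:5]:
    print(f"{letter}: {pct:.1f}%")
```

Any figures this produces describe only the text fed in. Run it over telegrams, slang or military shorthand and the ranking could well come out differently.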

But but but... as with all statistics, there are no absolutes. In many contexts, for instance, articles will be omitted, words abbreviated and slang introduced. Surely this makes a significant difference to word and letter counts. And another thing: language evolves over time. New words and phrases appear, and old ones fade away. Sometimes (as we know) this happens very quickly.

Codebreakers would certainly need to take account of such factors, as top-secret messages are only going to say what has to be said, in as condensed a form as possible. Context is absolutely crucial, as it will dictate the most frequent phrases, words and abbreviations.

I'd therefore like to ask:
Who came out with these frequency figures, how were they determined, and when?

Nice simple question!