Wordsmith.org

English Language Hits 1 Billion Words

By SUEVON LEE, Associated Press Writer

A massive language research database responsible for bringing words such as "podcast" and "celebutante" to the pages of the Oxford dictionaries has officially hit a total of 1 billion words, researchers said Wednesday.

Drawing on sources such as weblogs, chatrooms, newspapers, magazines and fiction, the Oxford English Corpus spots emerging trends in language usage to help guide lexicographers when composing the most recent editions of dictionaries.

The press publishes the Oxford English Dictionary, considered the most comprehensive dictionary of the language, which in its most recent August 2005 edition added words such as "supersize," "wiki" and "retail politics" to its pages.

Oxford University Press lexicographer Catherine Soanes said the database is not a collection of 1 billion different words, but of sentences and other examples of the usage and spelling.

"The corpus is purely 21st century English," said Judy Pearsall, publishing manager of English dictionaries. "You're looking at current English and seeing what's happening right now. That's language at the cutting edge."

As hybrid words such as "geek-chic," "inner-child" or "gabfest" increase in usage, Pearsall said part of the research project's goal is to identify words that have lasting power.

"English gets really creative, really fun. What we're putting in dictionaries is words that will stick around," she said.

Launched in January 2000, the Oxford English Corpus is part of the world's largest-funded language research project, costing $90,000-$107,000 per year.

It has helped identify how the spellings of common phrases have changed, such as "fazed by" to "phased by" or "free rein" to "free reign."

"Buck naked" increasingly has evolved to "butt naked."

The corpus collects evidence from all the places where English is spoken, whether from North America, Britain, the Caribbean, Australia or India, to reflect the most current and common usage of the English language.

___

On the Net:

Oxford Corpus, http://www.askoxford.com/oec
The author needs to think for just a second or two about what she's saying. She is off by three orders of magnitude. It's a million, not a billion.
million, billion, what country are we talking?
It's a million whether you are in France or Australia. 1 followed by 6 zeroes. Ten to the 6th power.

Billion is a different story. Some parts of the world define it incorrectly.
But:

One billion words?

If all the words in the Oxford English Corpus were laid out end to end (measuring on average 1cm), the total would stretch from London to New York, around 10,000 km. Because the corpus is a collection of texts, there are not one billion different words: the humble word 'the', the commonest in the written language, accounts for 50 million of all the words in the corpus!


http://www.askoxford.com/oec/mainpage/?view=uk
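
For anyone who wants to check the arithmetic in that quote, here is a quick Python sketch. The figures (one billion words, 1 cm per word, 50 million occurrences of 'the') come from the passage above; the function and constant names are made up for illustration, and "billion" is assumed to mean the short-scale 10^9.

```python
import math

# Figures taken from the quoted passage; the names are illustrative only.
WORDS_IN_CORPUS = 1_000_000_000   # "one billion", assuming the short scale (10**9)
AVG_WORD_LENGTH_CM = 1            # average length of a printed word, per the quote
THE_COUNT = 50_000_000            # occurrences of 'the', per the quote

CM_PER_KM = 100_000               # 100 cm per metre * 1000 m per km


def corpus_length_km(words, cm_per_word):
    """Length in km of all words laid end to end."""
    return words * cm_per_word / CM_PER_KM


def share_percent(count, total):
    """Percentage of the corpus made up by one word."""
    return 100 * count / total


def orders_of_magnitude(larger, smaller):
    """Number of powers of ten separating two quantities."""
    return round(math.log10(larger / smaller))


print(corpus_length_km(WORDS_IN_CORPUS, AVG_WORD_LENGTH_CM))  # 10000.0
print(share_percent(THE_COUNT, WORDS_IN_CORPUS))              # 5.0
print(orders_of_magnitude(10**9, 10**6))                      # 3
```

So the 10,000 km figure checks out, 'the' is about one word in twenty, and a short-scale billion is three orders of magnitude above a million. On the long scale a billion is 10^12, which would make the gap six orders.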
I sit corrected.
Posted By: Alex Williams re: the OED - 04/26/06 09:12 PM
Re: the Oxford English Dictionary:

Isn't there a 3rd edition about to come out soon?
Posted By: tsuwm Re: re: the OED - 04/26/06 11:36 PM
>Isn't there a 3rd edition about to come out soon?

not in the traditional (OED2) sense. the new edition is an ongoing, online effort.
Quote:

I sit corrected.

Well, not necessarily, TEd. I think you're much closer to the mark on actual number of individual words. The story is misleading (which is why I wrote "sort of" in the subject line). As much as I respect the folks at Oxford, this looks to me like a cheap effort at publicity. I still haven't figured out *what exactly the Corpus is counting.
The habea cees?
>I still haven't figured out *what exactly the Corpus is counting.

here's my guess: the total number of words on all of those (figurative) little slips of paper in all of the cubbyholes in all of the editors' rolltop desks in all of the many scriptoria, all of which hold the millions and millions of citations for the million English words.

exactly.
And here I thought "butt naked" was just a dumb thing the teens around here said!
Quote:

And here I thought "butt naked" was just a dumb thing the teens around here said!

A) What's "buck" sposed to mean anyway?

and

2) It's an example of, whaddaycallit, assimilation.
I don't know, either, what the 'buck' in 'buck naked' means.
Maybe a reference, from times past, to Indians' summer wear?
Anybody have a clue?
Michael sez
Quote:

(the first part of that line brings to mind the much later expression buck naked, from buckskin, a similar sort of derivation)

Well, that was sort of a sideswipe at the question.

Here's a more comprehensive look.
Posted By: Faldage Re: Bu(tt,ck) Naked - 04/29/06 07:07 PM
Here's another take on the subject. Interestingly, the conclusion seems to be that "butt naked" just might be the original, and the "buck" a euphemism.
Posted By: Capital Kiwi Re: Bu(tt,ck) Naked - 05/03/06 04:26 AM
Could be a variation on "bare-assed", couldn't it?