Wordsmith.org: the magic of words

Wordsmith Talk

Page 1 of 2
Joined: Mar 2000
Posts: 6,511
Carpal Tunnel

English Language Hits 1 Billion Words

By SUEVON LEE, Associated Press Writer

A massive language research database responsible for bringing words such as "podcast" and "celebutante" to the pages of the Oxford dictionaries has officially hit a total of 1 billion words, researchers said Wednesday.

Drawing on sources such as weblogs, chatrooms, newspapers, magazines and fiction, the Oxford English Corpus spots emerging trends in language usage to help guide lexicographers when composing the most recent editions of dictionaries.

Oxford University Press publishes the Oxford English Dictionary, considered the most comprehensive dictionary of the language, which in its most recent August 2005 edition added words such as "supersize," "wiki" and "retail politics" to its pages.

Oxford University Press lexicographer Catherine Soanes said the database is not a collection of 1 billion different words, but of sentences and other examples of the usage and spelling.

"The corpus is purely 21st century English," said Judy Pearsall, publishing manager of English dictionaries. "You're looking at current English and seeing what's happening right now. That's language at the cutting edge."

As hybrid words such as "geek-chic," "inner-child" or "gabfest" increase in usage, Pearsall said part of the research project's goal is to identify words that have lasting power.

"English gets really creative, really fun. What we're putting in dictionaries is words that will stick around," she said.

Launched in January 2000, the Oxford English Corpus is part of the world's largest-funded language research project, costing $90,000-$107,000 per year.

It has helped identify how the spellings of common phrases have changed, such as "fazed by" to "phased by" or "free rein" to "free reign."

"Buck naked" increasingly has evolved to "butt naked."

The corpus collects evidence from all the places where English is spoken, whether from North America, Britain, the Caribbean, Australia or India, to reflect the most current and common usage of the English language.

___

On the Net:

Oxford Corpus, http://www.askoxford.com/oec

Joined: Jul 2000
Posts: 3,467
Carpal Tunnel
The author needs to think for just a second or two about what she's saying. She is off by three orders of magnitude. It's a million, not a billion.


TEd
Joined: Jun 2002
Posts: 7,210
Carpal Tunnel
million, billion, what country are we talking?


formerly known as etaoin...
Joined: Jul 2000
Posts: 3,467
Carpal Tunnel
It's a million whether you are in France or Australia. 1 followed by 6 zeroes. Ten to the 6th power.

Billion is a different story. Some parts of the world define it incorrectly.


TEd
Joined: Mar 2000
Posts: 6,511
Carpal Tunnel
But:

One billion words?

If all the words in the Oxford English Corpus were laid out end to end (measuring on average 1cm), the total would stretch from London to New York, around 10,000 km. Because the corpus is a collection of texts, there are not one billion different words: the humble word 'the', the commonest in the written language, accounts for 50 million of all the words in the corpus!


http://www.askoxford.com/oec/mainpage/?view=uk
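
For anyone who wants to check the arithmetic in that quote, here is a minimal sketch. It uses only the figures quoted above (1 billion words, about 1 cm per word, 50 million occurrences of 'the'); the function and constant names are made up for illustration and do not come from Oxford.

```python
# Figures as quoted from the Oxford page above.
TOTAL_WORDS = 1_000_000_000   # running words in the corpus
AVG_WORD_LENGTH_CM = 1        # stated average length of a word on the page
THE_COUNT = 50_000_000        # stated occurrences of 'the'

CM_PER_KM = 100_000           # 100 cm per metre, 1,000 metres per km


def laid_end_to_end_km(words: int, cm_per_word: float) -> float:
    """Total length in km if every word were laid end to end."""
    return words * cm_per_word / CM_PER_KM


def share_of_corpus(count: int, total: int) -> float:
    """Fraction of all running words taken up by one word."""
    return count / total


length_km = laid_end_to_end_km(TOTAL_WORDS, AVG_WORD_LENGTH_CM)
the_share = share_of_corpus(THE_COUNT, TOTAL_WORDS)

print(f"{length_km:,.0f} km")   # 10,000 km
print(f"{the_share:.0%}")       # 5%
```

So the quoted distance is consistent with a short-scale billion (10 to the 9th power) of running words, and 'the' alone would make up about one word in twenty.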

Joined: Jul 2000
Posts: 3,467
Carpal Tunnel
I sit corrected.


TEd
#159136 04/26/06 09:12 PM
Joined: Jan 2001
Posts: 1,819
Pooh-Bah
Re: the Oxford English Dictionary:

Isn't there a 3rd edition about to come out soon?

#159137 04/26/06 11:36 PM
Joined: Apr 2000
Posts: 10,542
Carpal Tunnel
>Isn't there a 3rd edition about to come out soon?

not in the traditional (OED2) sense. the new edition is an ongoing, online effort.

Joined: Mar 2000
Posts: 6,511
Carpal Tunnel
>I sit corrected.

Well, not necessarily, TEd. I think you're much closer to the mark on the actual number of individual words. The story is misleading (which is why I wrote "sort of" in the subject line). As much as I respect the folks at Oxford, this looks to me like a cheap effort at publicity. I still haven't figured out what exactly the Corpus is counting.
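
One way to picture the distinction the Oxford page draws is running words versus different words. The sketch below is only an illustration of that idea: the sample sentence, the `tokenize` helper and its simple letters-and-apostrophes rule are assumptions of mine, and nothing in this thread says how the Corpus actually splits or counts text.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and pull out runs of letters and apostrophes.

    This is a deliberately crude rule chosen for the example, not the
    method used by any real corpus.
    """
    return re.findall(r"[a-z']+", text.lower())


# A made-up sample sentence, used only to show the two kinds of count.
sample = "The corpus counts the words in the texts, not the different words."

tokens = tokenize(sample)    # every running word, repeats included
counts = Counter(tokens)     # one entry per different word

print(len(tokens))             # 12 running words
print(len(counts))             # 8 different words
print(counts.most_common(1))   # [('the', 4)]
```

On that reading, the "1 billion" would be the first kind of count, and the number of different words would be the much smaller second kind.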

Joined: Jul 2000
Posts: 3,467
Carpal Tunnel
The habea cees?


TEd

Moderated by  Jackie 


© 1994-2024 Wordsmith
