Saturday, June 25, 2011

Why You Should Care About Ngrams

I first came across the term 'Ngram' a few weeks ago, when I told you about Google Labs.  I did not mention the Google Books Ngram Viewer in that post, mainly because at the time I did not understand what an Ngram was and did not have the inclination to find out.  A few days later I saw not one, but two articles about the Ngram Viewer and its benefit to searches. I sighed and accepted that I could ignore it no longer.

So, what is an Ngram?  The dictionary.com definition is clear as mud. Don't bother, unless you are in the computing field.  The Cyberskeptic's Guide to Internet Research got a bit closer to useful information, describing Google's Ngram Viewer as: "a data visualization tool that displays a graph showing how particular words or phrases (Ngrams) have occurred in a body of books (corpora) over selected years."

Basically, an Ngram is a word or a phrase: a run of N items in a row, so a single word is a 1-gram and a two-word phrase is a 2-gram (the items can be anything, but let's focus on words and phrases).  Remember all those books that Google has digitized over the years? The Ngram Viewer allows you to search a huge collection of them, in seconds, to see how often, and when, a particular word or phrase has appeared. Neat!
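For the curious, here is a minimal sketch in Python of what counting Ngrams looks like on a tiny scale. It is only an illustration, not how Google does it: the function name ngrams and the sample sentence are my own inventions, and real tools handle punctuation and capitalization far more carefully than this.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, in order."""
    # Naive tokenizing (an assumption for this sketch): lowercase, split on spaces.
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# A made-up sentence standing in for a whole library of books.
sample = "climate change and global warming and climate change"

# Count how often each two-word phrase (2-gram) appears.
counts = Counter(ngrams(sample, 2))
print(counts["climate change"])
print(counts["global warming"])
```

The Ngram Viewer does the same kind of counting, only across millions of books and broken down by year of publication.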

But why should you care? Check out the graph below, which plots uses of the terms 'global warming' and 'climate change' since 1990. If you are doing research on this topic but searching only for 'global warming', you will miss much of the newest information. The term 'climate change' seems well on its way to replacing the term 'global warming'.

Watch this space for tips on how to get the most from Ngrams (as soon as I figure them out myself). In the meantime, see the About Google's Ngram Viewer page for a boatload of information about exactly how all this works.
