I’m a bit late in writing about this, but it’s too cool to pass up. Last week, Google launched a new service, Google Flu Trends, that really demonstrates the power of new sources of information in the digital age. Flu Trends attempts to warn users of regional flu outbreaks so that hospitals, medical practitioners, and individuals can prepare. It’s not exactly a new idea. The C.D.C. publishes reports on influenza outbreaks, based on data compiled from health care providers. Another web service, whoissick.org, combines user-reported illnesses with Google Maps to show you the various bugs circulating in your area. But Google Flu Trends may identify outbreaks more quickly because of the unique data source that it uses.
A Google team noticed that certain search terms, like “flu symptoms”, are much more common during flu season. Only logical, right? Google employees created a list of these types of searches and compared the dates and locations of past searches with C.D.C. data on influenza trends. It turns out that the number of people with the flu and the number of these types of searches are closely related. This means that counting influenza-related searches on Google should provide an estimate of the number of flu cases. Because the IP address behind a search reveals roughly where it came from, the estimates can be broken down by region. This is a very nifty sort of collective intelligence based on data that is simply a by-product of Google’s primary function.
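To make the idea concrete, here is a minimal Python sketch of how a relationship like this could be measured and then used for estimates. It is not Google’s actual method, which I haven’t seen described in detail. The weekly numbers are made up for illustration, and the names `flu_query_share`, `cdc_flu_percent`, `fit_line`, `pearson_r`, and `estimate_flu_level` are my own assumptions. The sketch fits a straight line to past data with ordinary least squares and checks how closely the two series track each other.

```python
from math import sqrt

# Made-up weekly numbers for one region, for illustration only.
# Flu-related searches per 1,000 searches (hypothetical):
flu_query_share = [0.8, 1.1, 1.9, 3.2, 4.0, 2.7, 1.5]
# Percent of doctor visits for flu-like illness (hypothetical):
cdc_flu_percent = [1.0, 1.3, 2.1, 3.5, 4.4, 3.0, 1.7]


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def pearson_r(xs, ys):
    """Pearson correlation: how closely the two series move together."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Learn the relationship from past weeks.
slope, intercept = fit_line(flu_query_share, cdc_flu_percent)
r = pearson_r(flu_query_share, cdc_flu_percent)


def estimate_flu_level(query_share):
    """Estimate this week's flu level from this week's search share."""
    return slope * query_share + intercept


if __name__ == "__main__":
    print(f"correlation: {r:.3f}")
    print(f"estimated flu level at 3.0 searches per 1,000: {estimate_flu_level(3.0):.2f}")
```

The appeal is that search counts for the current week are available right away, so an estimate like this could be produced before the official reports are compiled. A real system would need far more data, and a check that the relationship holds up over time.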
The sheer number of Google searches makes them an excellent source of collective intelligence. Nielsen Online estimated that 4.8 billion searches were made using Google in September 2008 in the U.S. alone. That’s roughly 160 million U.S. searches per day. In addition to the volume of data, search queries are fascinating sources of information because of their timeliness. As the New York Times wrote:
[...] the data collected by search engines is particularly powerful, because the keywords and phrases that people type into them represent their most immediate intentions. People may search for “Kauai hotel” when they are planning a vacation and for “foreclosure” when they have trouble with their mortgage. Those queries express the world’s collective desires and needs, its wants and likes.
To me, this is a cool use of data that most people don’t even realize they are generating. However, the implications for privacy are a little bit frightening. If you don’t believe me (and you have a Google account), check your web history. Google saves every search you make, along with any web pages you visit from the search results. A year’s worth of search data can create a surprisingly complete picture of a person’s life. Imagine having access to that data for every single user. I don’t want to be pessimistic about this. For the most part, I think the possibility of exciting and useful projects like Flu Trends greatly outweighs the potential hazards of this data. What do you think?
You can read the New York Times report on Google Flu Trends here. The official Google Blog also covered it here. Statistics on the number of U.S. Google searches were pulled from the Nielsen Online news release available here.