Google Flu Tracker Not as Accurate, Reliable as CDC Data, Study Finds
March 14, 2014 in News
Flu tracking data gathered through CDC are far more reliable and accurate than information garnered through Google Flu Trends, despite the time lag present in the federal findings, according to a new study published in the journal Science, NPR’s “Shots” reports.
Background on Google Flu Trends
In 2008, Google created Google Flu Trends, which uses search algorithms to track flu activity based on individuals’ searches for flu-related terms (Harris, “Shots,” NPR, 3/13). After comparing its data with traditional flu surveillance systems, Google found that it could estimate the spread of the flu by tracking the influx of flu-related search terms (Walsh, Time, 3/13).
In the study, David Lazer — a professor of political science and computer science at Northeastern University — found that Google Flu Trends data in recent years have largely overestimated the number of flu cases when compared with federal data (“Shots,” NPR, 3/13).
Specifically, Lazer found that Google Flu Trends overestimated the prevalence of the flu:
- During the 2011-2012 and 2012-2013 flu seasons by more than 50%; and
- In 100 out of 108 weeks between August 2011 and September 2013.
According to the study, Google Flu Trends showed that 11% of U.S. residents had the flu during peak flu season last winter, while CDC reported that 6% of the population was affected.
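To put the reported figures on a common scale, the arithmetic can be sketched as below. This is only an illustration built from the numbers quoted in this article; the variable and function names are made up for the example, and the article does not say how the study itself calculated its overestimates.

```python
# Figures as quoted in the article; all names here are illustrative assumptions.
gft_peak_pct = 11.0   # Google Flu Trends estimate at peak, % of U.S. residents
cdc_peak_pct = 6.0    # CDC figure for the same period, % of the population
weeks_over = 100      # weeks in which Google Flu Trends overestimated
weeks_total = 108     # weeks examined, August 2011 to September 2013


def relative_overestimate(estimate, reference):
    """Return how far `estimate` exceeds `reference`, as a fraction of `reference`."""
    return (estimate - reference) / reference


peak_gap = relative_overestimate(gft_peak_pct, cdc_peak_pct)
share_weeks_over = weeks_over / weeks_total

print(f"Peak estimate above CDC figure by: {peak_gap:.0%}")
print(f"Share of weeks overestimated: {share_weeks_over:.1%}")
```

On these numbers, the peak estimate is roughly 83% above the CDC figure, and the tool overestimated in about 93% of the weeks examined.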
Comments on Findings
Lazer offered several explanations for why the Google Flu Trends data were incorrect, such as:
- The tool’s reliance on the prevalence of flu search terms, which fail to reflect the actual incidence of the flu and fail to account for unexpected events, such as the non-seasonal 2009 H1N1 flu;
- Google Flu Trends’ dependence on Google’s constantly changing algorithm;
- Its failure to include “small data” techniques in its “big data” approach, which would better reflect actual flu trends; and
- Google’s refusal to share how Google Flu Trends works, which limits how well the system can be refined by restricting how many scientists and researchers can collaborate on the program (Time, 3/13).
Lazer concluded that Google could improve its flu tracker tool by making its algorithm and research available to the public.
Overall, he said that the data collected and distributed by CDC, despite a time lag of a few weeks, are still more accurate than Google’s system.
Google did not provide a scientist for a detailed response.
However, the company did say that it has refined Google Flu Trends to reduce errors that occur when media coverage of the flu leads to a spike in the number of flu-related searches (“Shots,” NPR, 3/13).