Researchers at Northeastern University in Boston claim the regular overestimation of flu cases by Google Flu Trends can be blamed on Google’s ever-evolving algorithms and the inaccurate analysis of big data.
Google Flu Trends has often been criticized for incorrectly projecting the number of flu cases. In 2010, a University of Washington study claimed Google Flu Trends was 25 percent less accurate than reports from the Centers for Disease Control and Prevention (CDC). Just last year, Nature.com reported that Google Flu Trends overestimated the number of people with influenza-like illnesses (ILI) by a factor of nearly two.
Now, in a study published last week by ScienceMag.org, Northeastern University researchers claim the problem with Google Flu Trends revolves around Google’s constantly updated algorithms and the inability of big data to produce “valid and reliable data amenable for scientific analysis.”
The study, “The Parable of Google Flu: Traps in Big Data Analysis,” opens by referencing Nature.com’s report from last year, highlighting how Google Flu Trends predicted more than double the number of doctor visits for ILI reported by the CDC. The study goes on to claim:
“Although not widely reported until 2013, the new Google Flu Trends has been persistently overestimating flu prevalence for a much longer time.”
According to the study’s authors, David Lazer, Ryan Kennedy, Gary King and Alessandro Vespignani, “Because Google Flu Trends uses the relative prevalence of search terms in its model, improvements in the search algorithm can adversely affect Google Flu Trends’ estimates.”
The researchers write that three-week-old CDC data does a better job of estimating flu prevalence than Google Flu Trends’ real-time data, which uses search terms as indicators of flu activity.
David Lazer told NPR he believed Google Flu Trends’ estimates could be improved if Google were “less secretive about what exactly it’s doing to get its results.” Lazer’s study points out that Google has never released the 45 search terms Google Flu Trends uses to track flu activity.
Google Flu Trends ILI Estimates Compared to CDC Estimates:
(Image credit: www.UVM.edu)