Study: Researchers Blame Google Flu Trends Inaccuracies On Ongoing Algorithm Updates




Researchers at Northeastern University in Boston claim the regular overestimation of flu cases by Google Flu Trends can be blamed on Google’s ever-evolving algorithms and the inaccurate analysis of big data.

Google Flu Trends has often been cited for incorrectly projecting the number of flu cases. In 2010, a University of Washington study claimed Google Flu Trends was 25 percent less accurate than reports from the Centers for Disease Control and Prevention (CDC). Just last year, Nature.com reported that Google Flu Trends overestimated the number of people with influenza-like illnesses (ILI), putting the figure at nearly double the actual number.

Now, in a study published last week by ScienceMag.org, Northeastern University researchers claim the problem with Google Flu Trends revolves around Google’s constantly updated algorithms and the inability of big data to produce “valid and reliable data amenable for scientific analysis.”

At the start of “The Parable of Google Flu: Traps in Big Data Analysis,” the study references Nature.com’s report from last year, which highlighted how Google Flu Trends predicted more than double the number of doctor visits for ILI. The study goes on to claim:

“Although not widely reported until 2013, the new Google Flu Trends has been persistently overestimating flu prevalence for a much longer time.”

According to the study’s authors, David Lazer, Ryan Kennedy, Gary King and Alessandro Vespignani, “Because Google Flu Trends uses the relative prevalence of search terms in its model, improvements in the search algorithm can adversely affect Google Flu Trends’ estimates.”
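To see why a search change can move the estimate, consider a minimal Python sketch. It assumes, hypothetically, that the signal is simply the share of all searches that match flu-related terms. The function name and the counts are invented for illustration and are not Google’s actual model or data.

```python
def relative_prevalence(flu_queries: int, total_queries: int) -> float:
    """Share of all searches that match flu-related terms (hypothetical signal)."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    return flu_queries / total_queries


# Invented counts for the same week, before and after a hypothetical search
# change (for example, suggested queries) nudges more users toward flu terms.
before = relative_prevalence(flu_queries=2_000, total_queries=1_000_000)
after = relative_prevalence(flu_queries=3_000, total_queries=1_000_000)

# Relative change in the signal, with actual illness levels held constant.
signal_change = after / before - 1
print(f"Signal change with no change in illness: {signal_change:.0%}")
```

In this toy case the signal climbs by half although nobody is any sicker, which is the kind of drift the authors describe.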

The researchers write that the CDC’s three-week-old data does a better job of estimating flu prevalence than Google Flu Trends’ real-time data, which uses search terms as indicators of flu activity.
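A comparison like this is typically made by scoring each estimate against the final reported figures. The Python sketch below uses mean absolute error as the score. The weekly values, the naive “reuse the figure from three weeks ago” baseline, and the doubled real-time estimate are all assumptions chosen to illustrate the scoring, not the study’s data or method.

```python
def mean_absolute_error(estimates, actuals):
    """Average absolute gap between estimated and actual values."""
    if not estimates or len(estimates) != len(actuals):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(actuals)


# Invented weekly ILI percentages standing in for final reported figures.
actual = [1.0, 1.5, 2.0, 3.0, 4.0, 3.5, 2.5]
LAG_WEEKS = 3

# Weeks that have a value from three weeks earlier to fall back on.
targets = actual[LAG_WEEKS:]
# Naive lagged baseline: this week's estimate is the figure from 3 weeks ago.
lagged_estimates = actual[:-LAG_WEEKS]
# Hypothetical real-time estimate that consistently reports double the truth.
realtime_estimates = [2 * value for value in targets]

lagged_error = mean_absolute_error(lagged_estimates, targets)
realtime_error = mean_absolute_error(realtime_estimates, targets)
print(f"lagged baseline MAE: {lagged_error}, real-time MAE: {realtime_error}")
```

With these made-up numbers the stale baseline scores better simply because the real-time estimate is biased upward, so the result says nothing about the actual size of the gap the study found.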

David Lazer told NPR he believed Google Flu Trends’ estimations could be improved if Google were “less secretive about what exactly it’s doing to get its results.” Lazer’s study points out that Google has never released the 45 search terms used in Google Flu Trends.

Figure: Google Flu Trends ILI estimates compared with CDC estimates. (Image credit: www.UVM.edu)


Opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.


About the author

Amy Gesenhues
Contributor
Amy Gesenhues was a senior editor for Third Door Media, covering the latest news and updates for Search Engine Land, MarTech and MarTech Today. From 2009 to 2012, she was an award-winning syndicated columnist for a number of daily newspapers from New York to Texas. With more than ten years of marketing management experience, she has contributed to a variety of traditional and online publications, including MarketingProfs, SoftwareCEO, and Sales and Marketing Management Magazine.
