FluBreaks: Early Epidemic Detection from Google Flu Trends

Authors

  • Michelle Gatton
  • Fahad Pervaiz
  • Mansoor Pervaiz
  • Nabeel Abdur Rehman
  • Umar Saif
Abstract

BACKGROUND The Google Flu Trends service was launched in 2008 to track changes in the volume of online search queries related to flu-like symptoms. Over the last few years, the trend data produced by this service has shown a consistent relationship with the actual number of flu reports collected by the US Centers for Disease Control and Prevention (CDC), often identifying increases in flu cases weeks in advance of CDC records. However, contrary to popular belief, Google Flu Trends is not an early epidemic detection system. Instead, it is designed as a baseline indicator of the trend, or changes, in the number of disease cases.

OBJECTIVE To evaluate whether these trends can be used as a basis for an early warning system for epidemics.

METHODS We present the first detailed algorithmic analysis of how Google Flu Trends can be used as a basis for building a fully automated system for early warning of epidemics in advance of methods used by the CDC. Based on our work, we present a novel early epidemic detection system, called FluBreaks (dritte.org/flubreaks), based on Google Flu Trends data. We compared the accuracy and practicality of three types of algorithms: normal distribution algorithms, Poisson distribution algorithms, and negative binomial distribution algorithms. We explored the relative merits of these methods, and related our findings to changes in Internet penetration and population size for the regions in Google Flu Trends providing data.

RESULTS Across our performance metrics of percentage true-positives (RTP), percentage false-positives (RFP), percentage overlap (OT), and percentage early alarms (EA), Poisson- and negative binomial-based algorithms performed better in all except RFP. Poisson-based algorithms had average values of 99%, 28%, 71%, and 76% for RTP, RFP, OT, and EA, respectively, whereas negative binomial-based algorithms had average values of 97.8%, 17.8%, 60%, and 55% for RTP, RFP, OT, and EA, respectively. Moreover, the EA was also affected by the region's population size. Regions with larger populations (regions 4 and 6) had higher values of EA than region 10 (which had the smallest population) for negative binomial- and Poisson-based algorithms. The difference was 12.5% and 13.5% on average in negative binomial- and Poisson-based algorithms, respectively.

CONCLUSIONS We present the first detailed comparative analysis of popular early epidemic detection algorithms on Google Flu Trends data. We note that realizing this opportunity requires moving beyond the cumulative sum and historical limits method-based normal distribution approaches, traditionally employed by the CDC, to negative binomial- and Poisson-based algorithms to deal with potentially noisy search query data from regions with varying population and Internet penetrations. Based on our work, we have developed FluBreaks, an early warning system for flu epidemics using Google Flu Trends.
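
The abstract does not spell out the detection rules, so the following is only a minimal sketch of what a Poisson-based alarm on a weekly series might look like; it is not the FluBreaks implementation. The function names `poisson_sf` and `poisson_alarms`, the four-week baseline window, and the 0.01 significance threshold are all assumptions made for illustration.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with k a non-negative integer.

    Each pmf term is evaluated in log space so that a large mean does not
    underflow exp(-lam).
    """
    if k <= 0:
        return 1.0
    cdf = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k)
    )
    return max(0.0, 1.0 - cdf)


def poisson_alarms(counts, window=4, alpha=0.01):
    """Return the indices of weeks whose count is improbably high.

    Hypothetical rule: the expected count for week t is the mean of the
    previous `window` weeks, and an alarm is raised when the Poisson
    upper-tail probability of the observed count falls below `alpha`.
    """
    alarms = []
    for t in range(window, len(counts)):
        baseline = sum(counts[t - window:t]) / window
        lam = max(baseline, 1e-9)  # guard against log(0) on an all-zero window
        if poisson_sf(counts[t], lam) < alpha:
            alarms.append(t)
    return alarms


# Illustrative, made-up weekly counts with one obvious spike at index 6.
weekly_counts = [10, 12, 11, 9, 10, 11, 30, 12]
print(poisson_alarms(weekly_counts))
```

A negative binomial variant would follow the same shape but would also need an estimate of overdispersion, which is one reason such a model may tolerate noisier series at the cost of raising fewer early alarms.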


Similar resources

Tracking Epidemics with State-space SEIR and Google Flu Trends

In this paper we use Google Flu Trends data together with a sequential surveillance model based on the state-space methodology, to track the evolution of an epidemic process over time. We embed a classical mathematical epidemiology model (a susceptible-exposed-infected-recovered (SEIR) model) within the state-space framework, thereby allowing the SEIR dynamics to change through time. The impleme...


“Google Flu Trends” and Emergency Department Triage Data Predicted the 2009 Pandemic H1N1 Waves in Manitoba

Objectives: We assessed the performance of syndromic indicators based on Google Flu Trends (GFT) and emergency department (ED) data for the early detection and monitoring of the 2009 H1N1 pandemic waves in Manitoba. Methods: Time-series curves for the weekly counts of laboratory-confirmed H1N1 cases in Manitoba during the 2009 pandemic were plotted against the three syndromic indicators: 1) GFT...


Monitoring Influenza Activity in the United States: A Comparison of Traditional Surveillance Systems with Google Flu Trends

BACKGROUND Google Flu Trends was developed to estimate US influenza-like illness (ILI) rates from internet searches; however ILI does not necessarily correlate with actual influenza virus infections. METHODS AND FINDINGS Influenza activity data from 2003-04 through 2007-08 were obtained from three US surveillance systems: Google Flu Trends, CDC Outpatient ILI Surveillance Network (CDC ILI Sur...


Correlation of “Google Flu Trends” with Sentinel Surveillance Data for Influenza in 2009 in Japan

Google Flu Trends (http://www.google.org/flutrends/) (GFT) aggregates Google search data to estimate flu activity in 28 countries. This study explored the correlation of GFT with Japanese national sentinel surveillance data in 2009. We obtained GFT and national sentinel surveillance data for influenza in all 47 Japanese prefectures from 29 June–31 Dec 2009. Pearson correlation coefficients wer...


Using Google Trends for Influenza Surveillance in South China

BACKGROUND Google Flu Trends was developed to estimate influenza activity in many countries; however there is currently no Google Flu Trends or other Internet search data used for influenza surveillance in China. METHODS AND FINDINGS Influenza surveillance data from 2008 through 2011 were obtained from provincial CDC influenza-like illness and virological surveillance systems of Guangdong, a ...



Journal title:

Volume 14, Issue

Pages -

Publication date: 2012