RESEARCH ARTICLE Open Access

Using internet search queries for infectious disease surveillance: screening diseases for suitability

Gabriel J Milinovich1,2*, Simon MR Avril3, Archie CA Clements4, John S Brownstein5, Shilu Tong1 and Wenbiao Hu1

Abstract

Background: Internet-based surveillance systems provide a novel approach to monitoring infectious diseases. Surveillance systems built on internet data are economically, logistically and epidemiologically appealing and have shown significant promise. The potential for these systems has increased with increased internet availability and shifts in health-related information seeking behaviour. This approach to monitoring infectious diseases has, however, only been applied to single or small groups of select diseases. This study aims to systematically investigate the potential for developing surveillance and early warning systems using internet search data, for a wide range of infectious diseases.

Methods: Official notifications for 64 infectious diseases in Australia were downloaded and correlated with frequencies for 164 internet search terms for the period 2009–13 using Spearman's rank correlations. Time series cross correlations were performed to assess the potential for search terms to be used in construction of early warning systems.

Results: Notifications for 17 infectious diseases (26.6%) were found to be significantly correlated with a selected search term. The use of internet metrics as a means of surveillance has not previously been described for 12 (70.6%) of these diseases. The majority of diseases identified were vaccine-preventable, vector-borne or sexually transmissible; cross correlations, however, indicated that vector-borne and vaccine-preventable diseases are best suited for development of early warning systems.

Conclusions: The findings of this study suggest that internet-based surveillance systems have broader applicability to monitoring infectious diseases than has previously been recognised. Furthermore, internet-based surveillance systems have a potential role in forecasting emerging infectious disease events, especially for vaccine-preventable and vector-borne diseases.
Milinovich et al. BMC Infectious Diseases (2014) 14:690
DOI 10.1186/s12879-014-0690-1

*Correspondence: gabriel.milinovich@qut.edu.au
1School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia
2Infectious Disease Epidemiology Unit, School of Population Health, The University of Queensland, Brisbane, Australia
Full list of author information is available at the end of the article

© 2014 Milinovich et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Prudent detection is a cornerstone in the control and prevention of infectious diseases. Traditional infectious disease surveillance systems are typically characterised by a bottom-up process of data collection and information flow; these systems require a patient to recognise illness and seek treatment and a physician or laboratory to diagnose the infection and notify the relevant authority [1,2]. For emerging infectious disease events, this process is reported to take, on average, 15 days from onset to detection and a further 12–24 hours for the World Health Organization to be notified [3]. The development and implementation of more efficient systems for gathering intelligence on infectious diseases has the potential to reduce the impact of disease events. Internet-based surveillance systems are one such system [4].

Internet-based surveillance systems produce estimates of disease incidence through analysis of various digital data sources. Targeted sources include internet-search metrics, online news stories, social network data and blog/microblog data [4].
Currently, the most promising approach appears to be that based upon monitoring of internet search behaviour. This approach works on the premise that people will actively seek information on diseases they develop and that estimates of disease activity within the community may be developed by monitoring the frequency of related internet searches. Through targeting people earlier in the disease process, internet-based systems are able to access a larger fraction of the community and produce more timely information. Furthermore, internet-based surveillance systems are intuitive and adaptable, cheap to run and maintain (once established), do not require a formal public health network and have the capacity to be automated and operate in near-real time.

Despite these advantages, internet-based surveillance systems have a number of significant shortcomings and must not be considered an alternative to traditional surveillance approaches [5]. Firstly, as these systems crowd-source data, resolution will be contingent on the size of the population serviced and may be further limited by national communications infrastructure availability and distribution [6]. Secondly, as internet-based surveillance systems are limited to people who use the internet to source health information, there is the potential that estimates produced by these systems may not accurately reflect the entire community [7]. Finally, as internet-based surveillance systems essentially rely upon self-reporting, bias may be introduced through differences in internet usage between sectors of the community (the elderly, for example, may not use the internet as a source of health information, despite being a high-risk group for many infectious diseases) and/or through media-driven interest in emerging disease events [4].

Infectious disease surveillance systems have been developed using internet search metrics to estimate incidence of influenza (Google Flu Trends) [8] and dengue (Google Dengue Trends) [9]. Currently, operational systems that utilise this approach are limited; however, studies of the potential for internet-based surveillance have been conducted for a range of other infectious diseases, including: acute respiratory illness [7], AIDS [10], chickenpox [11,12], cryptosporidiosis [13], dysentery [10], gastroenteritis [11], hepatitis [14], listeriosis [15], Lyme disease [16], methicillin-resistant Staphylococcus aureus [17], norovirus [18], respiratory syncytial virus [6], rotavirus [19], scarlet fever (Streptococcus pyogenes) [10,20], Salmonella [21], tuberculosis [10,22] and West Nile virus [6]. Previous studies have focused on single diseases, or a small number of diseases, and the justification of the focus on a particular disease has been specific to each study. The published results have largely been promising; however, to date there has been no systematic, generalizable analysis to identify diseases that are suited to monitoring through the analysis of internet-search metrics.

The underpinning goal of this study was to provide direction for future approaches to developing digital surveillance systems, such as the development of predictive models and/or integrative surveillance models that draw upon multiple traditional and digital data sources to create estimates of disease within the community. This study, however, did not aim to develop actionable surveillance systems, to produce predictive models of infectious disease based on internet-based data or to identify the best search terms for use in these models. Rather, this study aimed to determine which diseases have most promise for monitoring by surveillance systems built on internet search metrics; this was achieved by assessing the level of correlation between a wide range of infectious diseases and internet search term metrics. Finally, this study aimed to identify diseases for which internet-based data could be used to create early warning systems.
Methods

Infectious disease surveillance data

Surveillance data on notifiable infectious diseases were collected from the National Notifiable Diseases Surveillance System (NNDSS), which is maintained by the Australian Government Department of Health (DoH) [23]. Monthly notifications (case numbers), aggregated at state/territory and national level, were downloaded for the period of January 2004 to September 2013. A full list of notifiable diseases in Australia and case definitions can be accessed through the DoH webpage [24]. Sixty-four diseases are monitored and these are categorised in the NNDSS as belonging to one of eight groups: blood-borne diseases; gastrointestinal diseases; other bacterial diseases; quarantinable diseases; sexually transmissible infections; vector-borne diseases; vaccine preventable diseases; and zoonoses. For the purpose of consistency, we have reported diseases according to these groupings. Whilst notifiable, data were not downloaded for human immunodeficiency virus infection/acquired immunodeficiency syndrome, Creutzfeldt–Jakob disease or variant Creutzfeldt–Jakob disease, because surveillance for these diseases is not performed by DoH, or for severe acute respiratory syndrome, because reporting to the DoH is informal; as such, these diseases are not listed on the NNDSS.
Search term selection and scraping of internet search trend data

In the construction of the Google Flu Trends model, the authors identified search terms by performing correlations between influenza-like illness data from the US CDC and the top 50 million Google search queries performed in the US over the corresponding period [8]. Such data are not available to the public and an alternative approach to identification of search terms was required; two approaches were used. Firstly, terms related to diseases, their aetiological agents and colloquialisms (such as "hep" for hepatitis or "flu" for influenza) were manually identified. Secondly, Google Correlate (www.google.com/trends/correlate) was queried using monthly surveillance data (described above). Google Correlate provides a list of up to 100 search terms that correlate most highly with the query data. To account for potential language shifts that may have affected search behaviour [4], this was performed three times using surveillance data covering the periods 2004–13, 2007–13 and 2011–13. Up to 300 search terms were downloaded from Google Correlate for each notifiable disease (100 search terms per period analysed) and manually sorted; any term related to the queried notifiable disease was included, regardless of the nature of the potential association. Suitable terms were combined with the manually identified search terms to create a list of search terms (see Additional file 1). No attempt was made to filter search terms based upon biological plausibility; any term that may be perceived to have any association with the disease of interest was included.

Search frequencies for terms of interest were collected through Google Trends (www.google.com/trends/). All data extractions were performed on the 22nd of October, 2013. Google Trends was queried using each of the identified terms at a national and state/territory level using the entire time range available (2004–present). Google Trends presents search frequency as a normalised data series with values ranging from 0 to 100 (with 100 representing the point with the highest search frequency and other points scaled accordingly); functionality for exporting search frequency data as a .CSV file is provided. For the purpose of privacy, data are aggregated at a daily, weekly or monthly level (or are restricted if there is insufficient search volume). The level of aggregation applied is determined by the period analysed and the search frequency; the level of aggregation cannot be specified by the user. As the notifiable disease surveillance data used were in monthly format, monthly indices of query search frequencies were required. Monthly indices are displayed graphically by Google Trends when querying periods greater than 36 months; rather than downloading .CSV files, a script was developed to scrape data from the Google Trends webpage, allowing the problems associated with the level of data aggregation to be overcome.
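The 0 to 100 scaling described above can be illustrated with a minimal Python sketch. It assumes hypothetical raw monthly counts and simply rescales them so that the peak month becomes 100; the function name `normalise_to_index` and the example numbers are illustrative only, and the article does not describe the exact procedure Google Trends applies before publishing its index.

```python
def normalise_to_index(counts):
    """Rescale a series so that its maximum becomes 100 (assumed scheme).

    `counts` is a hypothetical list of non-negative raw search counts;
    the real Google Trends pipeline is not documented in the article.
    """
    peak = max(counts)
    if peak == 0:
        # No searches at all: every point maps to 0.
        return [0 for _ in counts]
    return [round(100 * c / peak) for c in counts]


# Hypothetical monthly counts for a single search term.
monthly_counts = [120, 300, 600, 450]
print(normalise_to_index(monthly_counts))  # [20, 50, 100, 75]
```

Because each series is scaled to its own peak, index values from different terms or regions are not directly comparable in absolute terms, which is one reason rank-based correlations are a natural fit for such data.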
Data analysis

Analyses were performed at both national and state levels for the period 2009–13. As state-level search frequency data were not always available, particularly for less common diseases (due to low search frequency at this level of disaggregation), correlations between state-level notification data and national search frequency data were also performed. Owing to the large number of correlations performed in this study, Bonferroni adjustments [25] were applied to significance levels by the equation $1-(1-\alpha)^{1/n}$; all p-values reported in this document correspond to one-tailed tests. Spearman's rank correlation coefficients were used to rank performance.
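As an illustration of the two calculations named above, the following Python sketch computes the adjusted significance level $1-(1-\alpha)^{1/n}$ and a Spearman's rank correlation using only the standard library. The function names and the choice of n = 164 in the example are assumptions made for illustration; the article does not state the value of n that was used. Note that this expression is commonly known as the Šidák form, whereas the classical Bonferroni threshold is α/n; the two are very close for small α.

```python
def adjusted_alpha(alpha, n_tests):
    """Per-test significance level from the article's equation 1-(1-alpha)^(1/n)."""
    return 1 - (1 - alpha) ** (1 / n_tests)


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: monthly notifications versus a search index.
notifications = [12, 30, 55, 41, 18, 9]
search_index = [20, 35, 100, 80, 25, 22]
print(spearman_rho(notifications, search_index))
print(adjusted_alpha(0.05, 164))  # n = 164 is an assumed number of tests
```

Working on ranks rather than raw values means the coefficient is unaffected by the 0 to 100 rescaling of the search index.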
Time-series cross correlations were performed to assess linear associations between disease notifications and Google Trends search indices. Cross correlations were calculated using lag values for Google Trends data ranging from -7 to 7. This range allowed for assessment of biologically plausible associations that were relevant to the development of early warning systems. Cross correlations were performed on national data using IBM SPSS version 21 (SPSS Inc; Chicago, IL, USA). Seasonal differencing was applied (value 1) to all analyses to remove cyclic trends.

Whilst all available data (2004–13) were downloaded, analyses for this study were focused on the most recent five years (2009–13) as preliminary data analyses indicated that Google Trends data were not available prior to 2009 for numerous search terms (Figure 1; panels 2, 4, 9, 12, 16 and 17). Additionally, shifts in language are known to affect surveillance systems built upon textual data [4]. The shortened period (2009–13) was selected to minimise the effects of language shifts. However, this period still provides the requisite 50 pairs of observations for performing cross correlations [26].
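The seasonal differencing and lagged cross correlation described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reproduction of the SPSS procedure: the seasonal period of 12 months, the function names, and the sign convention (a positive lag means the first series leads the second) are all choices made here for illustration.

```python
def seasonal_difference(series, period=12):
    """First-order seasonal difference: value minus the value one season earlier.

    period=12 is an assumption appropriate to monthly data.
    """
    return [series[t] - series[t - period] for t in range(period, len(series))]


def cross_correlation(x, y, lag):
    """Sample cross correlation between x[t] and y[t + lag].

    Convention assumed here: a positive lag means x leads y.
    Uses the overall means and divides the lagged cross-covariance by n.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have equal length")
    if abs(lag) >= n:
        raise ValueError("lag must be smaller than the series length")
    mx, my = sum(x) / n, sum(y) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    if lag >= 0:
        pairs = zip(x[: n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[: n + lag])
    cov = sum((a - mx) * (b - my) for a, b in pairs) / n
    return cov / (sx * sy)


# Toy example: the "search" pulse occurs one step before the "notification" pulse.
search = [0, 1, 0, 0, 0, 0]
cases = [0, 0, 1, 0, 0, 0]
by_lag = {k: cross_correlation(search, cases, k) for k in range(-3, 4)}
best_lag = max(by_lag, key=by_lag.get)
print(best_lag)  # 1: the search series leads by one step
```

In practice both series would be seasonally differenced first, for example `cross_correlation(seasonal_difference(search_index), seasonal_difference(notifications), lag)`, and the lags would span -7 to 7 as in the analysis above.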
Results

In this section we discuss analyses of time series data. Briefly, the time series analysed were monthly case numbers for the 64 infectious diseases monitored by the Australian Government's National Notifiable Diseases Surveillance System (NNDSS) and Google Trends monthly search metrics for related internet search terms. In total, 164 search terms were analysed in this study; this ranged from a single term for some diseases, up to 14 search terms for influenza and 35 search terms for pneumococcal disease. The majority of terms could be categorised as diseases or aetiological agents ("brucellosis" or "Brucella"), colloquialisms ("flu", "hep" or "TB"), symptoms ("cough", "white discharge" or "cervical mucus") or medication or general health/treatment related queries ("whooping cough treatment", "symptoms of dengue" or "flu and pregnancy"). A few terms that may have environmental ("flash floods" for leptospirosis) or behavioural ("African tours" for malaria) meanings were also included. A full list of the search terms analysed is presented in the supplementary material.
Figure 1 Top internet search terms analysed for 18 diseases with the highest Spearman's rho values (2009–13). National monthly case numbers (blue) and Australian Google Trends search index (red). Google Trends search terms used in the analysis are presented in Figure 2.

Spearman's correlations

Evaluation of the bivariate associations between surveillance and corresponding search frequency data was performed using the Spearman's rank correlation. Spearman's rank correlations for the 18 top ranked notifiable diseases and terms are presented in Figure 2 and raw data for the corresponding diseases and search terms are presented in Figure 1.
Results of Spearman's correlations indicated 17 diseases to be significantly correlated (p […] Google Trends' data) has been shifted backwards one unit (a month).
Conversely, a lag value of 1 indicates that the primary series has been shifted forward one unit. Significant positive correlations for lag values of 1 or above are of most interest in the context of this study as they indicate a positive relationship between the two time series with Google Trends data leading the notifications (a prerequisite for Google Trends data to be a suitable early warning tool). It should also be noted that seasonal differencing was applied to cross correlations to remove cyclic seasonal trends.

Disease notifications positively correlated at a lag of one month (lag 1) with search term frequency for 12 of the 17 diseases that exhibited significant Spearman's rank correlations. Overall, 15 of the 64 notifiable diseases exhibited significant, positive correlations at a lag of one month. Significant positive associations were observed for four of the nine vector-borne diseases (Barmah Forest virus infection, dengue virus infection, Murray Valley encephalitis virus infection and Ross River virus infection), six of the 14 vaccine preventable diseases (Haemophilus influenzae type b, influenza, pertussis, pneumococcal disease and varicella zoster (chickenpox and shingles)), two of the six blood-borne diseases (hepatitis B (unspecified) and C (unspecified)), two of 11 gastrointestinal diseases (campylobacteriosis and cryptosporidiosis) and one zoonosis (leptospirosis). Positive significant correlations were not observed at a lag of one month for any of the quarantinable diseases (n = 6), sexually transmissible infections (n = 6) or other bacterial infections (n = 4). It should be noted that positive significant correlations were observed at lags of over one month (but not at lag 1) for two of the top ranked 18 diseases (gonococcal infection and meningococcal disease) and 16 diseases overall (see Additional file 1). Additionally, the terms "haemolytic uraemic syndrome" and "leprosy" exhibited significant negative correlations with the respective disease notifications at a lag of one month.
Figure 2 Spearman's rho values for the 18 top ranked notifiable diseases for the period 2009–13. The table only contains the search term with the highest degree of correlation for each disease; see Additional file 1 for a full list of diseases, search terms and correlation coefficients. The column label in bold indicates the Google Trends data used and subheadings in italics indicate the disease notification data used. Case numbers are national totals for the period 2009–13. Shading denotes statistical significance (one-tailed, Bonferroni corrected) at 0.0001 (red), 0.001 (orange), 0.01 (yellow) and 0.05 (green) levels. For disease grouping, BB: Blood-borne diseases; GI: Gastrointestinal diseases; Other: Other bacterial diseases; QD: Quarantinable diseases; STI: Sexually transmissible infections; VBD: Vector-borne diseases; VPD: Vaccine preventable diseases; Zoo: Zoonoses.

Discussion

The development and application of internet-based infectious disease surveillance systems has the potential to enhance infectious disease control and prevention. Whilst this is widely recognised [4,6,7,12,15,16,18,20], the investigation and application of internet-based surveillance has not been systematically applied across infectious diseases; the lack of systemic knowledge regarding the potential breadth of internet-based surveillance appears to have restricted the development of systems to a small number of diseases. To our knowledge, assessments of the use of internet-based surveillance have only been performed for five of the 17 diseases that were demonstrated to have a significant association with internet search terms (influenza [4], dengue [9,27], chickenpox [11,12], hepatitis B [14] and cryptosporidiosis [13]; the authors of the final study were, however, not able to detect signals from internet search queries). Our study suggests that internet-based surveillance systems have potential application to a wider range of diseases than is currently recognised. However, correlations alone should not be viewed as definitive evidence that such systems are viable; some discretion must be applied, particularly as the analyses performed were univariate. Correlations between internet metrics and both gonococcal infection and chlamydia (Figure 1, boxes 2 and 7) were high; this appears to be due to a general upward trend in both series, and internet metrics appear to have little value in detecting perturbations in cases beyond this trend. This is supported by the cross correlation results (which are seasonally differenced); despite being ranked 2nd and 7th by Spearman's rho (Figure 2), no positive correlations were observed for these disease/search term cross correlations, even at lag 0 (Figure 3). Further research needs to be performed; however, this study suggests surveillance systems built on internet search data to have significant promise for a number of diseases beyond those previously described, most notably pneumococcal disease, Ross River virus infection, pertussis, Barmah Forest virus and invasive meningococcal disease.

The application of internet-based data to monitoring systems of interest has been termed "nowcasting"; this approach does not predict the occurrence of future events, but rather seeks to produce more timely information on the systems of interest [28]. For infectious disease surveillance, this is typically achieved through the ability of internet-based surveillance systems to collect data at an earlier time point than is possible for traditional systems or by circumventing bureaucratic structures inherent to traditional systems that impede information flow [4]. Search terms that exhibit a high level of correlation with disease notifications are of value as they may be used to provide faster intelligence on emerging disease events. Results of cross correlations (Figure 3), however, indicated that forecasting of infectious disease events may also be possible using internet-based data. Of the 17 diseases that exhibited significant Spearman's correlations, 12 also had significant positive cross correlations at a lag of one month. Overall, cross correlations indicated that forecasting of notification rates using internet-based metrics would be most realistic for the vaccine-preventable and vector-borne diseases. Despite search terms offering strong or very strong correlations for two of the sexually transmissible diseases, neither exhibited significant correlations at a lag of one month.

Whilst internet metrics may provide valuable information regarding disease status, it is important to view these within context. The term "dengue mosquito" (Figure 3, panel 6) leads notifications by up to one month. The data imply dependence of dengue notifications on searches for the term "dengue mosquito". The mechanism of this dependence is more likely that environmental conditions that increase the abundance of mosquitos in dengue risk areas correlate with both an increase in dengue notifications and increased search interest for "dengue mosquito", allowing the search term to be used as an indicator for notifications. In this context the internet metrics also provide information that is of potential significance with respect to control of dengue fever; there is increased interest regarding mosquitos in the community and this may be driven by an increase in mosquito numbers. Conversely, the incidence of disease in the community may also affect search habits. The search term "chikungunya" lags notifications for chikungunya virus infection (Figure 3, panel 18). Searches for "chikungunya" are probably driven by media exposure. Media bias has previously been reported to adversely affect internet-based surveillance systems [27,29-33] and an increase in cases of a disease in the community will likely result in the publication of stories about the disease in the media; in turn, media exposure will drive internet searches on the topic. These processes, however, are not necessarily mutually exclusive. Searches for a disease may lead notifications; however, increased notifications and reporting of an emerging disease event in the media may also drive internet searches. The complexity of this relationship may make interpretation of Google Trends' data more difficult.
For pertussis (Figure 3, panel 8), the term "whooping" exhibits a significant positive correlation with disease notifications from lag -7 through to lag 3. It appears that both mechanisms occur for the same term, demonstrating a potential difficulty in interpreting these data. It is imperative that any terms used in the development of forecasting models are heavily screened to address the complexities of the driving forces behind health-information seeking and routinely re-evaluated to account for any shifts in search behaviour which may occur [4].

Figure 3 Cross correlation results for the 18 diseases with the highest Spearman's rho values (2009–13). Cross correlations for two search terms are displayed for each disease. Coloured bars correspond to the search term with the highest Spearman's rho value for each disease (red bars indicate values that exceed the 95% confidence interval, whereas blue bars do not). Unfilled bars indicate cross correlation results for alternative search terms with highest cross correlation values at a lag value of 1. Confidence intervals (95%) are indicated by the grey lines.

There were a number of obvious limitations to this study. Firstly, the temporal resolution of the data used was monthly. Internet-based surveillance systems built upon monthly data are unlikely to provide better intelligence than existing traditional surveillance systems; these commonly rely upon weekly or daily reporting. The monthly resolution was a function of the availability of the notification data. Secondly, the analyses were performed for a specific setting: Australia. The nuances of language will create differences in the applicability, not just for different countries, but also within a country and between different settings (such as during an influenza pandemic) [4]. Australia was selected as the study area because internet penetration in Australia is very high (>80%) [34] and use is largely restricted to a single search engine; Google maintains a market share of over 90% in Australia [35]. These features reduce biases associated with unequal patterns of use and/or access. Additionally, owing to its extensive size, Australia exhibits a range of climates and varying environmental conditions, making it susceptible to a wide range of infectious diseases, including endemic and non-endemic vector-borne diseases. Australia also has a strong public health network and comprehensive infectious disease surveillance systems which compile high quality data on a range of diseases. Combined, these features of internet usage and availability, infectious disease surveillance systems and disease susceptibility patterns make Australia an ideal system in which to study the potential application of internet-based surveillance systems. It is hoped that this work will stimulate further research into internet-based infectious disease surveillance systems beyond Australia. Even within our own study, however, we observed variation in correlations between internet search metrics and disease notifications for the various states (Figure 2). It is imperative to develop models specific to the region of interest and to assess the performance of any internet-based system against traditional surveillance data specific to the region being monitored. Thirdly, this study analysed the performance of only single search terms in estimating infectious disease notifications. Whilst Google has not revealed the terms utilised, or the weightings applied, Google Flu Trends is reported to incorporate around 160 search terms [36]. Despite using only a single search term for each analysis, notifications for 13 diseases were identified as having a strong or very strong correlation with the selected search terms. Compounding this is the fact that Bonferroni adjustments were applied in assessing significance. Bonferroni adjustments have previously been criticised for being overly conservative and for increasing the occurrence of type II errors (false negatives) [25]. As such, whilst this study provides a base for future research, it would be remiss to limit future investigations to just these diseases.

This study identified numerous infectious diseases of public health significance, not previously investigated, that have potential for monitoring using internet-based surveillance systems. However, this study did not seek to produce robust, accurate, internet-based surveillance systems or early warning systems that are able to produce actionable and timely data for public health units. The aim of this study was to identify the diseases for which this is possible and to focus future research efforts into these. To achieve this aim, this study used univariate analyses to determine the usefulness of internet search metrics for monitoring a wide range of infectious diseases. Whilst this simplistic approach was useful for screening diseases, it will not suffice in monitoring or forecasting incidence. Future studies should focus on developing composite indexes that incorporate multiple search terms, or data sources (such as weather data). Models built in such a manner are more resilient to media-driven behaviour, fear-based searching and evolutions in language [4]. Internet-based surveillance systems have the potential to be applied to more than just enumerating disease cases within the community or predicting the onset, peak and magnitude of outbreaks. Internet-based systems also have value as tools for planning emergency department staffing and surge capacity [31,37] or for healthcare utilisation [38]. Future research also needs to investigate the application of internet-based data; the greatest challenge in this field may not actually be creating models for forecasting or monitoring disease within the community, but rather applying and articulating the significance in a manner that is beneficial.

Conclusions

Internet-based surveillance systems have broader applicability for the monitoring of infectious diseases than is currently recognised. Furthermore, internet-based surveillance systems have a potential role in forecasting of emerging infectious disease events.

Additional file

Additional file 1: Complete tables of results for Google Correlate Searches, Google Trends data, Spearman Correlations and cross correlations.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

GJM and WH developed the original idea for this study. Development of the script for data collection was performed by SMRA. Data analysis was performed by GJM with the assistance of WH, JSB, ST and ACAC. The manuscript was primarily written by GJM with editorial advice from WH, SMRA, JSB, ST and ACAC. All authors read and approved the final manuscript.

Acknowledgments

The salary for GJM was provided through the Australian National Health and Medical Research Council (grant #1002608) and the Australian Research Council (grant #DP110100651). ACAC is funded by an Australian National Health and Medical Research Council Senior Research Fellowship (#APP1058878). JSB is supported by grant 5R01LM010812-04 from the National Library of Medicine. WH is funded by a Queensland University of Technology Vice-Chancellor Senior Research Fellowship. ST is funded by a NHMRC Senior Research Fellowship (#553043).

Author details

1School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia. 2Infectious Disease Epidemiology Unit, School of Population Health, The University of Queensland, Brisbane, Australia. 3Freelance developer, Bundaberg, Australia. 4Research School of Population Health, ANU College of Medicine, Biology and Environment, The Australian National University, Canberra, Australia. 5Department of Pediatrics, Harvard Medical School and Children's Hospital Informatics Program, Boston Children's Hospital, Boston, USA.

Received: 5 December 2014 Accepted: 9 December 2014

References

1. Castillo-Salgado C: Trends and directions of global public health surveillance. Epidemiol Rev 2010, 32(1):93–109.
2. Zeng X, Wagner M: Modeling the effects of epidemics on routinely collected data. J Am Med Inform Assoc 2002, 9:S17–S22.
3. Chan EH, Brewer TF, Madoff LC, Pollack MP, Sonricker AL, Keller M, Freifeld CC, Blench M, Mawudeku A, Brownstein JS: Global capacity for emerging infectious disease detection. Proc Natl Acad Sci USA 2010, 107(50):21701–21706.
4. Milinovich GJ, Williams GM, Clements ACA, Hu W: Internet-based surveillance systems for monitoring emerging infectious diseases. Lancet Infect Dis 2014, 14(2):160–168.
5. Lazer D, Kennedy R, King G, Vespignani A: Big data. The parable of Google Flu: traps in big data analysis. Science 2014, 343(6176):1203–1205.
6. Carneiro HA, Mylonakis E: Google trends: a web-based tool for real-time surveillance of disease outbreaks. Clin Infect Dis 2009, 49(10):1557–1564.
7. Valdivia A, Lopez-Alcalde J, Vicente M, Pichiule M, Ruiz M, Ordobas M: Monitoring influenza activity in Europe with Google Flu Trends: comparison with the findings of sentinel physician networks - results for 2009–10. Eurosurveillance: bulletin europeen sur les maladies transmissibles = European communicable disease bulletin 2010, 15(29):pii=19621.
8. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L: Detecting influenza epidemics using search engine query data. Nature 2009, 457(7232):1012–1014.
9. Chan EH, Sahai V, Conrad C, Brownstein JS: Using web search query data to monitor dengue epidemics: a new model for neglected tropical disease surveillance. PLoS Negl Trop Dis 2011, 5(5):e1206.
10. Zhou XC, Shen HB: Notifiable infectious disease surveillance with data collected by search engine. J Zhejiang Univ-SCI C 2010, 11(4):241–248.
11. Pelat C, Turbelin C, Bar-Hen A, Flahault A, Valleron A: More diseases tracked by using Google trends. Emerg Infect Dis 2009, 15(8):1327–1328.
12. Valdivia A, Monge-Corella S: Diseases tracked by using Google trends, Spain. Emerg Infect Dis 2010, 16(1):168.
13. Andersson T, Bjelkmar P, Hulth A, Lindh J, Stenmark S, Widerstrom M: Syndromic surveillance for local outbreak detection and awareness: evaluating outbreak signals of acute gastroenteritis in telephone triage, web-based queries and over-the-counter pharmacy sales. Epidemiol Infect 2014, 142(2):303–313.
14. Zhou X, Li Q, Zhu Z, Zhao H, Tang H, Feng Y: Monitoring epidemic alert levels by analyzing internet search volume. IEEE Trans Biomed Eng 2013, 60(2):446–452.
15. Wilson K, Brownstein JS: Early detection of disease outbreaks using the internet. Can Med Assoc J 2009, 180(8):829–831.
16. Seifter A, Schwarzwalder A, Geis K, Aucott J: The utility of "Google trends" for epidemiological research: Lyme disease as an example. Geospat Health 2010, 4(2):135–137.
17. Dukic VM, David MZ, Lauderdale DS: Internet queries and methicillin-resistant Staphylococcus aureus surveillance. Emerg Infect Dis 2011, 17(6):1068–1070.
18. Desai R, Hall AJ, Lopman BA, Shimshoni Y, Rennick M, Efron N, Matias Y, Patel MM, Parashar UD: Norovirus disease surveillance using Google internet query share data. Clin Infect Dis 2012, 55(8):E75–E78.
19. Desai R, Lopman BA, Shimshoni Y, Harris JP, Patel MM, Parashar UD: Use of internet search data to monitor impact of rotavirus vaccination in the United States. Clin Infect Dis 2012, 54(9):e115–e118.
20. Samaras L, Garcia-Barriocanal E, Sicilia MA: Syndromic surveillance models using Web data: the case of scarlet fever in the UK. Inform Health Soc Care 2012, 37(2):106–124.
21. Brownstein JS, Freifeld CC, Madoff LC: Digital disease detection – harnessing the Web for public health surveillance. N Engl J Med 2009, 360(21):2153–2155, 2157.
22. Zhou X, Ye J, Feng Y: Tuberculosis surveillance by analyzing Google trends. IEEE Trans Biomed Eng 2011, 58(8):2247–2254.
23. National Notifiable Diseases Surveillance System. [http://www9.health.gov.au/cda/source/cda-index.cfm]
24. Australian national notifiable diseases and case definitions. [http://www.health.gov.au/internet/main/publishing.nsf/Content/cdna-casedefinitions.htm]
25. Perneger TV: What's wrong with Bonferroni adjustments. BMJ: British Medical Journal 1998, 316(7139):1236.
26. Box GE, Jenkins GM, Reinsel GC: Time Series Analysis: Forecasting and Control. New Jersey: Wiley; 2008.
27. Althouse BM, Ng YY, Cummings DA: Prediction of dengue incidence using search query surveillance. PLoS Negl Trop Dis 2011, 5(8):e1258.
28. Choi HY, Varian H: Predicting the present with Google trends. Econ Rec 2012, 88:2–9.
29. Hulth A, Rydevik G: Web query-based surveillance in Sweden during the influenza A(H1N1) 2009 pandemic, April 2009 to February 2010. Eurosurveillance: bulletin europeen sur les maladies transmissibles = European communicable disease bulletin 2011, 16(18):pii=19856.
30. Ortiz JR, Zhou H, Shay DK, Neuzil KM, Fowlkes AL, Goss CH: Monitoring influenza activity in the United States: a comparison of traditional surveillance systems with Google Flu trends. PLoS One 2011, 6(4):e18687.
31. Dugas AF, Hsieh YH, Levin SR, Pines JM, Mareiniss DP, Mohareb A, Gaydos CA, Perl TM, Rothman RE: Google Flu trends: correlation with emergency department influenza rates and crowding metrics. Clin Infect Dis 2012, 54(4):463–469.
32. Watts G: Google watches over flu. BMJ (Clinical research ed) 2008, 337:a3076.
33. McDonnell WM, Nelson DS, Schunk JE: Should we fear "flu fear" itself? Effects of H1N1 influenza fear on ED use. Am J Emerg Med 2012, 30(2):275–282.
34. World Telecommunication/ICT Indicators Database 2013 (17th Edition). [http://www.itu.int/en/ITU-D/Statistics/Pages/publications/wtid.aspx]
35. StatCounter Global Stats - Top 5 search engines in Australia from 2008 to 2013. [http://gs.statcounter.com/#search_engine-AU-yearly-2008-2013]
36. Cook S, Conrad C, Fowlkes AL, Mohebbi MH: Assessing Google flu trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic. PLoS One 2011, 6(8):e23610.
37. Araz OM, Bentley D, Muelleman R: Using Google Flu Trends Data in Forecasting Influenza-Like–Illness Related Emergency Department Visits in Omaha, Nebraska. The American journal of emergency medicine 2014, In Press.
38. Schuster NM, Rogers MA, McMahon LF Jr: Using search engine query data to track pharmaceutical utilization: a study of statins. Am J Manag Care 2010, 16(8):e215–e219.