ViceROI: Catching Click-Spam in Search Ad Networks

Vacha Dave, UC San Diego, vacha@cs.ucsd.edu
Saikat Guha, Microsoft Research India, saikat@microsoft.com
Yin Zhang, Univ. of Texas at Austin, yzhang@cs.utexas.edu

ABSTRACT
Click-spam in online advertising, where unethical publishers use malware or trick users into clicking ads, siphons off hundreds of millions of advertiser dollars meant to support free websites and apps. Ad networks today, sadly, rely primarily on security through obscurity to defend against click-spam. In this paper, we present Viceroi, a principled approach to catching click-spam in search ad networks. It is designed based on the intuition that click-spam is a profit-making business that needs to deliver higher return on investment (ROI) for click-spammers than other (ethical) business models to offset the risk of getting caught. Viceroi operates at the ad network where it has visibility into all ad clicks. Working with a large real-world ad network, we find that the simple-yet-general Viceroi approach catches over six very different classes of click-spam attacks (e.g., malware-driven, search-hijacking, arbitrage) without any tuning knobs.

Categories and Subject Descriptors
K.6.5 [Management of Computing and Information Systems]: Security and Protection, Advertising Fraud

Keywords
Click-Spam, Click-Fraud, Invalid Clicks, Traffic Quality

1. INTRODUCTION
Background and motivation.
Click-spam in online advertising, where unethical publishers (footnote 1) trick users into clicking ads or use malware to click on ads, hurts the online economy by siphoning off millions of advertiser dollars meant to support free websites and apps [27]. Reputed ad networks (footnote 2) attempt to filter click-spam to increase advertisers' confidence in their network [29].

Footnote *: Work done while at Microsoft Research India.
Footnote 1: Publishers are websites, apps, or games that show ads in exchange for a fraction of the revenue generated by ad clicks.
Footnote 2: Ad networks aggregate ads from advertisers and broker them to publishers, e.g., Google AdSense, Bing Ads, Baidu.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CCS'13, November 4–8, 2013, Berlin, Germany.
Copyright 2013 ACM 978-1-4503-2477-9/13/11...$15.00.
http://dx.doi.org/10.1145/2508859.2516688.

Click-spam, however, is an arms race and attacks have evolved to avoid detection [4]. Ad networks today filter click-spam reactively and in an ad-hoc manner: when a specific attack is detected (often by the impacted advertiser), the ad network creates a filter tuned to the detected attack [29]. For example, if an advertiser complains about getting thousands of clicks from a single IP address, none of which convert into paying customers, the ad network may start filtering all clicks from that specific IP address (or that /24 subnet). Reactive filtering harms advertisers since attacks may go undetected for months; in one case it was estimated that click-spammers siphoned off at least $14 million over 4 years before being taken down [5]. Furthermore, ad-hoc point-solutions are quickly circumvented by attackers, e.g., avoiding the IP blacklist by using a distributed botnet, potentially adding months before the attack is rediscovered by a more savvy advertiser. A controlled measurement study conducted in 2012 found that major ad networks still missed ongoing click-spam attacks that accounted for an estimated 10–25% of clicks in the study [4].

Ad networks recognize their point-solutions to be weak and rely primarily on security through obscurity for protection: they fiercely guard their filtering techniques in fear that "unethical [parties] will immediately take advantage of this information to conduct more sophisticated fraudulent activities undetectable by [the ad network]'s methods" [29]. The evolution of click-spam malware, however, demonstrates the futility of relying primarily on security through obscurity. The TDL4 botnet, for instance, which is estimated to have siphoned millions, avoids threshold-based filters by performing only one click per IP address per day (but doing so from millions of bots), avoids browser-signature-based filters by plugging into real browsers, and avoids user-behavior-based filters by gating malware actions on user actions [4].

Approach and contributions. We had three goals in designing Viceroi: 1) proactively filter click-spam attacks, 2) in a way that even after we publicly disclose our approach it is hard (but perhaps not impossible) for click-spammers to circumvent, and 3) simple and performant enough that it can be deployed at Internet scales. Briefly, Viceroi meets these three goals as follows (Section 4 presents design details).

First, proactive filtering requires a very general approach that makes no assumptions about the specific attack mechanism. Thus as the click-spammer evolves the attack mechanisms over time, the basic filtering approach remains unaffected. A test of whether an approach is general enough in practice is to see the diversity of attacks the approach can detect without any tuning parameters. Viceroi detected six very different classes of ongoing click-spam attacks (including malware-driven, search-hijacking, arbitrage, conversion-fraud, ad-injection, and parked-domains) without any tuning knobs. Section 6 presents detailed case-studies.

Second, publicly disclosing the Viceroi approach without weakening it (i.e., rejecting the ad networks' flawed practice of security primarily through obscurity) requires us to focus on invariants: something the click-spammer cannot easily change without undermining his business model. Viceroi is designed around a simple invariant that we identified: a click-spam attack must have a higher return-on-investment (ROI) for the click-spammer than for an ethical publisher, to offset the risk of getting caught. Viceroi, in essence, flags publishers with anomalously high ROI. While publisher ROI is hard to estimate, in practice we found per-user revenue to be a close proxy. To avoid detection by Viceroi, click-spammers must reduce their per-user revenue to that of an ethical publisher, at which point, without the economic incentive to offset the risk of getting caught (by approaches complementing Viceroi), the net effect is a disincentive to commit click-spam.

Finally, to operate at Internet scales, Viceroi must deal with massive volumes of noisy ad network data efficiently and with low false-positives. Viceroi has very good performance on ROC and precision-recall curves (around 90% typically). Furthermore, in simulated attacks, Viceroi has withstood adversaries several times more powerful than the ones known today. We believe Viceroi meets all three goals, as evidenced by a large ad network deploying several aspects of our approach.

2. TERMINOLOGY
At its simplest, online search advertising has three players: advertisers who want to advertise a product or service; publishers that run websites (search engines, news sites, blog sites), mobile apps, and games that display the ads; and ad networks (like Google AdSense, Bing Ads, Baidu, and Yahoo) that connect advertisers with the publishers. There are two kinds of publishers: publishers owned and operated (O&O) by the ad network, e.g., google.com shows ads from Google's ad network, and syndicated publishers not controlled by the ad network, e.g., ask.com is a syndicated publisher for Google ads.

Cost-per-click (CPC) or pay-per-click (PPC) is the dominant charging model for search ads: advertisers pay the ad network only when their ad is clicked. Ad networks typically pay syndicated publishers 70% of the revenue generated by ad clicks on their site. While there are other charging models (e.g., pay per impression, pay per action), we focus solely on pay-per-click search ads in this paper since they dominate online ad revenues and are growing [24].

Click-spam is a click the advertiser pays for where the user did not intend to visit the advertiser's page. It includes clicks through dedicated click-spam malware, accidental clicks, clicks where the user was confused or tricked into clicking, cases where the user clearly intended to go somewhere else (e.g., a navigational query for YouTube in a search engine) but the click is hijacked into an ad click for something unrelated to the query, and so on.

Syndicated publishers, because of the strong financial motive, have been known to fraudulently generate clicks to inflate their paychecks [5]. Note that O&O publishers are unlikely to knowingly generate click-spam since the ad network defrauding its own customers (the advertisers) would generate massive negative PR, resulting in advertisers taking their business elsewhere. Ad networks focused on their long-term reputation (if they are caught being complicit in syndicate-generated click-spam) are driven to filter click-spam and offer discounts to advertisers to reduce the impact of click-spam.

Figure 1: Anatomy of an ad click

Anatomy of an ad click. Figure 1 shows the anatomy of a click. Ad networks provide publishers a library with which to fetch ads. This may be JavaScript the publisher can embed into their website or app, or may be server-side code (e.g., PHP or Java). As shown in step 3, the JavaScript code (running in the user's browser) or PHP/Java code (running in the publisher's webserver; not shown) contacts the ad network's server to fetch a set of ads that it populates the website/app with. The code identifies the publisher when contacting the ad network's server, which logs the request. This is called an ad impression (step 4). Each ad returned (step 5) contains a unique identifier that is used for tracking clicks on that ad.

If the user clicks the ad (step 6), the user's browser (or app) makes an HTTP request to the ad network with the unique identifier for that ad impression (step 7), and the ad network logs the ad click (step 8). The log record contains the unique identifier, which is used to look up the advertiser that will be charged (and how much), the publisher that will be paid, the user that originally fetched the ad (for fraud detection), and so on. Every single ad click is logged by the ad network. The HTTP response to the above request redirects the browser to the advertiser's webpage (typically using HTTP 302 response code; steps 9–11).
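
The impression, click, and redirect steps above can be sketched as follows. This is a minimal illustration that assumes an in-memory log; the names `Impression`, `log_impression`, and `handle_click`, and all field names, are hypothetical and do not describe any real ad network's API.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One logged ad impression (step 4); all field names are illustrative."""
    ad_id: str             # unique identifier returned with the ad (step 5)
    publisher: str         # who is paid if the ad is clicked
    advertiser: str        # who is charged if the ad is clicked
    user: str              # who fetched the ad (kept for fraud detection)
    cost_per_click: float
    landing_url: str


impression_log = {}  # ad_id -> Impression
click_log = []       # one record per ad click (step 8)


def log_impression(imp):
    """Record the impression so a later click can be resolved by its id."""
    impression_log[imp.ad_id] = imp


def handle_click(ad_id):
    """Log the click (step 8) and return an HTTP 302 redirect (steps 9-11)."""
    imp = impression_log[ad_id]
    click_log.append({
        "ad_id": ad_id,
        "publisher": imp.publisher,
        "advertiser": imp.advertiser,
        "user": imp.user,
        "revenue": imp.cost_per_click,
    })
    return 302, imp.landing_url
```

The click record carries exactly the three fields (publisher, user, revenue) that Viceroi later needs as input, which is why the approach can run on ordinary click logs.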

The ad network cannot, in general, track the user's activity on the advertiser site. An advertiser can choose to embed JavaScript code provided by the ad network into certain pages (e.g., payment confirmation page, mailing list subscription page). When (if) the user visits the marked pages (step 14), the JavaScript informs the ad network that the user has performed some action deemed desirable by the advertiser (step 15). This is called an ad conversion. The ad network uses cookies to link conversion events back to the unique identifier of the ad impression and ad click (step 16); the conversion event may take place hours or days after the original ad click.

The ad network uses the conversion signal (or lack thereof) to provide bulk discounts to the advertiser as per the smart-pricing algorithm [6, 8]. The intuition behind smart-pricing is that if a publisher sends traffic that doesn't lead to desirable actions (like buying, email signups), then the traffic is not useful to the advertiser. The smart-pricing algorithm computes a penalty score for syndicated publishers. The less the traffic converts for that publisher, the higher the publisher's penalty score, the larger the discount offered to all advertisers for clicks on their ads when shown on that publisher's site, and thus the less money paid out to that publisher.

3. RELATED WORK
Ad networks and click-spam. Little is known about how ad networks fight click-spam. Commissioned as part of a lawsuit settlement between advertisers and Google, Tuzhilin in [29] reports on his external audit of Google's click-spam filtering system as of July 2006. The system passively analyzes every ad click from log data. It contains several filters, each tuned to catching a very specific attack signature. Viceroi can be implemented as such a filter in an ad network.

Characterizing click-spam. There have been several studies that characterize the nature of click-spam, elaborating on specific attacks [1, 2, 20, 21] after they have been detected. Viceroi differs from these as it flags malicious publishers, regardless of the attack vectors. There has also been some work on the traffic quality provided by purchased traffic [28, 32]. A broader measurement study finds that the current generation of click-spam is generated with a wide variety of bot and non-bot mechanisms where users are tricked into clicking on ads [4]. The study [4] uses bluff ads [10], which is an active measurement technique. Viceroi, on the other hand, is purely passive.

Click-spam detection. Research in click-spam detection has focused almost exclusively on (early generation) bots. Sbotminer [31] detects search engine bots by looking for anomalies in query distribution. Others, such as Sleuth [19] and detectives [18], detect unusual collusion among users associated with diverse publishers (that may be indicative of bot behavior). Premium Clicks [13], Bluff ads [10], and User-Driven Access Control [25] aim to authenticate user presence (as opposed to automated bots) to mitigate click-spam. Viceroi is a more general approach that proactively targets all forms of click-spam including non-bot mechanisms (like arbitrage and search-hijacking) as well as sophisticated bots.

Spam and click-spam. A lot of work has been done to understand the spam ecosystem [14, 17]. While both spam and click-spam are Internet abuses used for profiteering, the economics of spam and click-spam are different. Click-spam through search hijacking requires the spammer to pay 18¢ per install [3] (i.e., $180K for 1M installs in 2011), and nets a fraction of the per-click revenue (typically less than $1) per-user per-day. In contrast, email spam costs almost nothing to send to 1M users ($20 to rent [23]), and nets $30 per victim [17] per-campaign. Spam needs a real product being peddled and a market for the same, while click-spam does not. The low-margin many-users nature of click-spam makes the economics fundamentally different from the high-margin few-victims nature of traditional spam.
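
The contrast in economics can be made concrete with a little arithmetic. The sketch below uses only the prices quoted above (18¢ per install, $20 to reach 1M email users, $30 per spam victim); the per-user daily click revenue passed to `break_even_days` is a hypothetical input, since the text says only that it is a fraction of a per-click revenue that is typically under $1.

```python
CENTS_PER_INSTALL = 18             # search-hijacking install price quoted in the text
INSTALLS = 1_000_000
SPAM_RENTAL_USD = 20               # quoted cost to send email spam to 1M users
SPAM_REVENUE_PER_VICTIM_USD = 30   # quoted revenue per spam victim


def click_spam_upfront_cost_usd(installs=INSTALLS, cents=CENTS_PER_INSTALL):
    """Fixed cost of acquiring the installs, computed in whole dollars."""
    return installs * cents // 100


def break_even_days(revenue_per_user_per_day_usd, installs=INSTALLS):
    """Days of click revenue needed to recoup the install cost.

    The per-user daily revenue is an assumed input, not a figure from the text.
    """
    daily_revenue = revenue_per_user_per_day_usd * installs
    return click_spam_upfront_cost_usd(installs) / daily_revenue


def spam_break_even_victims():
    """Number of paying victims needed to recoup the spam rental cost."""
    return SPAM_RENTAL_USD / SPAM_REVENUE_PER_VICTIM_USD
```

Under an assumed one cent of revenue per user per day, `break_even_days(0.01)` comes to about 18 days across a million users, whereas a spam campaign recoups its rental with a single victim.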

4. VICEROI DESIGN
We begin first with the insight behind Viceroi, followed by the detailed design.

4.1 Insight
As mentioned, our goal is to design a click-spam filtering approach that does not rely on security through obscurity, and cannot easily be circumvented by click-spammers. Past approaches have looked for anomalies in ad impressions, clicks, conversions, browser signatures, timing analysis, user behavior, etc. Unfortunately, none of these are tamper-proof: malware that has complete control of a computer can fake any of these with ease.

Profit. For click-spam to be economically viable, the click-spammer must turn a profit, i.e., the revenue he collects from each click must (on average) cover his costs for generating that click.

Generally speaking, there is a fixed cost and an incremental cost (per-click) for the click-spammer. Click-spammers renting botnets to generate clicks must pay the botmaster. Click-spammers using cheap human labor to generate clicks (in click-farms) must pay the workers. Click-spammers using arbitrage to generate clicks (described later) must pay for cheap ads on a second ad network to acquire users. Click-spammers laundering clicks from adult sites must pay the adult website to acquire clicks [11].

When the click-spammer has control over the user or user's computer (e.g., click-farm or botnet), the fixed cost for getting that control is high (typically, in the tens of cents [15, 22]) but there is little if any incremental cost since the click-spammer can generate as many clicks as needed. When the click-spammer buys individual clicks (e.g., arbitrage, click laundering), there is a per-click incremental cost (typically, on the order of 1¢ [9]) and little if any fixed cost. Revenues are incremental, i.e., the click-spammer makes money for each ad click (there is no fixed component). The click-spammer turns a profit if his fixed costs (amortized over all clicks) plus his incremental costs per-click are (on average) lower than his incremental per-click revenue.
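
The profit condition in the last sentence can be written compactly. The symbols are introduced here only for illustration and are not the paper's notation: F is the fixed cost, n > 0 the number of clicks it is amortized over, c the incremental cost per click, and r the incremental revenue per click.

```latex
\[
  \frac{F}{n} + c \;<\; r
  \quad\Longleftrightarrow\quad
  n\,(r - c) \;>\; F .
\]
```

Read this way, with F and c fixed, the inequality can only be widened by raising n or r.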

Risk. For click-spam to be economically desirable, the click-spammer must turn a higher profit than an ethical publisher to offset the risk of getting caught. Since click-spammers can be legally penalized [5], if a click-spammer were not making higher profits than an ethical publisher, it would be strictly safer for him to make the same profit ethically and not run the risk of legal action. Potential for higher profits gives a click-spammer the incentive for taking higher risk.

Insight: A click-spammer has higher ROI than ethical publishers. In a sense, this higher ROI justifies the higher risk the click-spammer must carry, regardless of the specific mechanism the click-spammer is using to commit click-spam.

4.2 Intuition
As discussed above, there are only four variables that control the click-spammer's profits: (i) fixed and (ii) incremental costs of generating the click, (iii) the number of clicks the fixed cost is amortized over, and (iv) the incremental revenue per click. Of these, the click-spammer cannot control his fixed or incremental costs since they are set by the underground market for purchasing bots, cheap labor, and clicks. To command higher profits than ethical publishers the click-spammer has exactly two options: first, to increase the number of clicks his fixed costs are amortized over; and second, to increase his incremental revenue by clicking on more lucrative ads. More clicks as well as clicks on more expensive ads (per-user) result in more revenue (per-user) as compared to ethical publishers.

Figure 2: Intuition behind Viceroi. Idealized illustration (for clarity) based on actual ad network log data.

In its simplest form, Viceroi could look for higher than expected revenue per user for a given publisher. Putting this into practice complicates matters slightly.

The first complication arises from the diversity in revenue per user for ethical publishers: there is no single number that can serve as our baseline. Furthermore, the vast difference between publisher sizes (ranging from individual blog sites to multi-billion dollar companies) massively skews the data. Surprisingly, we found from data collected at a large ad network that the revenue per user for a diverse set of manually-verified ethical publishers (including a search engine, several blog sites, a content portal, an e-commerce website, and a job listings site) all fall within a narrow range on a log scale, while that for many well-known click-spammers lies well outside this range. Viceroi thus uses an expected log-revenue range per user (learned dynamically from labeled data) as its baseline for ethical publishers.

The second complication arises from click-spammers using a mix of ethical and unethical ways of generating clicks to disguise their operation. For instance, a click-spammer may acquire some organic traffic and supplement it with bot traffic, in effect lowering his overall revenue per user. To account for this, instead of using a single number, Viceroi compares the distribution of revenue per user against a baseline distribution. As illustrated in Figure 2, the expected log-revenue range is expressed as a band around the baseline distribution. The figure is an idealized illustration (for clarity) based on actual log data. In the figure, the solid green lines represent ethical publishers and the shaded region represents the band around this baseline. The verified ethical publishers, we found, agree not only on the log-revenue range; their distributions are fully contained within the band as well, while many click-spammers fall outside the band either entirely or in parts. An ad network can choose to discount either the clicks outside the band, or all clicks from a given publisher.

4.3 Detailed Design
Viceroi has two components: i) an offline part that analyzes (past) click logs over multiple timescales to identify click-spammers and regions in their revenue per user distribution that are anomalous, and ii) an online part that identifies whether a given click would fall in the anomalous region (thus allowing that click to be discounted at billing time).

Inputs. Viceroi requires ad click logs that contain the publisher, user, and revenue for each click. In practice we have found good results for as little as two weeks of past click logs. Viceroi also requires a small diverse set of publishers (around 10) to be identified as ethical publishers, which are used to determine the baseline.

Algorithm. Viceroi performs the following steps in order.
1. For each publisher-user pair, Viceroi computes the log of the sum of ad click revenues generated by the given user on the publisher's site.
2. For each publisher, Viceroi sorts the per-user log-revenue sums and retains a vector of N quantile values. Recall that quantile values are the values of a random variable taken at regular intervals of its cumulative distribution function (CDF). In our evaluation we found N=100 to offer good performance before diminishing returns kick in.
3. For the baseline, Viceroi computes the point-wise average of the quantile vectors for the given set of ethical publishers.
4. Finally, for each publisher Viceroi computes the point-wise difference between the publisher's quantile vector and the baseline quantile vector. The publisher's click-spam score is simply the L1 norm of the difference vector (i.e., sum of the N point-wise differences). Given a threshold τ (which characterizes the width of the band around the baseline), if the click-spam score is higher than Nτ the publisher is flagged, and all quantile points where the point-wise difference exceeds τ are recorded for use in the online component.
5. In the online component, whenever an ad is clicked, Viceroi checks if the publisher is flagged and the user clicking the ad falls in the flagged quantile region. If so, the click is discounted.
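
The five steps above can be sketched in a few functions. This is a minimal illustration, not the deployed system: the function names are hypothetical, quantiles are taken by a simple rank rule on the sorted data, the L1 norm is computed over absolute differences, and the online check locates a user's quantile by binary search in the publisher's own quantile vector. Each of those choices is an assumption where the text leaves the detail open.

```python
import math
from bisect import bisect_right
from collections import defaultdict


def quantile_vector(values, n):
    """Sample n values at regular rank intervals from the sorted data."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n] for i in range(n)]


def publisher_quantiles(clicks, n=100):
    """Steps 1-2: clicks is an iterable of (publisher, user, revenue)."""
    totals = defaultdict(float)
    for publisher, user, revenue in clicks:
        totals[(publisher, user)] += revenue
    logs = defaultdict(list)
    for (publisher, _user), total in totals.items():
        if total > 0:  # the log is undefined for zero revenue
            logs[publisher].append(math.log(total))
    return {p: quantile_vector(v, n) for p, v in logs.items()}


def baseline_vector(quantiles, ethical_publishers):
    """Step 3: point-wise average over the ethical publishers."""
    vectors = [quantiles[p] for p in ethical_publishers]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def score_publisher(vector, baseline, tau):
    """Step 4: L1 score, flag decision, and flagged quantile indices."""
    diffs = [abs(v - b) for v, b in zip(vector, baseline)]
    score = sum(diffs)
    flagged = score > len(diffs) * tau
    region = {i for i, d in enumerate(diffs) if d > tau} if flagged else set()
    return score, flagged, region


def discount_click(user_log_revenue, vector, region):
    """Step 5: does the clicking user fall in a flagged quantile region?"""
    index = max(0, bisect_right(vector, user_log_revenue) - 1)
    return index in region
```

The offline functions would be run periodically over past logs, leaving only the cheap lookup in `discount_click` on the billing path.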

Automatic Parameter Tuning (τ). To automatically learn the optimal value for τ, the ad network configures a target false-positive rate (e.g., 0.5%) and provides some labeled data that contain both positive and negative click-spam cases. The labeled data may be a combination of manual investigations conducted by the ad network, high-confidence output from existing ad network filters, other sources of ground-truth (e.g., Bluff ads [10]), etc. Viceroi then performs a parameter sweep for different values of τ and picks the one that maximizes the number of clicks flagged given the hard constraint on false-positives.
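
A sketch of such a sweep follows. It assumes each labeled publisher is summarized by its mean point-wise difference from the baseline (the click-spam score divided by N, so that "score > Nτ" becomes "mean difference > τ"), its click count, and its ground-truth label; it also assumes the false-positive rate is counted per publisher. The function name `tune_tau` and the input layout are hypothetical.

```python
def tune_tau(publishers, candidates, max_fpr=0.005):
    """Pick the tau that flags the most clicks without exceeding max_fpr.

    publishers: list of (mean_abs_diff, n_clicks, is_spam) tuples.
    candidates: iterable of tau values to sweep.
    Returns (best_tau, clicks_flagged); best_tau is None if no candidate
    satisfies the false-positive constraint.
    """
    negatives = sum(1 for _, _, is_spam in publishers if not is_spam)
    best_tau, best_clicks = None, -1
    for tau in candidates:
        flagged = [(clicks, is_spam)
                   for diff, clicks, is_spam in publishers if diff > tau]
        false_positives = sum(1 for _, is_spam in flagged if not is_spam)
        fpr = false_positives / negatives if negatives else 0.0
        clicks_flagged = sum(clicks for clicks, _ in flagged)
        if fpr <= max_fpr and clicks_flagged > best_clicks:
            best_tau, best_clicks = tau, clicks_flagged
    return best_tau, best_clicks
```

Because the constraint is hard, a looser τ that would flag more clicks is rejected as soon as it pushes the false-positive rate over the configured target.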

5. DEPLOYMENT AND EVALUATION
We partnered with a major ad network to deploy and evaluate our approach. The ad network serves ads to many publishers that cater to both general and niche audiences. The ad network has two tiers of publishers: premium publishers (bound by contracts and SLAs), and self-serve publishers where anyone can sign up.

Figure 3: Performance characteristics of Viceroi, viewed through different lenses, as we perform a sweep over threshold values. Panels: (a) ROC curve (x-axis: false positive rate (%), y-axis: true positive rate (%)); (b) precision-recall curve (x-axis: recall (%), y-axis: precision (%)); (c) ranking curve (x-axis: quality percentile, y-axis: quantity percentile). Arrow marks the threshold picked by our auto-tuning algorithm given a maximum acceptable false-positive rate of 0.5%.

Figure 4: Bluff ad that we ran to augment the ad network's ground-truth heuristic with active measurements.

Data. We use the premium publishers as our set of ethical publishers to establish Viceroi's baseline. Viceroi analyzed logs containing millions of ad click records covering a three-week period in January 2013. Each ad click record contains the publisher, user, revenue, and whether the ad network's internal ground-truth heuristic considered it click-spam. Overall, the raw dataset covers thousands of unique publishers and millions of unique users.

We augmented the ad network's internal ground-truth heuristic using Bluff ads [10]. Bluff ads are ads with nonsense content (e.g., Figure 4). Few, if any, users are expected to intentionally click on bluff ads. And since bluff ads haven't been adopted yet by any major ad network, click-spam attacks have not yet evolved to avoid them [4]. We ran bluff ads after we found Viceroi flagging many publishers that the ad network's ground-truth heuristic did not flag. After manual investigation (with some help from the ad network), we determined the flagged publishers were indeed engaging in click-spam, and we went about acquiring ground-truth through Bluff ads to fill gaps in the ad network's labeling. Overall our bluff ads had over 4.3M impressions and attracted 7K clicks from 5.6K unique IP addresses and 5.8K unique referring domains.

Lastly, we use internal ad network metrics on the performance of nearly a hundred existing filters along two axes: quality and quantity. The lower the false positive score, the higher the quality. And the more clicks flagged, the higher the quantity. We use this to benchmark Viceroi against the industry's state-of-the-art.

Parameters. The only parameter in our approach is the maximum acceptable false positive rate (used for automatically tuning the threshold τ). We perform a full parameter sweep in our evaluation. The ad network indicated it is comfortable with a false-positive rate around 0.5%, i.e., the network is willing to not charge for 0.5% of valid clicks, in effect giving advertisers a 0.5% discount across the board as long as Viceroi demonstrates significantly higher true-positive rates.

Evaluation Metrics. We evaluate our approach against standard metrics for evaluating binary classifiers: true positive rate, false positive rate, precision, and recall. A true positive (TP) is when both Viceroi and ground-truth flag a publisher as click-spam; a true negative (TN) is similarly when both flag it as not click-spam. A false positive (FP) is when Viceroi flags a publisher as click-spam while the ground-truth does not, and vice-versa for false negative (FN). We take the conservative approach and count all misclassifications against Viceroi even though we are aware that the ground-truth data is not perfect. We additionally rank Viceroi's performance against existing ad network filters.
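
For reference, the standard metrics named above can be computed from the four confusion-matrix counts as in this short sketch (the function name is illustrative, and it assumes every denominator is non-zero):

```python
def classifier_metrics(tp, fp, tn, fn):
    """Standard binary-classifier metrics from confusion-matrix counts."""
    return {
        "tpr": tp / (tp + fn),        # true positive rate, same as recall
        "fpr": fp / (fp + tn),        # false positive rate
        "precision": tp / (tp + fp),  # share of flagged items that are spam
    }
```

Note that recall and true positive rate are the same quantity, which is why the ROC and precision-recall views share one of their axes.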

Evaluation. Figure 3 shows the performance characteristics of our approach through various lenses. Each graph conducts a parameter sweep on the threshold value τ. The arrow in each plot marks the optimal value for τ as selected by our auto-tuning approach given a maximum acceptable false-positive rate of 0.5%.

Figure 3(a) plots the ROC curve for Viceroi as the threshold parameter is varied. Each point represents some threshold value given a target false positive rate (footnote 3) (on x-axis); the y-value is the true positive rate (footnote 4) at that threshold. The diagonal line represents the ROC curve for a completely random classifier. The ideal operating point is the upper-left corner. As is evident from the figure, Viceroi performs quite well: at 0.5% false positive rate, it achieves 23.6% true positive rate.

Figure 3(b) plots Viceroi's precision-recall curve as the threshold parameter is varied. Recall (same as true-positive rate) tracks what fraction of click-spam we catch. Precision (footnote 5) tracks the fraction of true positives in everything we catch, i.e., the more false positives we admit for a given recall, the lower the precision. The ideal operating point is anywhere close to the top edge (footnote 6). Our highest precision on the dataset is 98.6% at a recall of 2.5%. At the operating point chosen by our tuning algorithm we have a precision of 88.3% and a recall of 23.6%.

Footnote 3: False positive rate, $\mathrm{FPR} = \frac{FP}{FP + TN}$.
Footnote 4: Recall = true positive rate, $\mathrm{TPR} = \frac{TP}{TP + FN}$.
Footnote 5: $\mathrm{Precision} = \frac{TP}{TP + FP}$.
Footnote 6: Note Viceroi complements existing ad network filters. A false negative for Viceroi, while sub-optimal, is acceptable because another filter can still flag it.

Figure 3(c) ranks Viceroi against the existing ad network filters. The x-value of any point is its quality percentile, i.e., the fraction of ad network filters with a higher false positive rate than that approach. The y-value is similarly the quantity percentile, i.e., the fraction of filters catching fewer clicks than that approach. The isolated points plot the ranking of the ad network filters, and the line plots Viceroi's ranking as we vary the threshold.
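
The two percentiles defined above reduce to simple counts. A minimal sketch, with hypothetical function names and the assumption that the comparison set is the list of existing filters:

```python
def quality_percentile(fpr, existing_filter_fprs):
    """Percent of existing filters with a higher false positive rate."""
    worse = sum(1 for other in existing_filter_fprs if other > fpr)
    return 100.0 * worse / len(existing_filter_fprs)


def quantity_percentile(clicks_flagged, existing_filter_clicks):
    """Percent of existing filters that catch fewer clicks."""
    fewer = sum(1 for other in existing_filter_clicks if other < clicks_flagged)
    return 100.0 * fewer / len(existing_filter_clicks)
```

Sweeping τ changes both Viceroi's false positive rate and the number of clicks it flags, which traces out the line in the ranking plot.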

The solid gray diagonal lines divide the plot into three regions: points in the upper-right region are high performance filters that achieve either high quality percentile and reasonable quantity percentile, or vice versa. The middle region has moderate performance filters that achieve reasonable quality and quantity percentiles. And the lower-left region has the remaining low performance filters. The ideal operating point is the top-right corner, but there is no approach that simultaneously has the best rank along both the quality and quantity axes. The filter with the highest quality score has a quantity percentile of 12, while the filter with the highest quantity score has a quality percentile of 32.

For most threshold values Viceroi operates in the high performance region of Figure 3(c). At the operating point chosen by our auto-tuning algorithm, Viceroi has a quality percentile of 73 and a quantity percentile of 98. There is only one existing ad network filter in our dataset that performs better than Viceroi (i.e., to the right of the dotted diagonal line passing through the arrow). The filter targets a very specific click-spam attack signature in traffic originating from a particular IP address range.

Overall we find that Viceroi has very good precision-recall and ROC characteristics, and at the operating point picked by our auto-tuning algorithm ranks among the best existing ad network filters while being far more general.

6. CASE-STUDIES
Viceroi flagged several hundred publishers out of the tens of thousands provided. Working with the ad network we manually investigated around a hundred websites associated with the publishers we flagged. Based on manual investigations Viceroi appears to have caught at least six (very) different classes of click-spam (one of which the ad network had previously not seen an example of), and caught at least three different publishers in each class. So far we have manually investigated less than a tenth of the websites Viceroi flagged. We did not encounter any obvious false positives out of the publishers we investigated, though there are several where we do not fully understand their modus operandi yet.

6.1 Conversion-Spam Enhanced Click-Spam
What: Conversion-spam is a technique used by click-spammers to increase the potency of their click-spam attacks, as we describe below. Recall from Section 2 that ad conversion events are logged when a user performs some desirable action on the advertiser's site, and smart-pricing penalizes publishers that result in poor conversion rates. Conversion-spam takes advantage of the fact that smart-pricing, which reduces the click-spammer's revenue, relies on the absence of conversion signals, which are simply HTTP requests initiated from the user's browser (Figure 1) that malware can manipulate.

Conversion-spam. This sets the stage for conversion-spam as predicted in [29]. A click-spammer who sends clicks, but not conversions (i.e., buyers), eventually gets smart-priced. If such a click-spammer were to somehow trigger conversion signals on the advertiser's site, the ad network would be led to believe that the traffic is of good quality and not activate the smart-pricing discount, thus resulting in higher profits for the click-spammer.

Viceroi flagged several websites that are either confirmed or highly likely to be engaging in conversion-spam (based on the evidence we present below). In fact, Viceroi found three distinct approaches to committing conversion-spam among the websites we investigated (footnote 7). Two of these approaches had previously not been seen operating in the wild. We have presented our investigation results to multiple ad networks.

Why high ROI: Conversion-spam disproportionately increases the ROI of any given click-spam approach. This is because the fraudulent conversion signals deactivate the publisher smart-pricing discount for not just the advertiser that suffered from conversion-spam, but rather for all advertisers whose ads show up on the publisher's website. Thus, a small amount of conversion-spam can cause a significant boost in ROI for the click-spammer. The ingenuity of the conversion-spam approaches below simply underscores our insight that click-spammers will maximize their profits in any way they can.

Some that we catch: Proving conversion-spam is hard because ad networks receive essentially a single-bit conversion signal from the advertiser with absolutely no visibility into what that bit means (i.e., newsletter sign-up or actual sale). Advertisers typically do not have systems sophisticated enough to catch conversion-spam in real-time.

We use a novel technique for attracting conversion-spam. Building upon the Bluff ads approach by Haddadi et al. [10], we design what we call Bluff forms. Bluff forms are forms on pages with nonsense content that ask the user for nonsense information. These forms are set as the landing page for a Bluff ad which is known to concentrate click-spam traffic.

Figure 5 shows a screenshot of our bluff form: it asks the user for nonsensical information (mobile pen number, computer eigen name, and eyelid email) on a page titled Computer Repair via Mobile English that users reach after clicking the over attached zurlite ad (Figure 4). In other words, complete nonsense.

734 users submitted our bluff form in 26 days. We heavily instrumented our bluff form using JavaScript to gather user activity telemetry and logged all HTTP traffic to the server that hosted the bluff form. We then manually investigated the publishers that sent us these users. We identified three distinct classes of conversion-spam. Viceroi flagged publishers associated with the domains we received bluff form submissions from.

Type 1: Mostly-Automated (malware driven). We received 315 and 107 bluff form submissions from traffic coming from Reeturn.com and AffectSearch.com respectively. Later, in Section 6.4 we find both these publishers use the ZeroAccess malware for click-spam; the ZeroAccess malware family is known to embed a browser control that allows the malware to run JavaScript. The time spent on the bluff form by both sets of traffic is uniformly distributed between exactly 60s–160s; it perfectly fits the line 60+100x between x=[0,1] (with correlation coefficient r=0.98 for AffectSearch.com and r=0.99 for Reeturn.com), i.e., the malware waits exactly 60+random(100) seconds. After this delay the form is submitted without entering any input. We infected a honeypot with a ZeroAccess binary we found online and observed the bot funneling clicks on ads on both AffectSearch.com and Reeturn.com.

Footnote 7: We also detected a fourth approach that we are currently in the process of compiling conclusive evidence about.

Figure 5: Bluff form we used to catch conversion-spam.
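
One way to run the timing check described above is to sort the observed dwell times, pair each with its normalized rank x in [0, 1], and fit a line. The sketch below does this on simulated data; it is an assumption about how such a fit could be done, not the authors' analysis code, and the function names and the simulated sample are hypothetical.

```python
import random


def line_fit(xs, ys):
    """Least-squares slope and intercept, plus Pearson correlation r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def dwell_time_fit(dwell_seconds):
    """Fit sorted dwell times against their normalized rank in [0, 1]."""
    ordered = sorted(dwell_seconds)
    n = len(ordered)
    xs = [i / (n - 1) for i in range(n)]
    return line_fit(xs, ordered)


# Simulated stand-in for malware that waits 60 + random(100) seconds.
rng = random.Random(0)
simulated = [60 + rng.uniform(0, 100) for _ in range(315)]
slope, intercept, r = dwell_time_fit(simulated)
```

A uniform delay produces a slope near 100, an intercept near 60, and a correlation close to 1, while human dwell times would be expected to bend away from a straight line.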

Type 2: Semi-automated (potentially, click farm). We received 10 bluff form submissions from a family of parked domain websites like JJBargains.com. Looking among the domains associated with this publisher that Viceroi flagged, we noticed that only a small set of users appear (repeatedly) to be clicking on ads shown by this publisher, and these users do not appear to click ads for any other publishers in the dataset. Interestingly, some (but not all) users present a malformed user-agent string. All users filled out neatly formatted phone numbers for mobile pen number and neatly capitalized Caucasian female first names for computer eigen name; in contrast, most other non-empty submissions on the bluff form (which we assume were curious users) filled in a random assortment of characters. Given the small number of users, clicking ads on a single publisher, filling forms in a standardized but human-like manner, and presenting malformed user-agent strings, we suspect this publisher is using a click-farm with custom software that assists human clickers in performing click-spam and conversion-spam.

Type 3: Massively crowd-sourced. We flagged some domains associated with an unnamed publisher. We did not receive Bluff form submissions from this publisher; the ad network informed us that they had terminated their relationship with the unnamed publisher before we conducted our Bluff form experiment. The publisher is a large online gambling site that offers users free virtual chips if they click on ads and "fill any forms" on the landing page.

Remedy. Bluff forms are relatively easy to avoid (once click-spammers wise up to them) and thus are of use only in the short term and at small scales to smoke out some instances of conversion-spam. The fundamental problem stoking conversion-spam, however, is its connection to smart-pricing, which creates an economic incentive for conversion-spam. We believe the best way to root out conversion-spam is for the smart-pricing algorithm to consider only conversion signals that require the user to actually make a non-trivial purchase on the advertiser site (similar to the proposal in [13]), since it would create an economic burden for the click-spammer. Coordinating such a scheme across advertisers is, however, likely to be challenging.

Figure 6: Buzzdock injecting ads into a search result page. The original search results are pushed down and an (irrelevant) ad occupies prime on-screen real-estate even when the search engine chose to not show any ads for the query.
6.2 Ad Injection

What: Normally the publisher website controls where, how many, and what ads are shown on that website by inserting iframes or using JavaScript. An ad injector is a party unaffiliated with the publisher website that modifies the website as seen by the user by either inserting ads where there were none, or by replacing the ads added by the publisher with ads the ad injector wants to show instead. These modifications can be done from within the user's browser (if the ad injector is a browser plugin), or can be done through in-network elements that perform deep packet inspection.

To the ad network, an ad injector appears as simply another publisher. Any clicks on ads injected by the ad injector are accounted towards the ad injector's payout, and the legitimate publisher whose website was modified makes no money from the ad click. Phorm and NebuAd (now defunct) were two for-profit companies that created in-network ad injection middleboxes, deployed by some ISPs, that injected ads into websites belonging to non-profit organizations [12]. While these in-network ad injectors lost the battle (due to the ISPs suffering a PR backlash), the battle seems to now have moved into the users' browsers.
Why click-spam: By showing ads on a publisher site where a user expects some other content he is highly likely to click, ad injectors confuse users (and advertisers end up paying for it). Consider, for instance, a user searching for acm membership with the expectation that either the first search result or the first ad result (chosen by his preferred search engine) will take him to his intended destination. Because of the ad injector Buzzdock, he is instead presented the search-results page in Figure 6, where prime on-screen real-estate, the position of the first search result, now shows an entirely irrelevant ad (even when the original search engine chose to not show any ads for this query). If the user clicks the first blue link, perhaps reflexively, the advertiser must pay for a spam click. Other sites where we've found Buzzdock injecting ads include Amazon and eBay search results (where the ads are formatted to match the site content, but take users away from the site after the users intentionally searched on the shopping site), as well as search results on Yelp, YouTube, Wikipedia and other high-traffic sites.
Why high ROI: Ad injectors have an anomalously high ROI per user because for the traffic acquisition cost of installing a single browser plugin (25 per install [15]) they can inject ads into prime on-screen real-estate across the entire web, and collect money from all clicks, intentional or not.

Some that we catch: Viceroi flagged traffic from the following ad injectors:

Buzzdock. Browser plugin typically bundled with freeware or adware software found online (e.g., PDF readers); installed by default with the host software (footnote 8); and not removed when the host software is uninstalled. Ads are formatted to match the look-and-feel of the site into which ads are injected.

Wajam and B00kmarks are two others that follow a business model identical to Buzzdock's.
Remedy. In the short term, ad networks for whom these ad injectors are publishers can filter their clicks (and cut off their revenue) if the ad injectors are in violation of ad network policy. For ad networks where ad injectors are compliant with policy, PR pressure or advertiser outrage may help convince these ad networks to change policy (as happened with ISPs and in-network ad injection). In the long term, legal precedent may create a strong disincentive for business models that deprive legitimate publishers of advertising revenue. Towards this end, Facebook is currently litigating against Sambreel Holdings, the company behind Buzzdock and PageRage, the latter being an ad injector that injected ads into the Facebook site.
6.3 Search Hijacking

What: Search hijacking refers to some party unexpectedly redirecting the user's search queries away from their preferred search engine to a page full of ads formatted to look like search results. The search hijacker earns revenue from each ad click. The hijacking may be performed through in-network elements (e.g., ISP DNS servers), in-browser elements (e.g., plugins and toolbars), or by deceiving or confusing the user into changing their browser search settings.
Why click-spam: Search hijacking hijacks search queries regardless of whether the search query is navigational (i.e., queries for a specific site, e.g., youtube), informational (i.e., broad queries with multiple potential intents, e.g., bay area), or transactional (i.e., queries with commercial intent, e.g., san francisco hotel). Navigational and informational queries (estimated to be 75% [26] of search queries) are hard to monetize. Advertisers rely on the search engine to not show their ads for such queries, and reputed search engines use the opportunity to present a more pleasing user experience by not showing ads for these queries. Search hijackers, on the other hand, bombard the user with ads for these queries and make advertisers pay for the resulting clicks. That said, this is a gray area since the user (presumably) read the ad before deciding to click on it (or so search hijackers argue).

In practice, search hijackers make the situation significantly less gray by explicitly increasing the likelihood that the user will unintentionally click on ads. Not only are the ads typically shown on a white background mimicking organic search results (while the convention is to use shaded backgrounds for ads), but accidental clicks anywhere in vast areas of white-space (see Figure 7) also result in an ad click.

Footnote 8: Ad injectors typically argue that users consented to installing it; however, an overwhelming fraction of users with ad injectors are either entirely unaware of them or unaware of what they do [7].

Figure 7: Search hijacking by the Scour toolbar. Ads (indistinguishable from search results) are shown for queries including navigational and informational queries. Accidental clicks on white-space result in an ad click. For query yutoube, the first link (an ad) goes to a spyware download.
Why high ROI: Search hijackers get as much traffic as a legitimate search engine would, but where a legitimate search engine has far more organic search clicks than ad clicks, search hijackers extract predominantly ad clicks from that traffic. Thus for the cost of acquiring a single user, search hijackers reap orders of magnitude more ad clicks than a legitimate search engine.
Some that we catch: Viceroi flagged traffic from three different classes of search hijacking, and multiple publishers in each class:

Type 1: In-network hijacking (of DNS NX records). Viceroi flagged traffic from at least two large US ISPs (RoadRunner by Time Warner Cable, and Cox Communications) where the DNS servers operated by the ISPs appear to hijack DNS NX responses (i.e., for non-existent domain names) and redirect the browser to a search hijack page with the non-existent domain as the search query. These queries are, by definition, navigational queries. The results page is full of (irrelevant) ads even when the query is an obvious typo for a specific site.
Type 2: In-browser hijacking (via toolbars). Viceroi flagged traffic from a number of toolbars that hijack search queries entered in the browser's search box or address bar. These include SmartAddressbar, BenefitBar, CertifiedToolbar, SearchNut and many others. They are installed stealthily (bundled with freeware) and are hard to remove. The hijacked search results could easily be mistaken for a Google search results page at first glance, with upwards of ten ads and few, if any, actual search results.

SearchNut is unique in that it combines the DNS NX behavior above with in-browser hijacking. If the domain does not exist, the toolbar intercepts the NX (in the browser) and redirects the browser to a page laden with ads.
Type 3: Default search hijacking. Viceroi flagged traffic from some sites that present a popup which, if the user clicks it, sets the site as the default search engine for the user. These include Scour, Efacts, and ClickShield. These sites also offer to change the user's homepage to their search engines.
Remedy. Legitimate competition in web search is good. However, these "search engines" appear to exist for the sole purpose of showing ads and not for innovating in web search (indeed some don't even show organic results). Any action a large search ad network might take against them would likely be construed as an act of stifling competition. Advertisers (the parties hurt most by having their ads shown for navigational and informational queries) are in a better position to fix the problem. One approach may be for advertisers to demand the ability to opt out of having their ads shown by search hijackers.
6.4 Malware, Arbitrage, and Parked Domains

Lastly, Viceroi caught three additional classes of click-spam driven by malware, arbitrage, and parked domains. These three classes of click-spam were previously mentioned in [4], where the authors used ad-hoc techniques to find an example of each. Viceroi not only detected these three classes in a general manner, it flagged traffic from at least three separate instances of each of these three classes.
Malware. It is well-known that some click-spammers use infected hosts to click on ads on their site. These click-spammers have a high ROI because botnets are practically a commodity. The authors discuss the super stealthy TDL4 botnet in [4]. We flagged traffic coming not only from a TDL4 botnet, but also from a second botnet called ZeroAccess. We infected a VM with a ZeroAccess malware binary and found it to be far more aggressive than TDL4 in that ZeroAccess performed many clicks a day as compared to TDL4's stealthy one-click-per-day. ZeroAccess very deliberately striped its clicks across a large number of big and small ad networks, and across many publisher websites. We suspect that where TDL4 achieves stealthiness in the time domain, ZeroAccess does the same by spreading the load. ZeroAccess, which is newer than TDL4, apparently reuses many TDL4 components [30].
Viceroi flagged clicks from many of the publisher websites that we noticed our ZeroAccess bot clicking on. This includes, as mentioned, AffectSearch and Reeturn, which have a 36.11% overlap in users (strongly suggesting that they use the same botnet). Recall that Viceroi is based purely on ROI distributions and is entirely oblivious to user overlap; this overlap thus represents additional validation that Viceroi is effective. Other websites that Viceroi flagged that have high overlap with AffectSearch include BuscarLatam (footnote 9) and FreeSearchBuddy (78.87% and 40% overlap respectively).
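For illustration, a user-overlap figure of this kind could be computed as sketched below. The text does not define the overlap metric, so the definition used here (shared users as a percentage of the smaller publisher's user set) is an assumption, as are the function name and the toy user identifiers.

```python
def user_overlap_pct(users_a, users_b):
    """Shared users as a percentage of the smaller publisher's user set.

    Assumption: the paper does not state its overlap definition; this is one
    plausible choice, used here only to make the idea concrete.
    """
    a, b = set(users_a), set(users_b)
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / min(len(a), len(b))


# Toy identifiers (hypothetical); real inputs would be the ad network's user IDs.
publisher_a = ["u1", "u2", "u3", "u4", "u5"]
publisher_b = ["u4", "u5", "u6", "u7"]
overlap = user_overlap_pct(publisher_a, publisher_b)  # 2 shared / 4 users = 50.0
```

Because Viceroi never looks at user identities, a high value from a check like this is independent evidence that two flagged publishers draw on the same traffic source.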
Observe that botnets are becoming a commodity service as we find large "service providers" catering to a broad customer base. This is bound to drive (bot) traffic acquisition cost down still further, increasing click-spammer profits. In the next section we simulate some straw-man scenarios involving massive botnets and examine whether our approach can still catch them.
Arbitrage. Some click-spammers acquire (cheap) traffic by running ads for low-popularity keywords on one ad network, and then showing clicking users (more expensive) ads from a different ad network [4]. These click-spammers manage a high ROI by buying low-cost traffic and selling high-payout ads. Ad networks penalize publisher websites that show too many ads on the landing page. This penalty is manifested as a higher cost-per-click for the advertiser (in this case, higher traffic acquisition cost for the click-spammer). The click-spammers get around this penalty by cloaking their landing page: when the ad network's crawler or review teams visit the page the click-spammer shows a page without ads, but when a user clicks their ads the page shows (almost exclusively) just ads. Viceroi flagged clicks from the starprices.co.uk family of websites (Figure 8), and the savingcentral.co.uk family of websites, which we confirmed to be arbitrage.

Footnote 9: A Spanish-language search engine that initially frustrated our investigation attempts due to the language barrier.

Figure 8: Arbitrage by starprices.co.uk. Original page has no ads. User sees ads in prime screen real estate when coming in from ads, along with attractive green buttons.
Parked Domains. Lastly, parked domain hosting services have high ROI because they have minimal traffic acquisition costs: domains are registered by someone else before they are parked with the provider, the domains receive traffic from users mis-typing (or clicking on links elsewhere on the web to now-defunct domains), and the provider can serve dynamically generated ad-laden pages for an arbitrary number of domains from a single server. Viceroi flagged clicks coming from a large number of parked domains hosted on Sedo (also called out by [4]), Skenzo, and Parked.com.
7. DISCUSSION

While Viceroi catches a diverse range of existing attacks, a natural next question is how click-spam may evolve. Given the insight in Section 4 that click-spammers must have higher ROI than ethical publishers, we believe the core Viceroi approach (of comparing publisher revenue-per-user distributions against a benchmark set of ethical publishers) will still be sound, but the sensitivity to finer details like the (at-present auto-tuned) τ threshold may increase as click-spammers accept lower revenues so they can play within the margins.
Sybil Publishers. To avoid detection by Viceroi, one way to reduce (apparent) revenue per user is for a publisher to appear as multiple publishers (Sybils), each making a fraction of the original revenue. Indeed such attempts have been reported [4]. If the Sybils share the same bank account to receive ad network payments, they can be trivially recombined. Acquiring multiple bank accounts to receive payments is a high-overhead (and high-risk) task [16].
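The recombination step can be sketched as a group-by on the payout account. This is only an illustration under assumed inputs: the record layout (publisher id, payout account, revenue, set of users) and all names are hypothetical, not the ad network's schema.

```python
from collections import defaultdict


def recombine_sybils(publishers):
    """Merge publisher records that are paid into the same account.

    publishers: iterable of (publisher_id, payout_account, revenue, users)
    where users is a set of user identifiers (hypothetical layout).
    Returns {payout_account: {"ids", "revenue", "users", "revenue_per_user"}}.
    """
    merged = defaultdict(lambda: {"ids": [], "revenue": 0.0, "users": set()})
    for pub_id, account, revenue, users in publishers:
        entry = merged[account]
        entry["ids"].append(pub_id)
        entry["revenue"] += revenue
        entry["users"] |= set(users)  # users shared across Sybils count once
    for entry in merged.values():
        n_users = len(entry["users"])
        entry["revenue_per_user"] = entry["revenue"] / n_users if n_users else 0.0
    return dict(merged)


# Two Sybils monetizing the same two users, plus one unrelated publisher.
records = [
    ("sybil-1", "acct-X", 50.0, {"u1", "u2"}),
    ("sybil-2", "acct-X", 50.0, {"u1", "u2"}),
    ("honest", "acct-Y", 10.0, {"u3", "u4", "u5", "u6", "u7"}),
]
combined = recombine_sybils(records)
```

In this toy input each Sybil alone shows 25.0 revenue per user, while the recombined account shows 50.0, which is the quantity a revenue-per-user comparison would need to see.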
Sybil Users. Another way to reduce (apparent) revenue per user is for the click-spammer to make each user he controls appear as multiple users. Note that this approach does not apply to click-spam mechanisms where the click-spammer does not have the ability to run arbitrary code from the user's device (e.g., arbitrage, parked domains, and in-network search hijacking). Even when the click-spammer can run arbitrary code, whether he can successfully inflate the user count depends on how the ad network counts users. Certain user identifiers like IP addresses are hard to fake (footnote 10).

Figure 9: Impact of a click-spammer ethically acquiring traffic. Viceroi catches the click-spammer as long as more than half the traffic is click-spam.
Collusion. One way to play within the margins is for the click-spammer to collude with an ethical publisher. The idea is for the click-spammer to add ethically-acquired cover traffic to avoid detection. We simulate such an arrangement by pairing a click-spammer from our dataset with a randomly chosen ethical publisher from the dataset with roughly the same number of users. We perform a parameter sweep where the click-spammer replaces x% of his users with users acquired from the (now no-longer) ethical publisher, with x ranging from 0% to 100%. At x=0 the simulated publisher is identical to the original click-spammer and Viceroi flags it. At x=100 the simulated publisher is identical to the original ethical publisher, and Viceroi doesn't flag it. We are interested in learning at what point the transition occurs.

Figure 9 shows the position of the simulated publisher relative to the auto-tuned value of τ; positions to the left of τ are flagged by Viceroi as click-spam, and positions to the right of τ are not. We find that as the simulated publisher gradually adds more ethically acquired users, his position drifts closer to the τ threshold, falling right on the boundary when the simulated publisher has a roughly 50-50 split between ethical clicks and click-spam. As the fraction of ethical clicks starts dominating, Viceroi stops flagging the publisher.

We believe this behavior is desirable for the ad network since it creates a positive incentive for click-spammers to reform their ways. Whereas a click-spammer would make no revenue from click-spam if he were to operate in the shaded region, if he were to grow his users in line with how an ethical publisher acquires users, he would exit the shaded region and start making money (ethically). Over time the threshold may be moved farther to the right to further incentivize good behavior.

Footnote 10: Note that IP spoofing is not an option since all communication between the user device and the ad network goes over HTTP, which requires the user device to be able to receive and respond to inbound TCP packets from the ad network.

Figure 10: Impact of click-spammer growing botnet size. Viceroi continues to catch the click-spammer up to two orders of magnitude increase in botnet size.
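The collusion sweep described above can be sketched as follows. This is a simplified illustration, not the simulation code: each user is reduced to a single revenue value, the detector is abstracted as a caller-supplied `score_fn` (here just the mean revenue per user, an assumption standing in for Viceroi's distribution comparison), and all names and toy numbers are hypothetical.

```python
import random


def mix_publishers(spam_users, ethical_users, x_pct, rng):
    """Replace x_pct% of the spammer's users with users from the ethical publisher."""
    n = len(spam_users)
    k = round(n * x_pct / 100)  # number of users to swap out
    kept = rng.sample(spam_users, n - k)
    borrowed = rng.sample(ethical_users, min(k, len(ethical_users)))
    return kept + borrowed


def sweep(spam_users, ethical_users, score_fn, steps=range(0, 101, 10), seed=0):
    """Score the simulated publisher at each mixing level x (in percent)."""
    rng = random.Random(seed)
    return [
        (x, score_fn(mix_publishers(spam_users, ethical_users, x, rng)))
        for x in steps
    ]


def mean_revenue(users):
    """Placeholder score (assumption): average revenue per user."""
    return sum(users) / len(users)


# Toy per-user revenues: the spammer earns far more per user than the ethical site.
spam = [10.0] * 100
ethical = [1.0] * 100
results = dict(sweep(spam, ethical, mean_revenue))
```

With these toy inputs the score moves linearly from the spammer's value at x=0 to the ethical publisher's value at x=100; where the real detector's decision flips depends on the distributions and on τ, which this sketch does not model.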
Brute Force. Another way to play within the margins is for the click-spammer to dramatically increase the size of the botnet while making the bots click less. This has the overall effect of increasing fixed costs while holding revenue constant, in essence decreasing the revenue per user, which is necessary for the click-spammer to exit the shaded region in Figure 2 to avoid getting flagged. To determine how much larger a botnet the click-spammer needs, we simulate botnets up to two orders of magnitude larger than those of click-spammers flagged by Viceroi. We are interested in learning how much head-room is present in Viceroi's current choice of threshold τ.

Figure 10 shows the position of the simulated click-spammer relative to the auto-tuned value of τ. The click-spammer's current botnet size (labeled as 1x) is comfortably in the region flagged by Viceroi. As we increase it by an order of magnitude, the simulated click-spammer moves closer to the τ threshold. With a botnet two orders of magnitude larger, the click-spammer is on the borderline. Beyond this, Viceroi's current choice of τ does not flag the spammer. Note that in the process the click-spammer's fixed costs increase commensurately by two orders of magnitude while revenue is held constant; i.e., the click-spammer's profits drop by up to 99% in the process. We cannot answer, however, whether click-spam through botnets will remain economically viable even after a two orders of magnitude drop in profits. Nevertheless, we believe that the learned threshold τ has sufficient head-room when dealing with significantly larger botnets than today.
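The arithmetic behind the brute-force scenario can be made concrete with a small sketch. It assumes fixed cost is proportional to botnet size and that total revenue stays constant; the function name and the dollar figures are hypothetical, chosen only so the ratios are easy to read.

```python
def scale_botnet(revenue, bots, cost_per_bot, k):
    """Economics of a botnet grown k-fold while total revenue is held constant.

    Assumption: fixed cost scales linearly with the number of bots.
    """
    new_bots = bots * k
    fixed_cost = new_bots * cost_per_bot
    return {
        "bots": new_bots,
        "revenue_per_bot": revenue / new_bots,
        "fixed_cost": fixed_cost,
        "revenue_per_cost": revenue / fixed_cost,
    }


# Hypothetical baseline: 100 bots earning 1000.0 in total, costing 1.0 each.
base = scale_botnet(1000.0, 100, 1.0, k=1)
big = scale_botnet(1000.0, 100, 1.0, k=100)

# Fractional drop in revenue per bot (and per unit of cost) is 1 - 1/k.
drop = 1 - big["revenue_per_bot"] / base["revenue_per_bot"]
```

At k=100 the revenue earned per bot, and per unit of fixed cost, is 1/100 of the baseline, i.e., a 99% reduction, which is the scale of loss the spammer must absorb to reach the borderline in Figure 10.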
8. SUMMARY

In this paper we present Viceroi, a general approach to catching click-spam. It is designed around the invariant that click-spam is a business (for click-spammers) that needs to deliver high ROI to offset the risk of getting caught. We evaluate our approach on a large real-world ad-network dataset and find six different classes of click-spam linked to conversion fraud, ad injection, search hijacking, malware, arbitrage, and parked domains. We additionally find evidence of many sub-classes of these types, including automated and semi-automated conversion fraud and hijacking through DNS interception, and find multiple publishers benefiting from each of these models. The Viceroi approach flags click-spam through all these mechanisms without any tuning knobs, has good performance on ROC and precision-recall curves, and is resilient against click-spammers using larger botnets over time. Furthermore, our approach is ranked among the best existing filters deployed by the ad network today while being far more general. We additionally present the novel bluff form technique for catching conversion fraud.
Acknowledgements

We'd like to thank the anonymous reviewers and our shepherd, Vyas Sekar, for their comments. We'd also like to acknowledge Geoff Voelker for his feedback on the paper. The paper is much improved because of their inputs and suggestions. Additionally, we are greatly indebted to Jigar Mody, Dennis Minium, Shiva Nagabhushanswamy, Tommy Blizard and Nikola Livic, without whose help and inputs this work would not have been possible.
9. REFERENCES

[1] Alrwais, S. A., Gerber, A., Dunn, C. W., Spatscheck, O., Gupta, M., and Osterweil, E. Dissecting Ghost Clicks: Ad Fraud Via Misdirected Human Clicks. In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC) (Orlando, FL, 2012), pp. 21–30.
[2] Blizard, T., and Livic, N. Click-fraud monetizing malware: A survey and case study. In Proceedings of the 7th International Conference on Malicious and Unwanted Software (MALWARE) (Fajardo, PR, Oct. 2012), pp. 67–72.
[3] Caballero, J., Grier, C., Kreibich, C., and Paxson, V. Measuring Pay-per-Install: The Commoditization of Malware Distribution. In Proceedings of the 20th USENIX Security Symposium (San Francisco, CA, Aug. 2011).
[4] Dave, V., Guha, S., and Zhang, Y. Measuring and Fingerprinting Click-Spam in Ad Networks. In Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication (SIGCOMM) (Helsinki, Finland, Aug. 2012), pp. 175–186.
[5] FBI. Operation Ghost Click: International Cyber Ring That Infected Millions of Computers Dismantled. Federal Bureau of Investigation Press Releases (Sept. 2011). http://1.usa.gov/12c8Vhr.
[6] Google Inc. About smart pricing. AdWords Help (Apr. 2013). http://bit.ly/XObpxY.
[7] Google Inc. buzzdock. Google Search (May 2013). http://bit.ly/17MoGPq.
[8] Google Inc. How Google uses conversion data. AdWords Help (Mar. 2013). http://bit.ly/YJHUnF.
[9] Google Inc. Payment Options and Minimum Payment Amounts. Google AdWords (May 2013). http://bit.ly/XZhRmH.
[10] Haddadi, H. Fighting Online Click-Fraud Using Bluff Ads. Computer Communication Review (CCR) 40, 2 (Apr. 2010), 21–25.
[11] Ipeirotis, P. Uncovering an advertising fraud scheme. Or "the Internet is for porn". Blog: A Computer Scientist in a Business School (Mar. 2011). http://bit.ly/LqYyTs.
[12] Jesdanun, A. Ad Targeting Based on ISP Tracking Now in Doubt. Associated Press (Sept. 2008).
[13] Juels, A., Stamm, S., and Jakobsson, M. Combating Click Fraud via Premium Clicks. In Proceedings of the 16th USENIX Security Symposium (Boston, MA, Aug. 2007), pp. 1–10.
[14] Kanich, C., Kreibich, C., Levchenko, K., Enright, B., Voelker, G. M., Paxson, V., and Savage, S. Spamalytics: An Empirical Analysis of Spam Marketing Conversion. In Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS) (Alexandria, VA, Oct. 2008), pp. 3–14.
[15] Lattin, P. Cost Per Download or Cost Per Install Marketing. Performance Marketing Insider (Sept. 2011). http://bit.ly/Xdq85I.
[16] Levchenko, K., Pitsillidis, A., Chachra, N., Enright, B., Felegyhazi, M., Grier, C., Halvorson, T., Kanich, C., Kreibich, C., Liu, H., McCoy, D., Weaver, N., Paxson, V., Voelker, G. M., and Savage, S. Click Trajectories: End-to-End Analysis of the Spam Value Chain. In Proceedings of the 32nd IEEE Symposium on Security and Privacy (Oakland) (Oakland, CA, May 2011), pp. 431–446.
[17] McCoy, D., Pitsillidis, A., Jordan, G., Weaver, N., Kreibich, C., Krebs, B., Voelker, G. M., Savage, S., and Levchenko, K. PharmaLeaks: Understanding the Business of Online Pharmaceutical Affiliate Programs. In Proceedings of the 21st USENIX Security Symposium (Bellevue, WA, Aug. 2012).
[18] Metwally, A., Agrawal, D., and El Abbadi, A. DETECTIVES: DETEcting Coalition hiT Inflation attacks in adVertising nEtworks Streams. In Proceedings of the 16th International World Wide Web Conference (WWW) (Banff, Canada, May 2007), pp. 241–250.
[19] Metwally, A., Emekci, F., Agrawal, D., and El Abbadi, A. SLEUTH: Single-pubLisher attack dEtection Using correlaTion Hunting. Proceedings of the VLDB Endowment (PVLDB) 1, 2 (Aug. 2008), 1217–1228.
[20] Miller, B., Pearce, P., Grier, C., Kreibich, C., and Paxson, V. What's Clicking What? Techniques and Innovations of Today's Clickbots. In Proceedings of the Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA) (Amsterdam, Netherlands, July 2011), pp. 164–183.
[21] Moore, T., Leontiadis, N., and Christin, N. Fashion Crimes: Trending-Term Exploitation on the Web. In Proceedings of the 18th ACM Conference on Computer and Communications Security (CCS) (Chicago, IL, Oct. 2011), pp. 455–466.
[22] Ollmann, G. Want to rent an 80-120k DDoS Botnet? Blog: Damballa (Aug. 2009). http://bit.ly/W9Hh2x.
[23] PandaLabs. PandaLabs Security Report. Panda Security Press Center (Apr. 2011). http://bit.ly/150bmHw.
[24] Parker, P. IAB & PwC: Search Still Tops Online Ad Revenues, And Share Grew In 2011. Blog: Search Engine Land (Apr. 2012). http://selnd.com/12WlgoH.
[25] Roesner, F., Kohno, T., Moshchuk, A., Parno, B., Wang, H. J., and Cowan, C. User-Driven Access Control: Rethinking Permission Granting in Modern Operating Systems. In Proceedings of the 33rd IEEE Symposium on Security and Privacy (Oakland) (San Francisco, CA, May 2012), pp. 224–238.
[26] Rose, D. E., and Levinson, D. Understanding User Goals in Web Search. In Proceedings of the 13th International World Wide Web Conference (WWW) (New York, NY, May 2004), pp. 13–19.
[27] Sinclair, L. Click fraud rampant in online ads, says Bing. The Australian (May 2011). http://bit.ly/LqYval.
[28] Springborn, K., and Barford, P. Impression Fraud in Online Advertising via Pay-Per-View Networks. In Proceedings of the 22nd USENIX Security Symposium (Washington, DC, Aug. 2013).
[29] Tuzhilin, A. The Lane's Gift v. Google Report. Google Official Blog (July 2006). http://bit.ly/13ABxSZ.
[30] Wyke, J. Sophos Technical Paper: ZeroAccess Botnet – Mining and Fraud for Massive Financial Gain. SophosLabs (Sept. 2012). http://bit.ly/12ftRai.
[31] Yu, F., Xie, Y., and Ke, Q. SBotMiner: large scale search bot detection. In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM) (New York City, NY, Feb. 2010), pp. 421–430.
[32] Zhang, Q., Ristenpart, T., Savage, S., and Voelker, G. M. Got Traffic? An Evaluation of Click Traffic Providers. In Proceedings of Joint WICOW/AIRWeb Workshop on Web Quality (Hyderabad, India, Mar. 2011), pp. 19–26.
