Click Fraud Detection on the Advertiser Side

Haitao Xu¹, Daiping Liu¹, Aaron Koehl¹, Haining Wang¹, and Angelos Stavrou²
¹ College of William and Mary, Williamsburg, VA 23187, USA, {hxu,liudptl,amkoeh,hnw}@cs.wm.edu
² George Mason University, Fairfax, VA 22030, USA, astavrou@gmu.edu

Abstract. Click fraud (malicious clicks at the expense of pay-per-click advertisers) is posing a serious threat to the Internet economy. Although click fraud has attracted much attention from the security community, advertisers, the direct victims of click fraud, still lack an effective defense to detect click fraud independently. In this paper, we propose a novel approach for advertisers to detect click fraud and evaluate the return on investment (ROI) of their ad campaigns without help from ad networks or publishers. Our key idea is to proactively test if visiting clients are full-fledged modern browsers and passively scrutinize user engagement. In particular, we introduce a new functionality test and develop an extensive characterization of user engagement. Our detection can significantly raise the bar for committing click fraud and is transparent to users. Moreover, our approach requires little effort to be deployed at the advertiser side. To validate the effectiveness of our approach, we implement a prototype and deploy it on a large production website; we then run 10-day ad campaigns for the website on a major ad network. The experimental results show that our proposed defense is effective in identifying both clickbots and human clickers, while incurring negligible overhead at both the server and client sides.

Keywords: Click Fraud, Online Advertising, Feature Detection.
M. Kutylowski and J. Vaidya (Eds.): ESORICS 2014, Part II, LNCS 8713, pp. 419–438, 2014. © Springer International Publishing Switzerland 2014

1 Introduction

In an online advertising market, advertisers pay ad networks for each click on their ads, and ad networks in turn pay publishers a share of the revenue. As online advertising has evolved into a multi-billion dollar business [1], click fraud has become a serious and pervasive problem. For example, the botnet "Chameleon" infected over 120,000 host machines in the U.S. and siphoned $6 million per month from advertisers [2].

Click fraud occurs when miscreants make HTTP requests for destination URLs found in deployed ads [3]. Such HTTP requests issued with malicious intent are called fraudulent clicks. The incentive for fraudsters is to increase their own profits at the expense of other parties. Typically a fraudster is a publisher or an advertiser. Publishers may put excessive ad banners on their pages and then forge clicks on ads to receive more revenue. Unscrupulous advertisers make extensive clicks on a competitor's ads with the intention of depleting the victim's advertising budget. Click fraud is mainly conducted by leveraging clickbots, hiring human clickers, or tricking users into clicking ads [4].

In an act of click fraud, both an ad network and a publisher are beneficiaries while an advertiser is the only victim, under the pay-per-click model. Although the ad network pays out to the publisher for those undetected click fraud activities, it charges the advertiser more fees. Thus, the ad network still benefits from click fraud. Only the advertiser is victimized by paying for those fraudulent clicks. Therefore, advertisers have the strongest incentive to counteract click fraud. In this paper, we focus on click fraud detection from the perspective of advertisers.
Click fraud detection is not trivial. Click fraud schemes have been continuously evolving in recent years [3–7]. Existing detection solutions attempt to identify click fraud activities from different perspectives, but each has its own limitations. The solutions proposed in [8–10] perform traffic analysis on an ad network's traffic logs to detect publisher inflation fraud. However, an advanced clickbot can conduct a low-noise attack, which makes those abnormal-behavior-based detection mechanisms less effective. Haddadi [11] proposed exploiting bait ads to blacklist malicious publishers based on a predefined threshold. Motivated by [11], Dave et al. [4] proposed an approach for advertisers to measure click-spam ratios on their ads by creating bait ads. However, running bait ads increases advertisers' budget on advertisements.
In this paper, we propose a novel approach for an advertiser to independently detect click fraud attacks conducted by clickbots and human clickers. Our approach enables advertisers to evaluate the return on investment (ROI) of their ad campaigns by classifying each incoming click as fraudulent, casual, or valid. The rationale behind our design lies in two observed invariants of legitimate clicks. The first invariant is that a legitimate click should be initiated by a real human user on a real browser. That is, a client should be a real full-fledged browser rather than a bot, and hence it should support JavaScript, DOM, CSS, and other web standards that are widely followed by modern browsers. The second invariant is that a legitimate ad clicker interested in advertised products must have some level of user engagement in browsing the advertised website.
Based on the design principles above, we develop a click fraud detection system mainly composed of two components: (1) a proactive functionality test and (2) a passive examination of browsing behavior. The functionality test challenges a client for its authenticity (a browser or a bot) with the assumption that most clickbots have limited functionality compared to modern browsers and thus would fail this test. Specifically, a client's functionality is validated against web standards widely supported by modern browsers. Failing the test causes all clicks generated by the client to be labelled as fraudulent. The second component passively examines each user's browsing behaviors on the advertised website. Its objective is to identify human clickers and those more advanced clickbots that may pass the functionality test. If a client passes the functionality test and also shows enough browsing engagement on the advertised website, the corresponding click is labelled as valid. Otherwise, a click is labelled as casual if the corresponding client passes the functionality test but shows insufficient browsing behaviors. A casual click could be generated by a human clicker or by an unintentional user. We make no attempt to distinguish these two since neither of them is a potential customer from the standpoint of advertisers.
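
The three-way labelling rule described above can be summarized in a few lines. The following Python sketch is only an illustration: the names `Label` and `label_click` are hypothetical, and reducing "enough browsing engagement" to a boolean input is an assumption made here, since the paper derives engagement from a trained behavior classifier.

```python
from enum import Enum


class Label(Enum):
    FRAUDULENT = "fraudulent"
    CASUAL = "casual"
    VALID = "valid"


def label_click(passed_functionality_test: bool, engaged: bool) -> Label:
    """Label one ad click (hypothetical helper, not the authors' code).

    A client that fails the functionality test is treated as a clickbot,
    so its click is fraudulent regardless of any observed engagement.
    """
    if not passed_functionality_test:
        return Label.FRAUDULENT
    # The client looks like a real browser; engagement decides the rest.
    return Label.VALID if engaged else Label.CASUAL


print(label_click(True, False).value)  # casual
```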
To evaluate the effectiveness of the proposed detection system, we build a prototype and deploy it on a large production web server. Then we run ad campaigns at one major ad network for 10 days. The experimental results show that our approach can detect many more fraudulent clicks than the ad network's in-house detection system and achieve low false positive and negative rates. We also measure the performance overhead of our detection system on the client and server sides.

Note that our detection mechanism can significantly raise the bar for committing click fraud and is potentially effective in the long run after public disclosure. To evade our detection mechanism, clickbots must implement all the main web standards widely supported by modern browsers, and such a heavy-weight clickbot risks being readily noticed by its host. Likewise, human clickers must behave like real interested users by spending more time, browsing more pages, and clicking more links on the advertised sites, which contradicts their original intention of earning more money by clicking on ads as quickly as possible. At each point, the net effect is a disincentive to commit click fraud.

The remainder of the paper is organized as follows. We provide background knowledge in Section 2. Then, we detail our approach in Section 3 and validate its efficacy using real-world data in Section 4. We discuss the limitations of our work in Section 5 and survey related work in Section 6. Finally, we conclude the paper in Section 7.
2 Background

Based on our understanding of the current state of the art in click fraud, we first characterize clickbots and human clickers, the two main actors leveraged to commit click fraud. We then discuss the advertiser's role in inhibiting click fraud. Finally, we describe the web standards widely supported by modern browsers, as well as feature detection techniques.
2.1 Clickbots

A clickbot behaves like a browser but usually has relatively limited functionality compared to the latter. For instance, a clickbot may not be able to parse all elements of HTML web pages or execute JavaScript and CSS scripts. Thus, at the present time, a clickbot is instantiated as malware implanted in a victim's computer. Even assuming a sophisticated clickbot equipped with capabilities close to a real browser, its actual browsing behavior when connected to the advertised website would still be different from that of a real user. This is because clickbots are automated programs and are not sophisticated enough to see and think as human users, and as of yet, do not behave as human users.

[Fig. 1. How a clickbot works. Parties: Botmaster, C&C server, Publisher, Ad Network, Advertiser. Instructions from the C&C server: (1) target website, (2) # clicks to perform, (3) referrer to use, (4) token patterns for ad. Steps: 1. Request webpage; 2. Reply webpage; 3. Request ads; 4. Reply ads; 5. Pick an ad to click; 6. Redirect; 7. Redirect; 8. Reply landing page.]
A typical clickbot performs some common functions including initiating HTTP requests to a web server, following redirections, and retrieving contents from a web server. However, it does not have the ability to commit click fraud itself but instead acts as a relay based on instructions from a remote botmaster to complete click fraud. A botmaster can orchestrate millions of clickbots to perform automatic and large-scale click fraud attacks.

Figure 1 illustrates how a victim host conducts click fraud under the command of a botmaster. First, the botmaster distributes malware to the victim host by exploiting the host's security vulnerabilities, or by luring the victim into a drive-by download or into running a Trojan horse program. Once compromised, the victim host becomes a bot and receives instructions from a command-and-control (C&C) server controlled by the botmaster. Such instructions may specify the target website, the number of clicks to perform on the website, the referrer to be used in the fabricated HTTP requests, what kind of ads to click on, and when or how often to click [3].

After receiving instructions, the clickbot begins traversing the designated publisher website. It issues an HTTP request to the website (step 1). The website returns the requested page as well as all embedded ad tags on the page (step 2). An ad tag is a snippet of HTML or JavaScript code representing an ad, usually in an iframe. For each ad tag, the clickbot generates an HTTP request to the ad network to retrieve ad contents just like a real browser (step 3). The ad network returns ads to the clickbot (step 4). From all of the returned ads, the clickbot selects an ad matching the specified search pattern and simulates a click on the ad, which triggers another HTTP request to the ad network (step 5). The ad network logs the click traffic for the purpose of billing the advertiser and paying the publisher a share, and then returns an HTTP 302 redirect response (step 6). The clickbot follows the redirection path (possibly involving multiple parties) and finally loads the advertised website (step 7). The advertiser returns the landing page¹ to the clickbot (step 8). At this point, the clickbot completes a single act of click fraud. Every time an ad is "clicked" by a clickbot, the advertiser pays the ad network and the involved publisher receives remuneration from the ad network.

Note that a clickbot often works in the background to avoid raising suspicion; thus all HTTP requests in Figure 1 are generated without the victim's awareness.
2.2 Human Clickers

Human clickers are the people who are hired to click on the designated ads and get paid in return. Human clickers have financial incentives to click on ads as quickly as possible, which distinguishes them from real users who are truly interested in the advertised products. For instance, a real user tends to read, consider, think, and surf the website in order to learn more about a product before purchase. A paid clicker has few such interests, and hence tends to get bored quickly and spends little time on the site [12].

2.3 Advertisers

Advertisers are in a vantage point to observe and further detect all fraudulent activities committed by clickbots and human clickers. To complete click fraud, all fraudulent HTTP requests must be finally redirected to the advertised website, no matter how many intermediate redirections and parties are involved along the way. This fact indicates that both clickbots and human clickers must finally communicate with the victim advertiser. Thus, advertisers have the advantage of detecting clickbots and human clickers in the course of communication. In addition, as the revenue source of online advertising, advertisers have the strongest motivation to counteract click fraud.
2.4 Web Standards and Feature Detection Techniques

The main functionality of a browser is to retrieve remote resources (HTML, style, and media) from web servers and present those resources back to a user [13]. To correctly parse and render the retrieved HTML document, a browser should be compliant with HTML, CSS, DOM, and JavaScript standards which are represented by scriptable objects. Each object is attached with features including properties, methods, and events. For instance, the features attached to the DOM object include createAttribute, getElementsByTagName, title, domain, url, and many others. Every modern browser supports those features. However, different browser vendors (and different versions) vary in support levels for those web standards, or they implement proprietary extensions all their own. To ensure that websites are displayed properly in all mainstream browsers, web developers usually use a common technique called feature detection to help produce JavaScript code with cross-browser compatibility.

¹ Landing page is a single web page that appears in response to clicking on an ad.

[Fig. 2. Outline of click fraud detection mechanism. An ad click first undergoes the JavaScript support and mouse event test (mouse events: click, double click, mouseup, mousedown, mousemove, mouseover, ...), then the functionality test (HTML, DOM, CSS, and JavaScript standards), and finally the browser behavior examination (# total clicks, # total mouse moves, # pages viewed, visit duration, ...) with behavioral classification. Failing either of the first two tests yields a fraudulent label; passing both leads to a valid or casual label.]

Feature detection is a technique that identifies whether a feature or capability is supported by a browser's particular environment. One of the common techniques used is reflection. If the browser does not support a particular feature, JavaScript engines return null when referencing the feature; otherwise, JavaScript returns a non-null string. For instance, if the JavaScript statement "document.createElement" returns null in a specific browser, it indicates that the browser does not support the method createElement attached to the document object. Likewise, by testing a browser against a large number of fundamental features specified in web standards for modern browsers, we can estimate the browser's support level for those web standards, which helps validate the authenticity of the execution environment as a real browser.
Feature detection techniques have three primary advantages. First, feature detection can be an effective mechanism to detect clickbots. A clickbot cannot "pass" the feature detection unless it has implemented the main functionality of a real browser. Second, feature detection stresses the client's functionality thoroughly, and even a large pool of features can be used for feature detection in a fast and efficient manner. Lastly, the methods used for feature detection are designed to work across different browsers and will continue to work over time as new browsers appear, because new browsers fundamentally support reflection (even before implementing other features) and should also extend, rather than replace, existing web standards.
3 Methodology

Our approach mainly challenges a visiting client and its user engagement on the advertised site to determine whether the corresponding ad click is valid or not. To maximize detection accuracy, we also check the legitimacy of the origin (client's IP address) and the intermediate path (i.e., the publisher) of a click.

Figure 2 provides an outline of our approach. Our detection system consists of three components: (1) JavaScript support and mouse event test, (2) browser functionality test, and (3) browsing behavior examination. For each incoming user, on the landing page, we test if the client supports JavaScript and if any mouse events are triggered. No JavaScript support or no mouse event indicates that the client may not be a real browser but a clickbot. Otherwise, we further challenge the client's functionality against the web standards widely supported by mainstream browsers. A client that fails the functionality test is labelled as a clickbot. Otherwise, we further examine the client's browsing behavior on the advertiser's website and train a behavior-based classifier to distinguish a really interested user from a casual one.
3.1 JavaScript Support and Mouse Event Test

One simple way to detect clickbots is to test whether a client supports JavaScript or not. This is due to the fact that at least 98% of web browsers have JavaScript enabled [14] and online advertising services usually count on JavaScript support.

Monitoring mouse events is another effective way to detect clickbots. In general, a human user with a non-mobile platform (laptop/desktop) must generate at least one mouse event when browsing a website. A lack of mouse events flags the visiting client as a clickbot. However, this may not be true for users from mobile platforms (smartphones/pads). Thus, we only apply the mouse event test to users from non-mobile platforms.
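
The first-stage check, including the exemption for mobile platforms, might be expressed as follows. This is a minimal sketch under stated assumptions: the function name `passes_initial_test` and its three inputs are hypothetical, and how the server learns whether JavaScript ran, how many mouse events occurred, or whether the platform is mobile is outside the sketch.

```python
def passes_initial_test(js_supported: bool, mouse_events: int, is_mobile: bool) -> bool:
    """First-stage screening of a visiting client (hypothetical helper).

    Clients without JavaScript support are flagged immediately. The mouse
    event requirement is applied only to non-mobile platforms, because touch
    devices may legitimately produce no mouse events.
    """
    if not js_supported:
        return False
    if is_mobile:
        return True
    return mouse_events > 0


# A desktop client that ran JavaScript but never moved the mouse is flagged.
print(passes_initial_test(js_supported=True, mouse_events=0, is_mobile=False))  # False
```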
Table 1. Tested browsers, versions, and release dates

Chrome (10): 1.0.154 (4/24/2009), 2.0.173 (6/23/2009), 4.0.223 (10/24/2009), 5.0.307.1 (1/30/2010), 8.0.552.215 (12/2/2010), 12.0.742.100 (6/14/2011), 16.0.912.63 (12/7/2011), 20.0.1132.47 (6/28/2012), 24.0.1312.57 (1/30/2013), 27.0.1453.94 (5/24/2013)
Firefox (10): 2.0 (10/24/2006), 3.0 (6/17/2008), 3.5 (6/30/2009), 3.6 (1/21/2010), 4.0 (3/22/2011), 7.0 (9/27/2011), 11.0 (3/13/2012), 15.0 (8/28/2012), 19.0.2 (3/7/2013), 20.0.1 (4/11/2013)
IE (5): 6.0 (8/27/2001), 7.0 (10/18/2006), 8.0 (3/19/2009), 9.0 (3/14/2011), 10.0 (10/26/2012)
Safari (10): 3.1 (3/18/2008), 3.2 (11/14/2008), 3.2.2 (2/15/2009), 4.0 (6/18/2009), 4.0.5 (3/11/2010), 5.0.1 (7/28/2010), 5.0.3 (11/18/2010), 5.1 (7/20/2011), 5.1.2 (11/30/2011), 5.1.7 (5/9/2012)
Opera (10): 8.50 (9/20/2005), 9.10 (12/18/2006), 9.20 (4/11/2007), 9.50 (6/12/2008), 10.00 (9/1/2009), 10.50 (3/2/2010), 11.00 (12/16/2010), 11.50 (6/28/2011), 12.00 (6/14/2012), 12.15 (4/4/2013)

3.2 Functionality Test

A client passing the JavaScript and mouse event test is required to further undergo a feature-detection based functionality test.
Table 2. Authentic feature set widely supported by modern browsers

Browser Window (51): closed, defaultStatus, document, frames, history, alert, blur, clearInterval, clearTimeout, close, confirm, focus, moveBy, moveTo, open, print, prompt, resizeBy, resizeTo, scroll, scrollBy, scrollTo, setInterval, setTimeout, appCodeName, appName, appVersion, cookieEnabled, platform, userAgent, javaEnabled, availHeight, availWidth, colorDepth, height, width, length, back, forward, go, hash, host, hostname, href, pathname, port, protocol, search, assign, reload, replace

DOM (26): doctype, implementation, documentElement, createElement, createDocumentFragment, createTextNode, createComment, createAttribute, getElementsByTagName, title, referrer, domain, URL, body, images, applets, links, forms, anchors, cookie, open, close, write, writeln, getElementById, getElementsByName

CSS (76): backgroundAttachment, backgroundColor, backgroundImage, backgroundRepeat, border, borderStyle, borderTop, borderRight, borderBottom, borderLeft, borderTopWidth, borderRightWidth, borderBottomWidth, borderLeftWidth, borderWidth, clear, color, display, font, fontFamily, fontSize, fontStyle, fontVariant, fontWeight, height, letterSpacing, lineHeight, listStyle, listStyleImage, listStylePosition, listStyleType, margin, marginTop, marginRight, marginBottom, marginLeft, padding, paddingTop, paddingRight, paddingBottom, paddingLeft, textAlign, textDecoration, textIndent, textTransform, verticalAlign, whiteSpace, width, wordSpacing, backgroundPosition, borderCollapse, borderTopColor, borderRightColor, borderBottomColor, borderLeftColor, borderTopStyle, borderRightStyle, borderBottomStyle, borderLeftStyle, bottom, clear, clip, cursor, direction, left, minHeight, overflow, pageBreakAfter, pageBreakBefore, position, right, tableLayout, top, unicodeBidi, visibility, zIndex

To avoid false positives and ensure that each modern browser can pass the functionality test, we perform an extensive feature support measurement on the top 5 mainstream browsers [15]: Chrome, Firefox, IE, Safari, and Opera. To discern the consistently supported features, we uniformly select 10 versions for each browser vendor with the exception of 5 versions for IE. Table 1 lists the browsers we tested. As a result, we obtain a set of 153 features associated with web standards, including browser window, DOM, and CSS (see Table 2). All those features are supported by both desktop browsers and their mobile versions. These features are commonly and consistently supported by the 45 versions of browsers in the past ten years. We call this set the authentic-feature set.

We also create a bogus-feature set, which has the same size as the authentic-feature set but is obtained by appending "123" to each feature in the authentic-feature set. Thus, every feature in the bogus-feature set should not be supported by any real browser. Note that we just use the string "123" as an example. When implementing our detection, the advertiser should periodically change the string to make the bogus-feature set hard to evade.
How to Perform the Functionality Test. Figure 3 illustrates how the functionality test is performed. For the first HTTP request issued by a client, the advertiser's web server challenges the client by responding as usual, but along with a mixed set of authentic and bogus features. While the size of the mixed set is fixed (e.g., 100), the proportion of authentic features in the set is randomly decided. Then, those individual authentic and forged features in the set are randomly selected from the authentic and bogus feature sets, respectively. The client is expected to test each feature in its environment and then report to the web server how many authentic features are in the mixed set as the response to the challenge.
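
The challenge construction described above can be sketched on the server side as follows. This is an illustrative Python sketch, not the authors' implementation: `build_challenge` and `count_supported` are hypothetical helpers, `DOM_FEATURES` is only an excerpt of the DOM row of Table 2, and the real client-side check would run as JavaScript in the browser rather than as the set lookup used here to stand in for it.

```python
import random

# Excerpt of the DOM features listed in Table 2 (not the full 153-feature set).
DOM_FEATURES = [
    "doctype", "implementation", "documentElement", "createElement",
    "createTextNode", "createAttribute", "getElementsByTagName", "title",
    "referrer", "domain", "body", "images", "links", "forms", "cookie",
    "getElementById",
]


def build_challenge(authentic, size, suffix, rng):
    """Return (mixed_features, expected_count) for one client challenge.

    The set size is fixed, the share of authentic features is random, and
    bogus names are authentic names with a suffix appended.
    """
    if size > len(authentic):
        raise ValueError("size must not exceed the number of authentic features")
    n_authentic = rng.randint(0, size)  # random proportion of authentic features
    bogus_pool = [name + suffix for name in authentic]
    mixed = rng.sample(authentic, n_authentic) + rng.sample(bogus_pool, size - n_authentic)
    rng.shuffle(mixed)  # do not reveal which names are authentic by position
    return mixed, n_authentic


def count_supported(features, environment):
    """Stand-in for the client-side check: count names the environment supports."""
    return sum(1 for name in features if name in environment)


rng = random.Random(2014)  # fixed seed only to make the example repeatable
challenge, expected = build_challenge(DOM_FEATURES, 10, "123", rng)
reported = count_supported(challenge, set(DOM_FEATURES))
print(reported == expected)  # True for a client that supports every authentic feature
```

Because the suffix is a parameter, rotating it periodically, as the text recommends, requires no change to the rest of the sketch.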
[Fig. 3. How the functionality test is performed by advertiser's web server: 1. The client sends an HTTP request; 2. The advertiser's web server returns the HTTP response and a mixed set of authentic/bogus features; 3. The client reports the # of authentic features to the server as response to the challenge.]

A real browser should be able to report the correct number of authentic features to the web server after executing the challenge code, and thus passes the functionality test. However, a clickbot would fail the test because it is unable to test the features contained in the set and return the correct number. Considering some untested browsers may not support some authentic features, we set up a narrow range [x − N, x] to handle this, where x is the expected number and N is a small non-negative integer. A client is believed to pass the test as long as its reported number falls within [x − N, x]. Here we set N to 4 based on our measurement results.
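
The acceptance rule amounts to a one-line range check. The Python sketch below assumes a hypothetical function name and uses the tolerance N = 4 stated in the text as its default.

```python
def passes_functionality_test(reported: int, expected: int, tolerance: int = 4) -> bool:
    """Accept a reported count that lies in [expected - tolerance, expected]."""
    return expected - tolerance <= reported <= expected


# With 29 authentic features in the mixed set, counts from 25 to 29 are accepted.
print([n for n in range(23, 32) if passes_functionality_test(n, 29)])  # [25, 26, 27, 28, 29]
```

The range is one-sided: a client may miss a few authentic features, but it should never report more than the expected number, since no bogus feature should exist in a real browser.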
Evasion Analysis. Assume that a client receives a mixed set of 150 features from a web server and the set consists of 29 randomly selected authentic features and 121 randomly selected bogus features. Thus, the reported number should fall into the range [25, 29]. Consider a crafty clickbot that knows about our detection mechanism in advance. The clickbot does not need to test the features, but just guesses a number from the possible range [0, 150], and returns it to the server. In this case, the probability for the guessed number to successfully fall into [25, 29] is only about 3% (5 of the 151 possible values). Thus, the clickbot has little chance (3%) to bypass the functionality test.
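
The 3% figure can be reproduced with a short calculation. The sketch below assumes, as the text does, that the clickbot guesses uniformly among the integers from 0 to the set size; the function name is hypothetical.

```python
from fractions import Fraction


def guess_success_probability(set_size: int, expected: int, tolerance: int) -> Fraction:
    """Chance that a uniform guess in [0, set_size] lands in [expected - tolerance, expected]."""
    low = max(0, expected - tolerance)
    accepted = expected - low + 1          # number of accepted answers
    return Fraction(accepted, set_size + 1)  # set_size + 1 possible guesses


p = guess_success_probability(set_size=150, expected=29, tolerance=4)
print(p, f"{float(p):.1%}")  # 5/151 3.3%
```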
3.3 Browsing Behavior Examination

Passing the functionality test cannot guarantee that a click is valid. An advanced clickbot may function like a real browser and thus can circumvent the functionality test. A human clicker with a real browser can also pass the test. However, clickbots and human clickers usually show quite different browsing behaviors on the advertised website from those of real users. Click fraud activities conducted by clickbots usually end up with loading the advertiser's landing page and do not show human behaviors on the site. For human clickers, their only purpose is to make more money by clicking on ads as quickly as possible. They tend to browse an advertised site quickly and then navigate away for the next click task. In contrast, real interested users tend to learn more about a product and spend more time on the advertised site. They usually scroll up and down a page, click on the links that interest them, browse multiple pages, and sometimes make a purchase.

Therefore, we leverage users' browsing behaviors on the advertised site to detect human clickers and advanced clickbots. Specifically, we extract extensive features from passively collected browsing traffic on the advertised website, and train a classifier for detection.

Table 3. Summary of our ad campaigns

Set    Campaign  Clicks  Impressions  CTR    Invalid Clicks  Invalid Rate  Avg. CPC  Daily Budget  Duration (days)
1      bait1     1,011   417,644      0.24%  425             29.60%        $0.08     $15.00        10
2      bait2     4,127   646,152      0.64%  852             17.11%        $0.03     $15.00        10
3      bait3     5,324   933,790      0.57%  1,455           21.46%        $0.04     $15.00        10
4      normal1   288     68,425       0.42%  18              5.88%         $0.40     $20.00        10
5      normal2   224     20,784       1.08%  10              4.27%         $0.48     $20.00        10
Total  NA        10,974  2,086,795    0.53%  2,760           25.15%        $0.06     $85.00        10
4 Experimental Results

In order to evaluate our approach, we run ad campaigns to collect real-world click traffic, and then analyze the collected data to discern its primary characteristics, resulting in a technique to classify click traffic as either fraudulent, casual, or valid.

4.1 Running Ad Campaigns

To obtain real-world click traffic, we signed up with a major ad network and ran ad campaigns for a high-traffic woodworking forum website. Motivated by the bait ad technique proposed in [11], we created three bait ads for the site and made the same assumption as the previous works [4,11,16], that very few people would intentionally click on the bait ads and those ads are generally clicked by clickbots and fraudulent human clickers. Bait ads are textual ads with nonsense content, as illustrated in Figure 4. Note that our bait ads were generated in English. In addition, we created two normal ads, for which the ad texts describe the advertised site exactly.

Our goal of running ad campaigns is to acquire both malicious and authentic click traffic for validating our click fraud detection system. To this end, we set the bait ads to be displayed on partner websites of any language across the world but display normal ads only on search result pages in English to avoid publisher fraud cases from biasing the clicks on the latter normal ads. We expect that most, if not all, clicks on bait ads and normal ads are fraudulent and authentic, respectively.
We ran our ad campaigns for 10 days. Table 3 provides a summary of our ad campaigns. Our ads had 2 million impressions², received nearly 11 thousand clicks and had a click-through rate (CTR) of 0.53% on average. Among these, 2.7 thousand clicks were considered by the ad network as illegitimate and were not charged. The invalid click rate was 25.15%. The average cost per click (CPC) was $0.06. Note that the two normal ads only received 512 clicks accounting for 4.67% of the total. The reason is that although we provided quite high bids for normal ads, our normal ads still cannot compete with those of other advertisers for top positions and thus received fewer clicks.

² An ad being displayed once is counted as one impression.

[Fig. 4. A bait ad with the ad text of randomly selected English words. Headline: "Anchor Groundhog Estate"; display URL: www.sawmillcreek.org; body: "Variance Flock Accurate Chandelier Cradle Naphtha Librettist Headwind".]

[Fig. 5. Distribution of click traffic vs. that of normal traffic by country (% of ad clicks from the country / % of normal daily visitors from the country): China 55.4/0.11, Iraq 5.5/0.01, Egypt 3.8/0.04, India 2.3/0.73, Vietnam 2.2/0.07, United States 2.1/75.8, Pakistan 1.8/0.09, Algeria 1.6/0.01, Saudi Arabia 1.6/0.06, Philippines 1.4/0.33.]
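
The aggregate figures quoted above follow directly from the per-campaign counts in Table 3. The short Python check below recomputes them; the variable names are illustrative, and the total invalid rate is taken as invalid clicks over reported clicks, which is the ratio that reproduces the 25.15% in the table's total row.

```python
# Per-campaign counts copied from Table 3.
clicks = {"bait1": 1011, "bait2": 4127, "bait3": 5324, "normal1": 288, "normal2": 224}
impressions = {"bait1": 417644, "bait2": 646152, "bait3": 933790,
               "normal1": 68425, "normal2": 20784}
invalid_clicks = 425 + 852 + 1455 + 18 + 10

total_clicks = sum(clicks.values())
total_impressions = sum(impressions.values())
normal_clicks = clicks["normal1"] + clicks["normal2"]

print(total_clicks, total_impressions, invalid_clicks)      # 10974 2086795 2760
print(f"CTR: {total_clicks / total_impressions:.2%}")        # CTR: 0.53%
print(f"Invalid rate: {invalid_clicks / total_clicks:.2%}")  # Invalid rate: 25.15%
print(f"Normal-ad share: {normal_clicks / total_clicks:.2%}")  # Normal-ad share: 4.67%
```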
4.2 Characterizing the Click Traffic

We characterize the received click traffic by analyzing users' geographic distribution, browser type, IP address reputation, and referrer websites' reputations. Our goal, through statistical analysis, is to have a better understanding of both the users who clicked on our ads and the referrer websites where our ads were clicked. Although the ad network reported that our ads attracted close to 11 thousand clicks, we logged only 9.9 thousand clicks on the advertised site, which serve as data objects for both closer examination and validation of our approach.
Geographic Distribution. We obtain users' geographic information using an IP geolocation lookup service [17]. Our 9.9 thousand clicks originate from 156 countries. Figure 5 shows the distribution of ad clicks by the top 10 countries which generate the most clicks. The distribution of normal daily visitors to the advertised site by country is also given in Figure 5. Note that the data form 'X/Y' means that X% of ad clicks and Y% of normal daily visitors are from that specific country. The top 10 countries contribute 77.7% of overall clicks. China alone contributes over 55% of the clicks, while the United States contributes 2.1%. This is quite unusual because the normal daily visitors from China only account for 0.11% while the normal visitors from the United States account for close to 76%. Like China, Egypt, Iraq, and other generally non-English countries also contribute much higher shares of ad click traffic than their normal daily traffic to the site. The publisher websites from these countries are suspected to be using bots to click on our ads.

Even worse, one strategy of our ad network partner may aggravate the fraudulent activities. The strategy says that when an ad has a high click-through ratio on a publisher website, the ad network will deliver the ad to that publisher website more frequently. To guarantee that our ads attract as many clicks as possible within a daily budget, the ad network may deliver our ads to those non-English websites more often.

[Fig. 6. Distribution of click traffic by browser (y-axis: % of clicks; bar values 48.7, 23.4, 19.8, 5.5, 1.6, 0.8, and 0.2).]

[Fig. 7. Distribution of click traffic by publisher (y-axis: % of clicks; bar values 23.1, 12.6, 9.7, 5.2, 4.2, 3.1, 2.3, 2.1, 2, 1.7, and 34).]
Browser Type. Next we examine the distribution of the browsers to see which browser vendors are mostly used by users to view and click on our ads. We extracted the browser information from the User-Agent strings of the HTTP requests to our advertised website. Figure 6 shows the distribution of the browsers used by our ad clickers. IE, Chrome, Firefox, Safari, and Opera are the top 5 desktop and laptop browsers, which is consistent with the web browser popularity statistics from StatCounter [15]. Notably, mobile browsers alone contribute nearly 50% of overall traffic, much larger than the estimated usage share of mobile browsers (about 18% [18]). Close scrutiny reveals that 40% of the traffic with mobile browsers originates from China. China generated over 50 percent of overall traffic, which skews the browser distribution.
Blacklists. A fraction of our data could be generated by clickbots and compromised hosts. Those malicious clients could also be utilized by fraudsters to conduct other undesirable activities, and are thus blacklisted. By looking up users' IP addresses in public IP blacklists [19], we found that 29% of the total hosts have been blacklisted at some point.
Referrers. Another interesting question would be which websites host our ads and if their contents are really related to the keywords of our ads. According to the contextual targeting policy of the ad network, an ad should be delivered to the ad network's partner websites whose contents match the selected keywords for the ad.

We used the Referer field in the HTTP request header to locate the publishers that displayed our ads and then directed users to our advertised website. However, we can identify publishers for only 37.2% of the traffic (3,685 clicks) because the remaining traffic either has a blank Referer field or has the domain of the ad network as the Referer field. For example, the Referer field for more than 40% of traffic has the form of doubleclick.net.
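
Publisher identification from the Referer header, with blank and ad-network referrers treated as unidentifiable, might look like the following. This is a minimal sketch: `publisher_from_referer` and the one-entry `AD_NETWORK_DOMAINS` set are assumptions for illustration (only doubleclick.net is named in the text), and the example URLs are made up.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed list of ad-network domains; the text names only doubleclick.net.
AD_NETWORK_DOMAINS = {"doubleclick.net"}


def publisher_from_referer(referer: Optional[str]) -> Optional[str]:
    """Return the publisher host from a Referer value, or None if unidentifiable."""
    if not referer:
        return None  # blank Referer field
    host = urlsplit(referer).hostname
    if host is None:
        return None  # malformed value without a network location
    host = host.lower()
    for domain in AD_NETWORK_DOMAINS:
        if host == domain or host.endswith("." + domain):
            return None  # the referrer is the ad network itself, not a publisher
    return host


print(publisher_from_referer("http://www.4399.com/"))  # www.4399.com
print(publisher_from_referer("http://googleads.g.doubleclick.net/pagead/ads?x=1"))  # None
```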
We then examined, among those detected publishers, which websites contribute the most clicks. Note that publishers could be websites or mobile apps. We identified 499 unique websites and 5 apps in total. Those apps are all iPhone apps and only generate 28 clicks altogether. The remaining 3,657 clicks are from the 499 unique websites. Figure 7 shows the distribution of the click traffic by those 504 publishers. The top 3 websites with the most clicks on our ads are all small game websites, which contribute over 45% of publisher-detectable clicks. Actually, the top 7 websites are all small game websites. Small game websites often attract many visitors, and thus the ads on those websites are more likely to be clicked on. However, our keywords are all woodworking-related and evidently, the contents of those game websites do not match our keywords. According to the above-mentioned contextual targeting policy, the ad network should not have delivered our ads to such websites. One possible reason is that from the perspective of the ad network, attracting clicks takes precedence over matching the ads with host websites.
4.3 Validating Detection Approach

As described before, our approach is composed of three main components: a JavaScript support and mouse event test, a functionality test, and a browsing behavior examination. Here we individually validate their effectiveness.

JavaScript Support and Mouse Event Test. Among the 9.9 thousand ad clicks logged by the advertised site, 75.2% of users do not support JavaScript. We labelled those users as clickbots. Note that this percentage may be slightly overestimated considering that some users (at most 2% [14]) may have JavaScript disabled. In addition, those visits without support for JavaScript do not correlate with visits from mobile browsers. We have checked that nearly all mobile browsers provide support for JavaScript despite limited computing power.
We then focused on the top 10 publisher websites with the most clicks to identify potentially malicious publishers. Figure 8 depicts the percentage of clicks without script support from those top 10 publishers. Among them, the two non-entertainment websites google.com and ask.com have low ratios, 9.4% and 15.2%, respectively. In contrast, the other 8 entertainment websites have quite high click ratios without script support. There are 86 visits from tvmao.com and none of them support JavaScript. We believe that all 86 clicks are fraudulent and generated by bots. Similarly, 99.1% of clicks from weaponsgames.com, 96.1% of clicks from 3dgames.org, and 95.3% from gamesgirl.net also lack JavaScript support. Such high ratios indicate that the invalid click rate in the real-world ad campaigns is much larger than the average invalid rate of 25.15% alleged by the ad network for our ad campaigns, as shown in Table 3.
We observed 506 ad clicks (with JavaScript support) that result in zero mouse events when arriving at our target site. Of those, 96 are initiated from mobile platforms including iPad, iPhone, Android, and Windows Phone. The remaining 410 clicks are generated from desktop or laptop platforms. Those 410 ad clicks also have few other kinds of user engagement: no mouse clicks, no page scrolls, and short dwelling time. We labelled them as clickbots.

[Fig. 8. Percentage of clicks without JavaScript support for the top 10 publisher websites contributing the most clicks. Plotted values, by publisher rank:]

Rank  # of clicks from the website  # of clicks w/o JavaScript support  % of clicks w/o JavaScript support
1     853                           594                                 69.60%
2     462                           458                                 99.10%
3     356                           122                                 34.30%
4     192                           18                                  9.40%
5     155                           149                                 96.10%
6     113                           41                                  36.30%
7     86                            86                                  100%
8     79                            12                                  15.20%
9     75                            63                                  84%
10    64                            61                                  95.30%
We further investigated the click traffic from 4399.com because this website generated the most clicks on our ads among all identified publishers. The following pieces of data indicate the existence of publisher fraud. First, all 853 clicks from 4399.com were generated within one day. Notably, up to 95 clicks were generated within one hour. Second, several IPs were found to click on our ads multiple times within one minute using the same User-Agent, and one User-Agent was linked to almost 15 clicks on average. Third, close to 70% of clients did not support JavaScript. Hence we suspect that the website owner used automated scripts to generate fraudulent clicks on our ads. However, the scripts are likely incapable of executing the JavaScript code attached to our ads. In addition, they probably spoofed the IP address and User-Agent fields in the HTTP requests to avoid detection.
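The three indicators above (clicks per hour, repeated clicks from one IP and User-Agent within a minute, and clicks per User-Agent) could be computed from a click log roughly as follows. This is only a minimal sketch: the record layout (dicts with `time`, `ip`, and `user_agent` keys) and all function names are assumptions made for illustration, not the format used in the study.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def clicks_per_hour(clicks):
    """Count clicks in each clock hour (keyed by the hour's start time)."""
    return Counter(
        c["time"].replace(minute=0, second=0, microsecond=0) for c in clicks
    )


def rapid_repeaters(clicks, window=timedelta(minutes=1), min_clicks=2):
    """Return (ip, user_agent) pairs with >= min_clicks clicks inside one window."""
    by_client = defaultdict(list)
    for c in clicks:
        by_client[(c["ip"], c["user_agent"])].append(c["time"])

    flagged = set()
    for client, times in by_client.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Shrink the window from the left until it spans at most `window`.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= min_clicks:
                flagged.add(client)
                break
    return flagged


def mean_clicks_per_user_agent(clicks):
    """Average number of clicks attributed to each distinct User-Agent string."""
    counts = Counter(c["user_agent"] for c in clicks)
    return sum(counts.values()) / len(counts) if counts else 0.0
```

A publisher whose hourly counts spike, whose clients repeat within a minute, and whose User-Agent strings each account for many clicks would match the pattern described for 4399.com.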
Functionality Test. The clickbots that cannot work as full-fledged modern browsers are expected to fail our functionality test. Among the logged 9.9 thousand clicks, 7,448 clicks without JavaScript support did not trigger the functionality test, and 35 of the remaining clicks with JavaScript support were observed to fail the functionality test and were subsequently labelled as clickbots. So far, 75.6% of clicks (7,483 clicks) had been identified by our detection mechanism as originating from clickbots. Among them, 99.5% (7,448 clicks) were simple clickbots without JavaScript support, and the remaining 0.5% (35 clicks) were relatively advanced clickbots that supported JavaScript yet failed the functionality test.

Table 4. Features extracted for each ad click

Feature Category        Feature Description
Mouse clicks            # of total clicks made on the advertised site
                        # of clicks made only on the pages excluding the landing page
                        # of clicks exclusively made on hyperlinks
Mouse scrolls           # of scroll events in total
                        # of scroll events made on the pages excluding the landing page
Mouse moves             # of mousemove events in total
                        # of mousemove events made only on the pages excluding the landing page
Page views              # of pages viewed by a user
Visit duration          How long a user stays on the site
Execution efficiency    Client's execution time of JavaScript code for challenge
Legitimacy of origin    If the source IP is in any blacklist
Publisher's reputation  If the click originates from a disreputable website
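The tallies from the first two tests can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and 9,900 stands in for the "9.9 thousand" logged clicks because the exact total is not given.

```python
def first_two_stages(js_supported, passed_functionality_test):
    """Label one click after the JavaScript-support and functionality tests."""
    if not js_supported:
        return "simple_clickbot"      # never ran the challenge script
    if not passed_functionality_test:
        return "advanced_clickbot"    # ran JavaScript but failed the test
    return "unlabelled"               # goes on to the browsing behavior examination


def share(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts reported in the text; TOTAL approximates "9.9 thousand" (assumption).
TOTAL, NO_JS, FAILED_TEST = 9900, 7448, 35
clickbots = NO_JS + FAILED_TEST

print(clickbots)                      # clicks labelled as clickbots so far
print(share(clickbots, TOTAL))        # their share of all logged clicks
print(share(NO_JS, clickbots))        # simple clickbots among them
print(share(FAILED_TEST, clickbots))  # advanced clickbots among them
```

Under that approximation, the computed shares agree with the 75.6%, 99.5%, and 0.5% figures reported above.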
Browsing Behavior Examination. After completing the two steps above and discarding incomplete click data, 1,479 ad clicks (14.9%) are left to be labelled. Among them, 1,127 ad clicks are on bait ads while the other 352 clicks are on normal ads. Here we further classify the click traffic into three categories (fraudulent, casual, and valid) based on user engagement, client IP, and publisher reputation information.
Features. We believe that three kinds of features are effective in differentiating advanced clickbots and human clickers from real users: (1) how users behave at the advertised site, i.e., users' browsing behavior information; (2) who clicks on our ads, since a host with a bad IP is more likely to issue fraudulent clicks; and (3) where a user clicks on ads, since a click originating from a disreputable website tends to be fraudulent. Table 4 enumerates all the features we extracted from each ad click's traffic to characterize users' browsing behaviors on the advertised site.
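As an illustration of how the Table 4 features might be derived from a per-visit event log, consider the sketch below. The `Event` record, its field names, and the output dictionary keys are hypothetical choices made for this example; the paper does not specify its log format.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str              # "click", "scroll", "mousemove", or "pageview"
    page: str              # path of the page on which the event fired
    time: float            # seconds since the visit started
    on_link: bool = False  # only meaningful for clicks


def extract_features(events, landing_page, exec_time_ms,
                     ip_blacklisted, publisher_disreputable):
    """Build one feature vector (as a dict) for a single ad click."""

    def count(kind, off_landing=False):
        # Count events of one kind, optionally excluding the landing page.
        return sum(
            1 for e in events
            if e.kind == kind and (not off_landing or e.page != landing_page)
        )

    times = [e.time for e in events]
    return {
        "clicks_total": count("click"),
        "clicks_off_landing": count("click", off_landing=True),
        "clicks_on_links": sum(1 for e in events if e.kind == "click" and e.on_link),
        "scrolls_total": count("scroll"),
        "scrolls_off_landing": count("scroll", off_landing=True),
        "moves_total": count("mousemove"),
        "moves_off_landing": count("mousemove", off_landing=True),
        "page_views": count("pageview"),
        "visit_duration_s": max(times) - min(times) if times else 0.0,
        "exec_time_ms": exec_time_ms,
        "ip_blacklisted": ip_blacklisted,
        "publisher_disreputable": publisher_disreputable,
    }
```

Each resulting dictionary corresponds to one row of training data for the classifier.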
Ground truth. Previous works [4,11,16] all assume that very few people would intentionally click on bait ads and that only clickbots and human clickers would click on such ads. That is, a click on a bait ad is thought to be fraudulent. However, this assumption is too absolute. Consider the following situation. A real user clicks on a bait ad unintentionally or just out of curiosity, without malicious intention. Then, the user happens to like the advertised products and begins browsing the advertised site. In this case, the ad click generated by this user should not be labelled as fraudulent. Thus, to minimize false positives, we only partly accept the above common assumption: we scrutinize those bait ad clicks that have shown rich human behaviors on the advertised site, and correct the a priori labels based on the following heuristics. Specifically, for a bait ad click, if the host IP address is not in any blacklist and the referrer website has a good reputation, this ad click is relabelled as valid when one of the following conditions holds: (1) 30 seconds of dwelling time, 15 mouse events, and 1 click; (2) 30 seconds of dwelling time, 10 mouse events, 1 scroll event, and 1 click; or (3) 30 seconds of dwelling time, 10 mouse events, and 2 page views. We believe the above conditions are strict enough to avoid mislabelling the ad clicks generated by bots and human clickers as valid clicks.
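The relabelling rule for bait ad clicks can be written as a short predicate. In this sketch the function and parameter names are hypothetical, and each threshold is read as a minimum ("at least 30 seconds", "at least 15 mouse events", and so on), which is an interpretation rather than something the text states explicitly.

```python
def relabel_bait_click(dwell_s, mouse_events, scrolls, clicks, page_views,
                       ip_blacklisted, referrer_reputable):
    """Return the corrected label for a click on a bait ad."""
    # A blacklisted IP or a disreputable referrer keeps the a priori label.
    if ip_blacklisted or not referrer_reputable:
        return "fraudulent"

    # Assumption: every number in the three conditions is a lower bound.
    engaged = dwell_s >= 30 and (
        (mouse_events >= 15 and clicks >= 1)                      # condition (1)
        or (mouse_events >= 10 and scrolls >= 1 and clicks >= 1)  # condition (2)
        or (mouse_events >= 10 and page_views >= 2)               # condition (3)
    )
    return "valid" if engaged else "fraudulent"
```

For example, a 45-second visit with 12 mouse events, one scroll, and one click from a clean IP and a reputable referrer would be relabelled as valid under condition (2).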
Note that our normal ads are only displayed on search engine result pages, with the expectation that most, if not all, clicks on normal ads are valid. The ad campaign report provided by the ad network in Table 3 confirms this, showing that the invalid click rate for normal ads is only 5.08% on average. Based on our design and the ad campaign report, we basically assume that the clicks on normal ads are valid. However, after further manually checking the normal ad clicks, we found that some of them do not demonstrate sufficient human behaviors, and these normal ad clicks are relabelled as casual when one of the following two conditions holds: (1) less than 5 seconds of dwelling time; or (2) less than 10 seconds of dwelling time and fewer than 5 mouse events. Casual click traffic could be issued by human users who unintentionally click on ads and then immediately navigate away from the advertised site. From the advertisers' perspective, such click traffic does not provide any value when evaluating the ROI of their ad campaigns on a specific ad network, and therefore should be classified as casual. In fact, if no financial transaction is involved, only the user's intention determines whether the corresponding ad click is fraudulent or not. That is, only users themselves know the exact ground truth for fraudulent, valid, and casual clicks. For those clicks that do not trigger any financial transaction, we use the above assumptions and straightforward heuristics to form the ground truth for fraudulent, valid, and casual clicks.
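The two conditions for relabelling a normal ad click as casual translate directly into code. The function and parameter names below are hypothetical; the thresholds are the ones stated above.

```python
def relabel_normal_click(dwell_s, mouse_events):
    """Return 'casual' or 'valid' for a click on a normal (search-result) ad."""
    if dwell_s < 5:                        # condition (1)
        return "casual"
    if dwell_s < 10 and mouse_events < 5:  # condition (2)
        return "casual"
    return "valid"
```

A visit of 8 seconds with 3 mouse events would therefore be treated as casual, while the same 8 seconds with 6 mouse events would remain valid.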
Evaluation metrics. We evaluated our detection against two metrics: false positive rate and false negative rate. A false positive is when a valid click is wrongly labelled as fraudulent, and a false negative is when a fraudulent click is incorrectly labelled as valid.
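These two definitions can be turned into rates as sketched below. One assumption is made explicit here because the text does not state it: the false positive rate is taken over all truly valid clicks and the false negative rate over all truly fraudulent clicks. The function name is hypothetical.

```python
def error_rates(pairs):
    """Compute (false_positive_rate, false_negative_rate).

    `pairs` is an iterable of (true_label, predicted_label) tuples with labels
    drawn from "fraudulent", "casual", and "valid".
    """
    pairs = list(pairs)
    predicted_for_valid = [p for t, p in pairs if t == "valid"]
    predicted_for_fraud = [p for t, p in pairs if t == "fraudulent"]

    # False positive: a valid click labelled as fraudulent.
    fpr = (sum(p == "fraudulent" for p in predicted_for_valid)
           / len(predicted_for_valid)) if predicted_for_valid else 0.0
    # False negative: a fraudulent click labelled as valid.
    fnr = (sum(p == "valid" for p in predicted_for_fraud)
           / len(predicted_for_fraud)) if predicted_for_fraud else 0.0
    return fpr, fnr
```

Clicks whose true or predicted label is casual affect neither numerator under these definitions.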
Classification results. Using Weka [20], we chose a C4.5 pruned decision tree [21] with default parameter values (i.e., 0.25 for confidence factor and 2 for minimum number of instances per leaf) as the classification algorithm, and ran a 10-fold cross-validation. The false positive rate and false negative rate were 6.1% and 5.6%, respectively. Note that these are the classification results on those 1,479 unlabelled clicks. As a whole, our approach showed a high detection accuracy on the total 9.9 thousand clicks, with a false positive rate of 0.79% and a false negative rate of 5.6%, and the overall detection accuracy is 99.1%.
Overhead. We assessed the overhead induced by our detection on the client and server sides, in terms of time delay and CPU, memory, and storage usage. The only extra work required of the client is to execute a JavaScript challenge script and report the functionality test results to the server in an AJAX POST request. We measured the overhead on the client side using two metrics: source lines of code (SLOC) and the execution time of the JavaScript code. The JavaScript code is only about 150 SLOC and we observed negligible impact on the client. We also estimated the client's execution time of JavaScript from the server side to avoid the possibility that the client could report a bogus execution time. Note that the execution time measured by the server contains a round trip time, which makes the estimated execution time larger than the actual execution time. Figure 9 depicts the 9.9 thousand clients' execution time of the JavaScript challenge code. About 80% of clients finished execution within one second. Assuming that the round trip time (RTT) is 200 milliseconds, the actual computation overhead incurred at the client side is merely several hundred milliseconds.

[Figure 9: bar chart; y-axis: % of clicks.]
Fig. 9. Clients' execution time of JavaScript challenge code in milliseconds
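The server-side estimate described above amounts to subtracting one round trip from the interval the server observes. A minimal sketch follows, assuming the server timestamps the moment it serves the challenge and the moment the result arrives; the function and parameter names are hypothetical, and the 200 ms default mirrors the RTT assumed in the text.

```python
def estimate_exec_time_ms(served_at_ms, report_received_at_ms, rtt_ms=200.0):
    """Estimate the client's JavaScript execution time from server timestamps.

    The interval seen by the server includes one network round trip, so the
    assumed RTT is subtracted; the result is clamped at zero.
    """
    elapsed = report_received_at_ms - served_at_ms
    return max(0.0, elapsed - rtt_ms)
```

With these assumptions, a result that arrives 900 ms after the challenge was served corresponds to an estimated 700 ms of client-side execution.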
We used SAR (System Activity Report) [22] to analyze server performance and measure the overhead on the server side. We observed no spike in server load. This is because most of the work involved in our detection happens on the client side, and the induced click-related traffic is insignificant in comparison with the server's normal traffic.
5 Discussion and Limitations
In this paper, we assume that a clickbot typically does not include its own JavaScript engine or access the full software stack of a legitimate web browser residing on the infected host. A sophisticated clickbot that implements a full browser agent itself would greatly increase its footprint and the likelihood of being detected. A clickbot might also utilize a legitimate web browser to generate activities, and could thus pass our browser functionality test. To identify such clickbots, we could further detect whether our ads and the advertised websites are really visible to users by utilizing a new feature provided by some ad networks. The new feature allows advertisers to instrument their ads with JavaScript code for a better understanding of what is happening to their ads on the client side. With this feature, we could detect whether our ad iframe is visible on the client's front-end screen rather than in the background, and whether it is really focused and clicked on.
In addition, compared to our user-visit related features (dwelling time, mouse events, scroll events, clicks, etc.), user-conversion related features (see footnote 3) are expected to have better discriminating power between clickbots, human clickers, and real users in browsing behaviors. However, our advertised site is a professional forum rather than an online retailer. A user registering (creating an account) on the forum is analogous to a purchase at an online retailer, but such a conversion from guest to member is too rare an event to rely upon to enhance our classifier.
Footnote 3: Purchasing a product, abandoning an online cart, proactive online chat, etc.
6 Related Work
Browser Fingerprinting. Browser fingerprinting allows a website to identify a client browser even though the client disables cookies. Existing browser fingerprinting techniques can be mainly classified into two categories, based on the information they need for fingerprinting. The first category fingerprints a browser by collecting application-layer information, including HTTP request header information and system configuration information from the browser [23]. The second category performs browser fingerprinting by examining coarse traffic generated by the browsers [24]. However, both of them have their limitations in detecting clickbots. Nearly all the application-layer information can be spoofed by sophisticated clickbots, and browser fingerprints may change quite rapidly over time [23]. In addition, an advertiser often cannot collect enough traffic information for fingerprinting the client from just one visit to the advertiser.
Compared to the existing browser fingerprinting techniques, our feature detection technique has three main advantages. First, clickbots cannot easily pass the functionality test unless they have implemented the main functionality present in modern browsers. Second, the client's functionality can be tested thoroughly at the advertiser's side even though the client visits the advertiser's landing page only once. Lastly, our technique continues to work over time as new browsers appear, because new browsers should also conform to those web standards currently supported by modern browsers.
Revealed Click Fraud. Several previous studies investigate known click fraud activities, and clickbots have been found to be continuously evolving and becoming more sophisticated. As the first study to analyze the functionality of a clickbot, Daswani et al. [3] dissected Clickbot.A and found that the clickbot could carry out a low-noise click fraud attack to avoid detection. Miller et al. [5] examined two other families of clickbots. They found that these two clickbots were more advanced than Clickbot.A in evading click fraud detection. One clickbot introduces indirection between bots and ad networks, while the other simulates human web browsing behaviors. Some other characteristics of clickbots are described in [4]. Clickbots generate fraudulent clicks periodically and only issue one fraudulent click in the background when a legitimate user clicks on a link, which makes fraudulent traffic hardly distinguishable from legitimate click traffic. Normal browsers may also be exploited to generate fraudulent click traffic. The traffic generated by a normal browser could be hijacked by currently visited malicious publishers and be further converted to fraudulent clicks [7]. The Ghost click botnet [6] leverages DNS changer malware to convert a victim's local DNS resolver into a malicious one and then launches ad replacement and click hijacking attacks. Our detection can identify each of these clickbots by actively performing a functionality test and can detect all other kinds of click fraud by examining their browsing behavior traffic on the server side.
Click Fraud Detection. Metwally et al. conducted an analysis on ad networks' traffic logs to detect publishers' non-coalition hit inflation fraud [8], coalition fraud [9], and duplicate clicks [10]. The main limitation of these works is that ad networks' traffic logs are usually not available to advertisers. Haddadi [11] and Dave et al. [4] suggested that advertisers use bait ads to detect fraudulent clicks on their ads. While bait ads have been proven effective in detection, advertisers have to spend extra money on those bait ads. Dave et al. [16] presented an approach to detecting fraudulent clicks from an ad network's perspective rather than an advertiser's perspective. Li et al. [7] introduced ad delivery path related features to detect malicious publishers and ad networks. However, monitoring and reconstructing the ad delivery path is time-consuming, which makes it difficult to detect click fraud in real time. Schulte et al. [25] detected client-side malware using the so-called program interactive challenge (PIC) mechanism. However, an intermediate proxy has to be introduced to examine all HTTP traffic between a client and a server, which would inevitably incur significant delay.

Like [4,11], our defense works at the server side, but it does not cause any extra cost for advertisers. Our work is the first to detect clickbots by testing their functionalities against the specifications widely conformed to by modern browsers. Most clickbots can be detected at this step, because they have either no such functionalities or limited functionalities compared to modern browsers. For the advanced clickbots and human clickers, we scrutinize their browsing behaviors on the advertised site, extract effective features, and train a classifier to identify them.
7 Conclusion
In this paper, we have proposed a new approach for advertisers to independently detect click fraud activities issued by clickbots and human clickers. Our proposed detection system performs two main tasks: proactive functionality testing and passive browsing behavior examination. The purpose of the first task is to detect clickbots. It requires a client to actively prove its authenticity as a full-fledged browser by executing a piece of JavaScript code. For more sophisticated clickbots and human clickers, we fulfill the second task by observing what a user does on the advertised site. Moreover, we scrutinize who initiates the click and which publisher website leads the user to the advertiser's site, by checking the legitimacy of the clients' IP addresses (source) and the reputation of the referring site (intermediate), respectively. We have implemented a prototype and deployed it on a large production website for performance evaluation. We have then run a real ad campaign for the website on a major ad network, during which we characterized the real click traffic from the ad campaign and provided advertisers a better understanding of ad click traffic, in terms of geographical distribution and publisher website distribution. Using the real ad campaign data, we have demonstrated that our detection system is effective in the detection of click fraud.
References
1. https://en.wikipedia.org/wiki/Online_advertising
2. http://www.spider.io/blog/2013/03/chameleon-botnet/
3. Daswani, N., Stoppelman, M.: The anatomy of Clickbot.A. In: Proceedings of the Workshop on Hot Topics in Understanding Botnets (2007)
4. Dave, V., Guha, S., Zhang, Y.: Measuring and fingerprinting click-spam in ad networks. In: Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication (2012)
5. Miller, B., Pearce, P., Grier, C., Kreibich, C., Paxson, V.: What's clicking what? Techniques and innovations of today's clickbots. In: Holz, T., Bos, H. (eds.) DIMVA 2011. LNCS, vol. 6739, pp. 164–183. Springer, Heidelberg (2011)
6. Alrwais, S.A., Dunn, C.W., Gupta, M., Gerber, A., Spatscheck, O., Osterweil, E.: Dissecting ghost clicks: Ad fraud via misdirected human clicks. In: Proceedings of the Annual Computer Security Applications Conference (2012)
7. Li, Z., Zhang, K., Xie, Y., Yu, F., Wang, X.: Knowing your enemy: Understanding and detecting malicious web advertising. In: Proceedings of the ACM Conference on Computer and Communications Security (2012)
8. Metwally, A.: Sleuth: Single-publisher attack detection using correlation hunting. In: Proceedings of the International Conference on Very Large Data Bases (2008)
9. Metwally, A.: Detectives: Detecting coalition hit inflation attacks in advertising networks streams. In: Proceedings of the International Conference on World Wide Web (2007)
10. Metwally, A., Agrawal, D., Abbadi, A.E.: Duplicate detection in click streams. In: Proceedings of the International Conference on World Wide Web (2005)
11. Haddadi, H.: Fighting online click-fraud using bluff ads. In: ACM SIGCOMM Computer Communication Review (2010)
12. Daswani, N., Mysen, C., Rao, V., Weis, S., Gharachorloo, K., Ghosemajumder, S.: Online advertising fraud. In: Crimeware: Understanding New Attacks and Defenses. Addison-Wesley Professional (2008)
13. http://taligarsiel.com/Projects/howbrowserswork1.htm
14. https://developer.yahoo.com/blogs/ydnfourblog/many-users-javascript-disabled-14121.html
15. http://gs.statcounter.com/
16. Dave, V., Guha, S., Zhang, Y.: Viceroi: Catching click-spam in search ad networks. In: Proceedings of ACM Conference on Computer and Communications Security (2013)
17. http://www.maxmind.com/en/web_services
18. http://en.wikipedia.org/wiki/Usage_share_of_web_browsers
19. http://www.blacklistalert.org/
20. http://www.cs.waikato.ac.nz/ml/weka/
21. Quinlan, J.: C4.5: Programs for machine learning. Morgan Kaufmann Publishers (1993)
22. http://en.wikipedia.org/wiki/Sar_Unix
23. Eckersley, P.: How unique is your web browser? In: Proceedings of the Privacy Enhancing Technologies Symposium (2010)
24. Yen, T.-F., Huang, X., Monrose, F., Reiter, M.K.: Browser fingerprinting from coarse traffic summaries: Techniques and implications. In: Flegel, U., Bruschi, D. (eds.) DIMVA 2009. LNCS, vol. 5587, pp. 157–175. Springer, Heidelberg (2009)
25. Schulte, B., Andrianakis, H., Sun, K., Stavrou, A.: Netgator: Malware detection using program interactive challenges. In: Flegel, U., Markatos, E., Robertson, W. (eds.) DIMVA 2012. LNCS, vol. 7591, pp. 164–183. Springer, Heidelberg (2013)
