Learning Useful System Call Attributes for Anomaly Detection

Gaurav Tandon and Philip K. Chan
Department of Computer Sciences
Florida Institute of Technology
Melbourne, FL 32901
{gtandon, pkc}@cs.fit.edu

Abstract

Traditional host-based anomaly detection systems model normal behavior of applications by analyzing system call sequences. The current sequence is then examined (using the model) for anomalous behavior, which could correspond to attacks. Though these techniques have been shown to be quite effective, a key element seems to be missing: the inclusion and utilization of the system call arguments. Recent research shows that sequence-based systems are prone to evasion. We propose learning different representations for system call arguments. Results indicate that this information can be effectively used for detecting more attacks with reasonable space and time overhead.

Introduction

Intrusion detection systems (IDSs) are generally categorized as signature-based and anomaly-based. In signature detection, systems are modeled upon known attack patterns and the test data is checked for the occurrence of these patterns. Such systems have a high degree of accuracy but suffer from the inability to detect novel attacks. Anomaly detection complements signature detection by modeling normal behavior of applications. Significant deviations from this behavior are considered anomalous. Such systems can detect novel attacks, but generate false alarms since not all anomalies are necessarily hostile. Intrusion detection systems can also be categorized as network-based, which deal with network traffic, and host-based, where operating system events are monitored.

Most of the traditional host-based anomaly detection systems focus on system call sequences, the assumption being that a malicious activity results in an abnormal (novel) sequence of system calls. Recent research has shown that sequence-based systems can be compromised by conducting mimicry attacks. Such attacks are possible by inserting dummy system calls with invalid arguments such that they form a legitimate sequence of events.

A drawback of sequence-based approaches lies in their non-utilization of other key attributes, namely the system call arguments. The efficacy of such systems might be improved upon if a richer set of attributes (return value, error status and other arguments) associated with a system call is used to create the model.

Copyright 2005, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

In this paper we present a host-based anomaly detection system that is based upon system call arguments. We learn the important attributes using a variant of a rule learning algorithm called LERAD. We also present various argument-based representations and compare their performance with some of the well-known sequence-based techniques.

Our main contributions are: (1) we incorporate various system call attributes (return value, error status and other arguments) for better application modeling; (2) we propose enriched representations using system call sequences and arguments; (3) we use a variant of a rule learning algorithm to learn the important attributes from the feature space; (4) we demonstrate the effectiveness of our models (in terms of number of attack detections, time and space overhead) by performing experiments on three different data sets; and (5) we present an analysis of the anomalies detected. Our sequence-based model detects more attacks than traditional techniques, indicating that the rule learning technique is able to generalize well. Our argument-based systems are able to detect more attacks than their sequence-based counterparts. The time and space requirements for our models are reasonable for online detection.

Related Work

Time-delay embedding (tide) records normal application executions using look-ahead pairs (Forrest et al. 1996). UNIX command sequences were also examined to capture user profiles and compute sequence similarity using adjacent events in a sliding window (Lane and Brodley 1997). Sequence time-delay embedding (stide) memorizes all contiguous sequences of predetermined, fixed lengths during training (Warrender, Forrest, and Pearlmutter 1999). A further extension, called sequence time-delay embedding with (frequency) threshold (t-stide), was similar to stide with the exception that the frequencies of these fixed length sequences were also taken into account. Rare sequences were excluded from the normal sequence database in this approach. All these techniques modeled normal behavior by using fixed length patterns of training sequences.
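
As a rough illustration of the fixed-length methods described above, the sketch below stores every contiguous window seen in training (as stide does) and optionally treats windows seen fewer than `min_count` times as absent, in the spirit of t-stide. The function names, the count-based threshold and the toy traces are assumptions made for illustration; they are not taken from the cited implementations.

```python
from collections import Counter

def windows(trace, k=6):
    """All contiguous length-k subsequences of a system call trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def train(traces, k=6):
    """Count every length-k window observed in the normal traces."""
    db = Counter()
    for trace in traces:
        db.update(windows(trace, k))
    return db

def mismatches(db, trace, k=6, min_count=1):
    """Windows of `trace` seen fewer than `min_count` times in training.

    min_count=1 behaves like stide; a larger value drops rare sequences,
    which is the idea behind t-stide (the exact threshold is assumed here).
    """
    return [w for w in windows(trace, k) if db[w] < min_count]

# Toy example with a window of 3 to keep the traces short.
normal = [["open", "read", "close", "open", "read", "close"]]
db = train(normal, k=3)
print(mismatches(db, ["open", "read", "close", "exec"], k=3))
```

Both variants only ask whether a window was memorized, so they cannot react to the arguments carried by each call.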
A scheme to generate variable length patterns by using Teiresias (Rigoutsos and Floratos 1998), a pattern-discovery algorithm in biological sequences, was proposed in (Wespi, Dacier, and Debar 1999, 2000). These techniques improved upon the fixed length methods. Though all the above approaches use system call sequences, none of them make use of the system call arguments. Given some knowledge about the IDS, attackers can devise methodologies to evade such intrusion detection systems (Tan, Killourhy, and Maxion 2002; Wagner and Soto 2002). Such attacks might be detected if the system call arguments are also evaluated (Kruegel et al. 2003), and this motivates our current work. Our technique models only the important characteristics and generalizes from them; previous work emphasizes the structure of all the arguments.

Approach

Since our goal is to detect host-based intrusions, system calls are instrumental in our system. We incorporate the system calls with their arguments to generate a richer model. Then we present different representations for modeling a system using LERAD, which is discussed next.

Learning Rules for Anomaly Detection (LERAD)

Algorithms for finding association rules, such as Apriori (Agrawal, Imielinski, and Swami 1993), generate a large number of rules. This incurs a large overhead and may not be appropriate for online detection. We would like to have a minimal set of rules describing the normal training data. LERAD is a conditional rule-learning algorithm that forms a small set of rules. It is briefly described here; more details can be obtained from (Mahoney and Chan 2003).
LERAD learns rules of the form:

$$A = a,\; B = b,\; \ldots \;\Rightarrow\; X \in \{x_1, x_2, \ldots\} \qquad (1)$$

where A, B, and X are attributes and a, b, x_1, x_2 are values for the corresponding attributes. The learned rules represent the patterns present in the normal training data. The set {x_1, x_2, ...} in the consequent constitutes all unique values of X when the antecedent occurs in the training data.

During the detection phase, records (or tuples) that match the antecedent but not the consequent of a rule are considered anomalous and an anomaly score is associated with every rule violation. The degree of anomaly is based on a probabilistic model. For each rule, from the training data, the probability, p, of observing a value not in the consequent is estimated by:

$$p = r / n \qquad (2)$$

where r is the cardinality of the set, {x_1, x_2, ...}, in the consequent and n is the number of records (tuples) that satisfy the rule during training. This probability estimation of novel (zero frequency) events is from (Witten and Bell 1991). Since p estimates the probability of a novel event, the larger p is, the less anomalous a novel event is. Hence, during detection, when a novel event is observed, the degree of anomaly (anomaly score) is estimated by:

$$\text{Anomaly Score} = 1 / p = n / r \qquad (3)$$

A non-stationary model is assumed for LERAD: only the last occurrence of an event is assumed important. Since novel events are bursty in conjunction with attacks, a factor t is introduced; it is the time interval since the last novel (anomalous) attribute value. When a novel event occurred recently (small value of t), a novel event is more likely to occur at the present moment. Hence, the anomaly score is measured by t/p. Since a record can deviate from the consequent of more than one rule, the total anomaly score of a record is aggregated over all the rules violated by the tuple to combine the effect from violation of multiple rules:

$$\text{Total Anomaly Score} = \sum t / p = \sum t\,n / r \qquad (4)$$

The more the violations, the more significant the anomaly is, and the higher the anomaly score should be. An alarm is raised if the total anomaly score is above a threshold.
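
The scoring in Equations 2 to 4 can be sketched in a few lines. This is a minimal illustration, not the LERAD implementation: the `Rule` class, its field names, the dictionary record format and the choice of 0 as the initial "last novel" time are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    antecedent: dict         # conditions, e.g. {"syscall": "open"}
    target: str              # attribute X in the consequent
    allowed: set             # {x1, x2, ...} observed in training
    n: int                   # training records satisfying the antecedent
    last_novel: float = 0.0  # time of the last novel value (assumed to start at 0)

    def applies(self, record):
        return all(record.get(k) == v for k, v in self.antecedent.items())

def total_anomaly_score(rules, record, now):
    """Sum t * n / r over every rule the record violates (Equation 4)."""
    total = 0.0
    for rule in rules:
        if rule.applies(record) and record.get(rule.target) not in rule.allowed:
            t = now - rule.last_novel    # time since the last novel value
            r = len(rule.allowed)
            total += t * rule.n / r      # t / p with p = r / n
            rule.last_novel = now        # only the last occurrence matters
    return total

rule = Rule({"syscall": "open"}, "err", {"0"}, n=455)
novel = {"syscall": "open", "err": "13"}
print(total_anomaly_score([rule], novel, now=10.0))  # 10 * 455 / 1 = 4550.0
print(total_anomaly_score([rule], novel, now=12.0))  # 2 * 455 / 1 = 910.0
```

The second call scores lower because only 2 time units have passed since the previous novel value, which is how the factor t damps a burst of repeated anomalies.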

The rule generation phase of LERAD comprises four main steps:
(i) Generate initial rule set: Training samples are picked at random from a random subset S of training examples. Candidate rules (as depicted in Equation 1) are generated from these samples.
(ii) Coverage test: The rule set is filtered by removing rules that do not cover/describe all the training examples in S. Rules with a lower rate of anomalies (lower r/n) are kept.
(iii) Update rule set beyond S: Extend the rules over the remaining training data by adding values for the attribute in the consequent when the antecedent is true.
(iv) Validate the rule set: Rules are removed if they are violated by any tuple in the validation set.

Since the system call is the key (pivotal) attribute in a host-based system, we modified LERAD such that the rules were forced to have a system call as a condition in the antecedent. The only exception we made was the generation of rules with no antecedent.
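
The four steps can be sketched as follows. This is a simplified reading rather than the authors' code: the pairing scheme in `candidates`, the "covers something new" test in `coverage_filter`, the single-condition antecedents and all names are assumptions, and rules with no antecedent are omitted.

```python
import random

def holds(antecedent, record):
    return all(record[a] == v for a, v in antecedent)

def fit(rule, records):
    """Collect consequent values and the count n where the antecedent holds."""
    antecedent, target = rule
    matched = [rec[target] for rec in records if holds(antecedent, rec)]
    return set(matched), len(matched)

def candidates(sample, attrs, rng, pairs=50, key="syscall"):
    """Step (i): propose rules from random pairs of records in S (assumed scheme)."""
    rules = []
    for _ in range(pairs):
        a, b = rng.sample(sample, 2)
        shared = [x for x in attrs if a[x] == b[x]]
        others = [x for x in shared if x != key]
        if key in shared and others:          # force a system call condition
            rules.append((((key, a[key]),), rng.choice(others)))
    return list(dict.fromkeys(rules))         # drop duplicates, keep order

def coverage_filter(rules, sample):
    """Step (ii): prefer low r/n; drop rules that cover nothing new in S."""
    def rate(rule):
        values, n = fit(rule, sample)
        return len(values) / n
    covered, kept = set(), []
    for rule in sorted(rules, key=rate):
        antecedent, target = rule
        new = {(i, target) for i, rec in enumerate(sample) if holds(antecedent, rec)}
        if not new <= covered:
            kept.append(rule)
            covered |= new
    return kept

def learn(train, validation, attrs, sample_size=100, seed=0):
    rng = random.Random(seed)
    sample = rng.sample(train, min(sample_size, len(train)))
    model = []
    for rule in coverage_filter(candidates(sample, attrs, rng), sample):
        antecedent, target = rule
        values, n = fit(rule, train)          # step (iii): extend beyond S
        violated = any(holds(antecedent, rec) and rec[target] not in values
                       for rec in validation) # step (iv): validation check
        if not violated:
            model.append((antecedent, target, values, n))
    return model
```

Each returned entry carries the consequent set and n, so r/n for Equation 2 can be read off directly as `len(values) / n`.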

System call and argument based representations

We now present the different representations for LERAD.

Sequence of system calls: S-LERAD. Using a sequence of system calls is a very popular approach for anomaly detection. We used a window of fixed length 6 (as this is claimed to give the best results in stide and t-stide) and fed these sequences of six system call tokens as input tuples to LERAD. This representation is selected to explore whether LERAD would be able to capture the correlations among system calls in a sequence. Also, this experiment would assist us in comparing results by using the same algorithm for system call sequences as well as their arguments. A sample rule learned in a particular run of S-LERAD is:

$$SC_1 = \text{close},\; SC_2 = \text{mmap},\; SC_6 = \text{open} \;\Rightarrow\; SC_3 \in \{\text{munmap}\} \qquad (1/p \text{ value} = 455/1)$$

This rule is analogous to encountering close as the first system call (represented as SC1), followed by mmap and munmap, and open as the sixth system call (SC6) in a window of size 6 sliding across the audit trail. Each rule is associated with an n/r value. The number 455 in the numerator refers to the number of training instances that comply with the rule (n in Equation 3). The number 1 in the denominator implies that there exists just one distinct value of the consequent (munmap in this case) when all the conditions in the premise hold true (r in Equation 3).
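
To make the S-LERAD input concrete, the sketch below slides a window of six calls over a trace, names the positions SC1 to SC6, and checks the sample rule against each window. The helper names, the dictionary layout and the toy trace are assumptions for illustration.

```python
def s_lerad_tuples(trace, k=6):
    """Slide a window of k system calls over the trace; name positions SC1..SCk."""
    return [
        {f"SC{j + 1}": trace[i + j] for j in range(k)}
        for i in range(len(trace) - k + 1)
    ]

# The sample rule from the text, written as (antecedent, attribute, allowed values).
rule = ({"SC1": "close", "SC2": "mmap", "SC6": "open"}, "SC3", {"munmap"})

def violates(rule, record):
    """True when the antecedent matches but the consequent value is novel."""
    antecedent, target, allowed = rule
    applies = all(record[k] == v for k, v in antecedent.items())
    return applies and record[target] not in allowed

trace = ["close", "mmap", "munmap", "read", "write", "open", "close"]
for record in s_lerad_tuples(trace):
    print(record, violates(rule, record))
```

Replacing the third call with any value other than munmap would make the first window violate the rule, contributing n/r = 455 (times t) to the anomaly score.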

Argument-based model: A-LERAD. We propose that argument and other key attribute information is integral to modeling a good host-based anomaly detection system. We extracted arguments, return value and error status of system calls from the audit logs and examined the effects of learning rules based upon system calls along with these attributes. Any value for the other arguments (given the system call) that was never encountered in the training period for a long time would raise an alarm. We performed experiments on the training data to measure the maximum number of attributes (MAX) for every unique system call. We did not use the test data for these experiments so that we do not get any information about it before our model is built. Since LERAD accepts the same (fixed) number of attributes for every tuple, we had to insert a NULL value for an attribute that was absent in a particular system call. The order of the attributes within the tuple was made system call dependent. Since we modified LERAD to form rules based upon the system calls, there is consistency amongst the attributes for any specific system call across all models. By including all attributes we utilized the maximum amount of information possible.
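
The fixed-width A-LERAD tuple can be sketched as below. The event format, the attribute strings and the decision to truncate test events that carry more than MAX attributes are assumptions; the text only states that MAX is measured on training data and that missing attributes become NULL.

```python
NULL = "NULL"

def max_attributes(training_events):
    """MAX: the largest attribute count of any system call in the training data."""
    return max(len(event["attrs"]) for event in training_events)

def a_lerad_tuple(event, max_len):
    """System call followed by exactly max_len attribute slots, NULL-padded."""
    attrs = list(event["attrs"])[:max_len]   # truncation is an assumption
    padding = [NULL] * (max_len - len(attrs))
    return (event["syscall"], *attrs, *padding)

training = [
    {"syscall": "open", "attrs": ["path=/etc/passwd", "flags=O_RDONLY", "ret=3", "err=0"]},
    {"syscall": "close", "attrs": ["ret=0", "err=0"]},
]
MAX = max_attributes(training)
for event in training:
    print(a_lerad_tuple(event, MAX))
```

Because slot positions mean different things for different calls, rules are only meaningful when conditioned on the system call, which is what the modified LERAD enforces.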

Merging system call sequence and argument information of the current system call: M-LERAD. The first representation we discussed is based upon a sequence of system calls; the second one takes into consideration other relevant attributes, whose efficacy we claim in this paper; so fusing the two to study the effects was an obvious choice. Merging is accomplished by adding more attributes in each tuple before input to LERAD. Each tuple now comprises the system call, MAX number of attributes for the current system call, and the previous five system calls. The n/r values obtained from all the rules violated are aggregated into an anomaly score, which is then used to generate an alarm based upon the threshold.

Merging system call sequence and argument information for all system calls in the sequence: M*-LERAD. All the proposed variants, namely S-LERAD, A-LERAD and M-LERAD, consider a sequence of 6 system calls and/or take into account the arguments for the current system call. We propose another variant called multiple argument LERAD (M*-LERAD): in addition to using the system call sequence and the arguments for the current system call, the tuples now also comprise the arguments for all system calls within the fixed length sequence of size 6. Each tuple now comprises the current system call, MAX attributes for the current system call, 5 previous system calls and MAX attributes for each of those system calls.
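
The two merged layouts differ only in how much of the window carries arguments, as the sketch below shows. The field order inside each tuple (current call first, then earlier calls in chronological order), the event format and the helper names are assumptions.

```python
NULL = "NULL"

def pad(attrs, max_len):
    attrs = list(attrs)[:max_len]
    return attrs + [NULL] * (max_len - len(attrs))

def m_lerad_tuples(events, max_len, k=6):
    """Current call + its MAX attributes + the previous k-1 call names."""
    rows = []
    for i in range(k - 1, len(events)):
        current, previous = events[i], events[i - k + 1:i]
        rows.append((current["syscall"], *pad(current["attrs"], max_len),
                     *[e["syscall"] for e in previous]))
    return rows

def m_star_lerad_tuples(events, max_len, k=6):
    """Every call in the window contributes its name and MAX attributes."""
    rows = []
    for i in range(k - 1, len(events)):
        row = []
        for e in [events[i], *events[i - k + 1:i]]:
            row += [e["syscall"], *pad(e["attrs"], max_len)]
        rows.append(tuple(row))
    return rows

names = ["open", "read", "mmap", "munmap", "write", "close", "exit"]
events = [{"syscall": s, "attrs": [f"ret={i}"]} for i, s in enumerate(names)]
print(len(m_lerad_tuples(events, 2)[0]))       # 1 + MAX + 5 fields
print(len(m_star_lerad_tuples(events, 2)[0]))  # 6 * (1 + MAX) fields
```

In the M*-LERAD layout each event appears in up to six consecutive tuples, so one unusual argument can trigger rule violations in all of them.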

Experimental Evaluation

Our goal is to study if LERAD can be modified to detect attack-based anomalies with feature spaces comprising system calls and their arguments.

Data sets and experimental procedure

We used the following data sets for our experiments:
(i) The 1999 DARPA intrusion detection evaluation data set: Developed at the MIT Lincoln Lab, we selected the BSM logs from a Solaris host tracing system calls that contain 33 attacks. Attack classification is provided in (Kendell 1999). The following applications were chosen: ftpd, telnetd, sendmail, tcsh, login, ps, eject, fdformat, sh, quota and ufsdump, due to their varied sizes (from 1500 to over 1 million system calls). We expected to find a good mix of benign and malicious behavior in these applications. Training was performed on week 3 data and testing on weeks 4 and 5. An attack is considered to be detected if an alarm is raised within 60 seconds of its occurrence (same as the DARPA evaluation).
(ii) lpr, login and ps applications from the University of New Mexico (UNM): The lpr application comprised 2703 normal traces collected from 77 hosts running SUNOS 4.1.4 at the MIT AI Lab. Another 1001 traces result from the execution of the lprcp attack script. Traces from the login and ps applications were obtained from Linux machines running kernel 2.0.35. Homegrown Trojan programs were used for the attack traces.
(iii) Microsoft Excel macro executions (FIT-UTK data): Normal Excel macro executions are logged in 36 distinct traces. 2 malicious traces modify registry settings and execute some other application. Such behavior is exhibited by the ILOVEYOU worm, which opens the web browser to a specified website and executes a program, modifying registry keys and corrupting user files, resulting in a distributed denial of service (DDoS) attack.

The input tuples for S-LERAD were 6 contiguous system calls; for A-LERAD they were system calls with their return value, error status and arguments; the inputs for M-LERAD were sequences of system calls with arguments of the current system call; whereas in M*-LERAD, they were system call sequences with arguments for all the 6 system calls. For tide, the inputs were all the pairs of system calls within a window of fixed size 6; stide and t-stide comprised all contiguous sequences of length 6. For all the techniques, alarms were merged in decreasing order of the anomaly scores and evaluated at varied false alarm rates.
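
The evaluation procedure (rank alarms by score, then count attacks detected within a false alarm budget) can be sketched as follows. The data layout, the single timestamp per attack and the function name are assumptions; only the 60-second window and the score ordering come from the text.

```python
def detections_at_budget(alarms, attacks, max_false_alarms, window=60.0):
    """Count attacks detected before the false alarm budget is exceeded.

    alarms:  list of (time_in_seconds, anomaly_score)
    attacks: list of (name, time_in_seconds), one timestamp each (simplified)
    """
    detected, false_alarms = set(), 0
    # Walk alarms from the highest anomaly score downward.
    for time, _score in sorted(alarms, key=lambda a: a[1], reverse=True):
        hits = {name for name, start in attacks if abs(time - start) <= window}
        if hits:
            detected |= hits
        elif false_alarms == max_false_alarms:
            break                      # one more false alarm would exceed the budget
        else:
            false_alarms += 1
    return len(detected)

alarms = [(100, 9.0), (500, 8.0), (1000, 7.0), (2000, 6.0), (3010, 5.0)]
attacks = [("a", 130), ("b", 1000), ("c", 3000)]
for budget in (0, 1, 2):
    print(budget, detections_at_budget(alarms, attacks, budget))
```

Sweeping the budget in this way traces out a detections-versus-false-alarms curve of the kind plotted in Figure 1.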

Results

Since t-stide is supposed to give the best results among the sequence-based techniques, we compared its performance with S-LERAD on the UNM and FIT-UTK data sets.

Table 1: t-stide vs. S-LERAD (UNM, FIT-UTK data). The last two columns give the number of attacks detected, with the number of false alarms in parentheses.

Program name   Number of training sequences   Number of test sequences   t-stide   S-LERAD
lpr            1000                           2704                       1 (0)     1 (1)
ps             12                             27                         2 (58)    2 (2)
login          8                              8                          1 (0)     1 (1)
excel          32                             6                          2 (92)    2 (0)

[Figure 1 plot: number of detections (Y-axis) against false alarms in units of 10^-3 % per day (X-axis) for tide, stide, t-stide, S-LERAD, A-LERAD, M-LERAD and M*-LERAD.]
Figure 1. Number of detections (DARPA/LL data).

[Figure 2 plot: number of detections (Y-axis) per attack type (DoS, U2R, R2L on the X-axis) for tide, stide, t-stide, S-LERAD, A-LERAD, M-LERAD and M*-LERAD.]
Figure 2. Number of detections at 10 false alarms per day for different attack categories (DARPA/LL data).

Results from Table 1 show that both techniques were able to detect all the attacks. However, t-stide generated more false alarms for ps and excel.

We also performed experiments on the DARPA/LL data sets to evaluate all the techniques. Figure 1 illustrates the total attacks detected (Y-axis) at varied false alarm rates (X-axis). At zero false alarms, tide, stide and t-stide detected the most attacks, suggesting that maximum deviations in temporal sequences are true representations of actual attacks. But as the threshold is relaxed, S-LERAD outperformed all three sequence-based techniques. This can be attributed to the fact that S-LERAD is able to generalize well and learns the important correlations.

The UNM and FIT-UTK data sets do not have complete argument information to evaluate LERAD variants that involve arguments. For the DARPA/LL data set, A-LERAD fared better than S-LERAD and the other sequence-based techniques (Figure 1), suggesting that argument information is more useful than sequence information. Using arguments could also make a system robust against mimicry attacks which evade sequence-based systems. It can also be seen that the A-LERAD curve closely follows the curve for M-LERAD. This implies that the sequence information is redundant; it does not add substantial information to what is already gathered from arguments.
M*-LERAD performed the worst among all the techniques at false alarm rates lower than 0.5 x 10^-3 % per day. The reason for such a performance is that M*-LERAD generated alarms for both sequence and argument based anomalies. An anomalous argument in one system call raised an alarm in six different tuples, leading to a higher false alarm rate. As the alarm threshold was relaxed, the detection rate improved.

The better performance of the LERAD variants can be attributed to LERAD's anomaly scoring function. It associates a probabilistic score with every rule. Instead of a binary (present/absent) value (as in the case of stide and t-stide), this probability value is used to compute the degree of anomalousness. It also incorporates a parameter for the time elapsed since a novel value was seen for an attribute. The advantage is twofold: (i) it assists in detecting long term anomalies; (ii) it suppresses the generation of multiple alarms for novel attribute values in a sudden burst of data.

Figure 2 plots the results at 10 false alarms per day, making a total of 100 false alarms for the 10 days of testing (the criterion used in the 1999 DARPA evaluation). Different attack types are represented along the X-axis and the Y-axis denotes the total attacks detected in each attack category. M-LERAD was able to detect the largest number of attacks: 5 DoS, 3 U2R and 6 R2L attacks. An interesting observation is that the sequence-based techniques generally detected the U2R attacks whereas the R2L and DoS attacks were better detected by the argument-based techniques. Our techniques were able to detect some poorly detected attacks quoted in (Lippmann et al. 1999), warezclient being one of them. Our models also detected 3 stealthy ps attacks.

Table 2. A-LERAD vs. AC-LERAD (DARPA/LL). Entries are the number of detections.

False alarms per day   A-LERAD   AC-LERAD
5                      10        9
10                     13        11
20                     17        16

Experiments were performed to see if NULL attributes help in detecting anomalies or if they formed meaningless rules. We added a constraint that the NULL values could not be added to the attribute values in the rules. We call this variant AC-LERAD (A-LERAD with constraint). Table 2 summarizes the results. A-LERAD was able to detect more attacks than the constrained counterpart, suggesting that rules with NULL valued attributes are beneficial to the detection of anomalies corresponding to attacks.

Analysis of anomalies

An anomaly is a deviation from normalcy and, by definition, does not necessarily identify the nature of an attack. Anomaly detection serves as an early warning system; humans need to investigate if an anomaly actually corresponds to a malicious activity. The anomalies that led to the attacks detected by argument-based variants of LERAD, in many cases, do not represent the true nature of the attacks. Instead, they may be representative of behavioral patterns resulting from the execution of some other program after the intruder successfully gained access to the host. For example, an instance of the guest attack is detected by A-LERAD not by observing attempts by the hacker trying to gain access, but by encountering novel arguments to the ioctl system call, which was executed by the hacker trying to perform a control function on a particular device. A stealthy ps attack was detected by our system when the intruder tried to change owner using a novel group id.

Even if the anomaly is related to the attack itself, it may reflect very little information about the attack. Our system is able to learn only a partial signature of the attack. Guessftp is detected by a bad password for an illegitimate user trying to gain access. However, the attacker could have made interspersed attempts to evade the system.

Attacks were also detected by capturing errors committed by the intruder, possibly to evade the IDS. Ftpwrite is a vulnerability that exploits a configuration error wherein a remote ftp user is able to successfully create and add files (such as .rhost) and gain access to the system. This attack is detected by monitoring the subsequent actions of the intruder, wherein he attempts to set the audit state using an invalid preselection mask. This anomaly would go unnoticed in a system monitoring only system calls.

Table 3. Top anomalous attributes for A-LERAD.

Attribute causing false alarm   Whether some attack was detected by the same attribute
ioctl argument                  Yes
ioctl return value              Yes
setegid mask                    Yes
open return value               No
open error status               No
fcntl error status              No
setpgrp return value            No

We re-emphasize that our goal is to detect anomalies, the underlying assumption being that anomalies generally correspond to attacks. Since not all anomalous events are malicious, we expect false alarms to be generated. Table 3 lists the attributes responsible for the generation of alarms and whether these resulted in actual detections or not. It is observed that some anomalies were part of benign application behavior. At other instances, the anomalous value for the same attribute was responsible for detecting actual malicious execution of processes. As an example, many attacks were detected by observing novel arguments for the ioctl system call, but many false alarms were also generated by this attribute. Even though not all novel values correspond to illegitimate activity, argument-based anomalies were instrumental in detecting the attacks.

Time and space requirements

Compared to sequence-based methods, our techniques extract and utilize more information (system call arguments and other attributes), making it imperative to study the feasibility of our techniques for online usage. For t-stide, all contiguous system call sequences of length 6 are stored during training. For A-LERAD, system call sequences and other attributes are stored. In both cases, space complexity is of the order of O(n), where n is the total number of system calls, though the A-LERAD requirement is larger by a constant factor k since it stores additional argument information. During detection, A-LERAD uses only a small set of rules (in the range 14-25 for the applications used in our experiments). t-stide, on the other hand, still requires the entire database of fixed length sequences during testing, which incurs larger space overhead during detection. We conducted experiments on the tcsh application, which comprises over 2 million system calls in training and has over 7 million system calls in test data. The rules formed by A-LERAD require around 1 KB of space, apart from a mapping table to map strings and integers. The memory requirements for storing a system call sequence database for t-stide were over 5 KB plus a mapping table between strings and integers. The results suggest that A-LERAD has better memory requirements during the detection phase. We reiterate that the training can be done offline. Once the rules are generated, A-LERAD can be used to do online testing with lower memory requirements.

The time overhead incurred by A-LERAD and t-stide in our experiments is given in Table 4. The CPU times have been obtained on a Sun Ultra 5 workstation with 256 MB RAM and 400 MHz processor speed. It can be inferred from the results that A-LERAD is slower than t-stide. During training, t-stide is a much simpler algorithm and processes less data than A-LERAD for building a model and hence t-stide has a much shorter training time. During detection, t-stide just needs to check if a sequence is present in the database, which can be efficiently implemented with a hash table. On the other hand, A-LERAD needs to check if a record matches any of the learned rules. Also, A-LERAD has to process additional argument information. Run-time performance of A-LERAD can be improved with a more efficient rule matching algorithm. Also, t-stide will incur significantly larger time overhead when the stored sequences exceed the memory capacity and disk accesses become unavoidable; A-LERAD does not encounter this problem as easily as t-stide since it will still use a small set of rules. Moreover, the run-time overhead of A-LERAD is about tens of seconds for days of data, which is reasonable for practical purposes.
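
One possible way to speed up rule matching, offered here only as a hypothetical sketch and not as something the paper evaluates, is to index the rules by the system call in their antecedent, since the modified LERAD forces every rule to contain one. The rule layout and names below are assumptions.

```python
from collections import defaultdict

def index_by_syscall(rules):
    """Group rules by the system call named in their antecedent."""
    index = defaultdict(list)
    for rule in rules:
        index[rule["syscall"]].append(rule)
    return index

def violated_rules(index, record):
    """Check only the rules whose system call matches the record's."""
    return [
        rule for rule in index.get(record["syscall"], ())
        if all(record.get(k) == v for k, v in rule["extra"].items())
        and record.get(rule["target"]) not in rule["allowed"]
    ]

rules = [
    {"syscall": "ioctl", "extra": {}, "target": "arg", "allowed": {"TCGETS"}},
    {"syscall": "open", "extra": {}, "target": "err", "allowed": {"0"}},
]
index = index_by_syscall(rules)
print(violated_rules(index, {"syscall": "ioctl", "arg": "0x1234"}))
```

With a rule set of the size reported above, the lookup touches only the few rules for the current call instead of scanning the whole set for every record.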

Table 4. Execution time comparison.

              Training time (seconds)      Testing time (seconds)
              [on 1 week of data]          [on 2 weeks of data]
Application   t-stide      A-LERAD         t-stide      A-LERAD
ftpd          0.19         0.90            0.19         0.89
telnetd       0.96         7.12            1.05         9.79
ufsdump       6.76         30.04           0.42         1.66
tcsh          6.32         29.56           5.91         29.38
login         2.41         15.12           2.45         15.97
sendmail      2.73         14.79           3.23         19.63
quota         0.20         3.04            0.20         3.01
sh            0.21         2.98            0.40         3.93

Conclusions

In this paper, we portrayed the efficacy of incorporating system call argument information and used a rule-learning algorithm to model a host-based anomaly detection system.
Based upon experiments on various data sets, we claim that our argument-based model, A-LERAD, detected more attacks than all the sequence-based techniques. Our sequence-based variant (S-LERAD) was also able to generalize better than the prevalent sequence-based techniques, which rely on pure memorization.

Merging argument and sequence information creates a richer model for anomaly detection, as illustrated by the empirical results of M-LERAD. M*-LERAD detected fewer attacks at lower false alarm rates since every anomalous attribute results in alarms being raised in 6 successive tuples, leading to either multiple detections of the same attack (counted as a single detection) or multiple false alarms (all separate entities). Results also indicated that sequence-based methods help detect U2R attacks whereas R2L and DoS attacks were better detected by argument-based models.

Our argument-based techniques detected different types of anomalies. Some anomalies did not represent the true nature of the attack. Some attacks were detected by subsequent anomalous user behavior, like trying to change group ownership. Some other anomalies were detected by learning only a portion of the attack, while some were detected by capturing intruder errors.

Though our techniques incur higher time overhead than t-stide due to their complexity (since more information is processed), they build more succinct models that incur much less space overhead: our techniques aim to generalize from the training data, rather than rely on pure memorization. Moreover, 3 seconds per day (the most an application took during the testing phase) is reasonable for online systems, even though it is significantly longer than t-stide.

Though our techniques did detect more attacks with fewer false alarms, there arises a need for more sophisticated attributes. Instead of having a fixed sequence, we could extend our models to incorporate variable length sub-sequences of system calls. Even the argument-based models are of fixed window size, creating a need for a model accepting varied argument information. Our techniques can be easily extended to monitor audit trails in continuum. Since we model each application separately, some degree of parallelism can also be achieved to test process sequences as they are being logged.

References

Agrawal, R.; Imielinski, T.; and Swami, A. 1993. Mining association rules between sets of items in large databases. ACM SIGMOD, 207-216.
Forrest, S.; Hofmeyr, S.; Somayaji, A.; and Longstaff, T. 1996. A Sense of Self for UNIX Processes. IEEE Symposium on Security and Privacy, 120-128.
Kendell, K. 1999. A Database of Computer Attacks for the Evaluation of Intrusion Detection Systems. Masters Thesis, MIT.
Kruegel, C.; Mutz, D.; Valeur, F.; and Vigna, G. 2003. On the Detection of Anomalous System Call Arguments. European Symposium on Research in Computer Security, 326-343.
Lane, T., and Brodley, C. E. 1997. Sequence Matching and Learning in Anomaly Detection for Computer Security. AAAI Workshop on AI Approaches to Fraud Detection and Risk Management, 43-49.
Lippmann, R.; Haines, J.; Fried, D.; Korba, J.; and Das, K. 2000. The 1999 DARPA Off-Line Intrusion Detection Evaluation. Computer Networks, 34:579-595.
Mahoney, M., and Chan, P. 2003. Learning Rules for Anomaly Detection of Hostile Network Traffic. IEEE International Conference on Data Mining, 601-604.
Rigoutsos, I., and Floratos, A. 1998. Combinatorial pattern discovery in biological sequences. Bioinformatics, 14(1):55-67.
Tan, K. M. C.; Killourhy, K. S.; and Maxion, R. A. 2002. Undermining an Anomaly-based Intrusion Detection System Using Common Exploits. RAID, 54-74.
Wagner, D., and Soto, P. 2002. Mimicry Attacks on Host-Based Intrusion Detection Systems. ACM CCS, 255-264.
Warrender, C.; Forrest, S.; and Pearlmutter, B. 1999. Detecting Intrusions Using System Calls: Alternative Data Models. IEEE Symposium on Security and Privacy, 133-145.
Wespi, A.; Dacier, M.; and Debar, H. 1999. An Intrusion-Detection System Based on the Teiresias Pattern-Discovery Algorithm. EICAR Conference, 1-15.
Wespi, A.; Dacier, M.; and Debar, H. 2000. Intrusion detection using variable-length audit trail patterns. RAID, 110-129.
Witten, I., and Bell, T. 1991. The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression. IEEE Trans. on Information Theory, 37(4):1085-1094.
