Chapter 21

DATA CORPORA FOR DIGITAL FORENSICS EDUCATION AND RESEARCH

York Yannikos, Lukas Graner, Martin Steinebach, and Christian Winter

G. Peterson and S. Shenoi (Eds.): Advances in Digital Forensics X, IFIP AICT 433, pp. 309–325, 2014. © IFIP International Federation for Information Processing 2014

Abstract
Data corpora are very important for digital forensics education and research. Several corpora are available to academia; these range from small manually-created data sets of a few megabytes to many terabytes of real-world data. However, different corpora are suited to different forensic tasks. For example, real data corpora are often desirable for testing forensic tool properties such as effectiveness and efficiency, but these corpora typically lack the ground truth that is vital to performing proper evaluations. Synthetic data corpora can support tool development and testing, but only if the methodologies for generating the corpora guarantee data with realistic properties.
This paper presents an overview of the available digital forensic corpora and discusses the problems that may arise when working with specific corpora. The paper also describes a framework for generating synthetic corpora for education and research when suitable real-world data is not available.

Keywords: Forensic data corpora, synthetic disk images, model-based simulation

1. Introduction
A digital forensic investigator must have a broad knowledge of forensic methodologies and experience with a wide range of tools. This includes multi-purpose forensic suites with advanced functionality and good usability as well as small tools for special tasks that may have moderate to low usability. Gaining expert-level skills in the operation of forensic tools requires a substantial amount of time. Additionally, advances in analysis methods, tools and technologies require continuous learning to maintain currency.

In digital forensics education, it is important to provide insights into specific technologies and how forensic methods must be applied to perform thorough and sound analyses.
It is also very important to provide a rich learning environment where students can use forensic tools to rigorously analyze suitable test data.

The same is true in digital forensics research. New methodologies and new tools have to be tested against well-known data corpora. This provides a basis for comparing methodologies and tools so that the advantages and shortcomings can be identified. Forensic investigators can use the results of such evaluations to make informed decisions about the methodologies and tools that should be used for specific tasks. This helps increase the efficiency and the quality of forensic examinations while allowing objective evaluations by third parties.

The paper provides an overview of several real-world and synthetic data corpora that are available for digital forensics education and research. Also, it highlights the potential risks and problems encountered when using data corpora, along with the capabilities of existing tools that allow the generation of synthetic data corpora when real-world data is not available. Additionally, the paper describes a custom framework for synthetic data generation and evaluates the performance of the framework.

2. Available Data Corpora
Several data corpora have been made available for public use. While some of the corpora are useful for digital forensics education and research, others are suited to very specific areas such as network forensics and forensic linguistics. This section presents an overview of the most relevant corpora.

2.1 Real Data Corpus
A few real-world data corpora are available to support digital forensics education and research. Garfinkel, et al. [7] have created the Real Data Corpus from used hard disks that were purchased from around the world. In a later work, Garfinkel [5] described the challenges and lessons learned while handling the Real Data Corpus, which by then had grown to more than 30 terabytes. As of September 2013, the Real Data Corpus incorporated 1,289 hard disk images, 643 flash memory images and 98 optical discs. However, because this corpus was partly funded by the U.S. Government, access to the corpus requires the approval of an institutional review board in accordance with U.S. legislation. Additional information about the corpus and its access requirements is available at [6].

A smaller corpus, which includes specific scenarios created for educational purposes [25], can be downloaded without any restrictions. This smaller corpus contains:
- Three test disk images created especially for educational and testing purposes (e.g., file system analysis, file carving and handling encodings).
- Four realistic disk image sets created from USB memory sticks, a digital camera and a Windows XP computer.
- A set of almost 1,000,000 files, including 109,282 JPEG files.
- Five phone images from four different cell phone models.
- Mixed data corresponding to three fictional scenarios for educational purposes, including multiple network packet dumps and disk images.

Due to the variety of data it contains, the Real Data Corpus is a valuable resource for educators and researchers in the areas of multimedia forensics, mobile phone forensics and network forensics. To our knowledge, it is the largest publicly-available corpus in the area of digital forensics.

2.2 DARPA Intrusion Detection Data Sets
In 1998 and 1999, researchers at MIT Lincoln Laboratory [12, 13] created a simulation network in order to produce network traffic and audit logs for evaluating intrusion detection systems. The simulated infrastructure was attacked using well-known techniques as well as new techniques that were specially developed for the evaluation. In 2000, additional experiments were performed involving specific scenarios, including two DDoS attacks and an attack on a Windows NT system. The data sets for all three experiments are available at [11]; they include network traffic data in tcpdump format, audit logs and file system snapshots.

The methodologies employed in the 1998 and 1999 evaluations were criticized by McHugh [16]. McHugh states that the evaluation results miss important details and that portions of the evaluation procedures are unclear or inappropriate. Additionally, Garfinkel [4] points out that the data sets do not represent real-world traffic because they lack complexity and heterogeneity. Therefore, this corpus has limited use in network forensics research.

2.3 MemCorp Corpus
The MemCorp Corpus [22] contains memory images created from several virtual and physical machines. In particular, the corpus contains images extracted from 87 computer systems running various versions of Microsoft Windows; the images were extracted using common memory imaging tools. The corpus includes the following images:
- 53 system memory images created from virtual machines.
- 23 system memory images created from physical machines with factory default configurations (i.e., with no additional software installed).
- 11 system memory images created from machines under specific scenarios (e.g., after malware was installed).

This corpus supports education and training efforts focused on memory analysis using tools such as the Volatility Framework [23]. However, as noted by the corpus creator [22], the corpus does not contain images created from real-world systems or images from operating systems other than Microsoft Windows, which reduces its applicability. The creator of the MemCorp Corpus provides access to the images upon request.

2.4 MORPH Corpus
Several corpora have been created in the area of face recognition [8]. Since a large corpus with facial images tagged with age information would be very useful for multimedia forensics, we have picked a sample corpus that could be a valuable resource for research (e.g., for detecting illegal multimedia content such as child pornography). The MORPH Corpus [20] comprises 55,000 unique facial images of more than 13,000 individuals. The ages of the individuals range from 16 to 77 with a median age of 33. Four images on average were taken of each individual with an average time of 164 days between each image.

Facial images annotated with age information are useful for developing automated age detection systems. Currently, no reliable methods (i.e., with low error rates) exist for age identification. Steinebach, et al. [21] have employed face recognition techniques to identify known illegal multimedia content, but they did not consider age classification.

2.5 Enron Corpus
The Enron Corpus introduced in 2004 is a well-known corpus in the area of forensic linguistics [9]. In its raw form, the corpus contains 619,446 email messages from 158 executives of Enron Corporation; the email messages were seized during the investigation of the 2001 Enron scandal. After data cleansing, the corpus contains 200,399 messages. The Enron Corpus is one of the most referenced mass collections of real-world email data that is publicly available.

The corpus provides a valuable basis for research on email classification, an important area in forensic linguistics. Klimt and Yang [10] suggest using thread membership detection for email classification and provide the results of baseline experiments conducted with the Enron Corpus. Data sets from the Enron Corpus are available at [3].

2.6 Global Intelligence Files
In February 2012, WikiLeaks started publishing the Global Intelligence Files, a large corpus of email messages gathered from the intelligence company Stratfor. WikiLeaks claims to possess more than 5,000,000 email messages dated between July 2004 and December 2011. As of September 2013, almost 3,000,000 of these messages had been made available for download by the public [24]. WikiLeaks continues to release new email messages from the corpus on an almost daily basis.

Like the Enron Corpus, the Global Intelligence Files would provide a valuable basis for research in forensic linguistics. However, we are not aware of any significant research conducted using the Global Intelligence Files.

2.7 Computer Forensic Reference Data Sets
The Computer Forensic Reference Data Sets maintained by NIST [19] is a small data corpus created for training and testing purposes. The data sets include test cases for file carving, system memory analysis and string search using different encodings. The corpus contains the following data:
- One hacking case scenario.
- Two images for Unicode string searches.
- Four images for file system analysis.
- One image for mobile device analysis.
- One image for system memory analysis.
- Two images for verifying the results of forensic imaging tools.

This corpus provides a small but valuable reference set for tool developers. It is also suitable for training in forensic analysis methods.

3. Pitfalls of Data Corpora
Forensic corpora are very useful for education and research, but they have certain pitfalls.

- Solution Specificity: While a corpus is very valuable when developing methodologies and tools that solve research problems in digital forensics, it is difficult to find general solutions that are not somehow tailored to the corpus. Even when a solution is intended to work in general (with different corpora and in the real world), research and development efforts often slowly adapt the solution to the corpus over time, probably without the researchers even noticing. For example, the Enron Corpus is widely used by the forensic linguistics community as a single basis for research on email classification. It would be difficult to show that the research results based on this corpus apply to general email classification problems.
  This could also become an issue if, for instance, a general methodology or tool that solves a specific problem already exists, and another research group is working to enhance the solution. Using only one corpus during development increases the risk of crafting a solution that may be more effective and efficient than previous solutions, but only when used with that specific corpus.
- Legal Issues: The data in corpora such as Garfinkel's Real Data Corpus, created from used hard disks bought from the secondary market, may be subject to intellectual property and personal privacy laws. Even if the country that hosts the real-world corpus allows its use for research, legal restrictions could be imposed by a second country in which the research that uses the corpus is being conducted. The worst case is when local laws completely prohibit the use of the corpus.
- Relevance: Data corpora are often created as snapshots of specific scenarios or environments. The data contained in corpora often loses its relevance as it ages. For example, network traffic from the 1990s is quite different from current network traffic, a fact that was pointed out for the DARPA Intrusion Detection Data Sets [4, 16]. Another example is a data corpus containing data extracted from mobile phones. Such a corpus must be updated very frequently with data from the latest devices if it is to be useful for mobile phone forensics.
Figure 1. Generating synthetic data based on a real-world scenario (diagram labels: Purpose, Scenario, Model, Simulation, Synthetic Data).
- Transferability: Many data corpora are created or taken from specific local environments. The email messages in the Enron Corpus are in English. While this corpus is valuable to forensic linguists in English-speaking countries, its value to researchers focused on other languages is debatable. Indeed, many important properties that are relevant to English and used for email classification may not be applicable to Arabic or Mandarin Chinese.
  Likewise, corpora developed for testing forensic tools that analyze specific applications (e.g., instant messaging software and chat clients) may not be useful in other countries because of differences in jargon and communication patterns. Also, a corpus that mostly includes Facebook posts and IRC logs may not be of much value in a country where these services are not popular.

4. Synthetic Data Corpus Generation
Aside from methodologies for creating synthetic data corpora by manually reproducing real-world actions, little research has been done related to tool-supported synthetic data corpus generation. Moch and Freiling [17] have developed Forensig2, a tool that generates synthetic disk images using virtual machines. While the process for generating disk images has to be programmed in advance, the tool allows randomness to be introduced in order to create similar, but not identical, disk images. In a more recent work, Moch and Freiling [18] present the results of an evaluation of Forensig2 applied to student education scenarios.

A methodology for generating a synthetic data corpus for forensic accounting is proposed in [14] and evaluated in [15]. The authors demonstrate how to generate synthetic data containing fraudulent activities from smaller collections of real-world data. The data is then used for training and testing a fraud detection system.

5. Corpus Generation Process
This section describes the process for generating a synthetic data corpus using the model-based framework presented in [27].

Figure 1 presents the synthetic data generation process. The first step in generating a synthetic data corpus is to define the data use cases. For example, in a digital forensics class, where students will be tested on their knowledge about hard disk analysis, one or more suitable disk images would be required for each student. The students would have to search the disk images for traces of malware or recover multimedia data fragments using tools such as Foremost [1] and Sleuth Kit [2].

The disk images could be created in a reasonable amount of time manually or via scripting. However, if every student should receive different disk images for analysis, then significant effort may have to be expended to insert variations in the images. Also, if different tasks are assigned to different students (e.g., one student should recover JPEG files and another student should search for traces of a rootkit), more significant variations would have to be incorporated in the disk images.

The second step in the corpus generation process is to specify a real-world scenario in which the required kind of data is typically created. One example is a computer that is used by multiple individuals, who typically install and remove software, and download, copy, delete and overwrite files.

The third step is to create a model to match this scenario and serve as the basis of a simulation, which is the last step. A Markov chain consisting of states and state transitions can be created to model user behavior. The states correspond to the actions performed by the users and the transitions specify the actions that can be performed after the preceding actions.

5.1 Scenario Modeling using Markov Chains
Finite discrete-time Markov chains as described in [26] are used for synthetic data generation. One Markov chain is created for each type of subject whose actions are to be simulated. A subject corresponds to a user who performs actions on a hard disk such as software installations and file deletions. The states in the Markov chain correspond to the actions performed by the subject in the scenario.

In order to construct a suitable model, it is necessary to first define all the actions (states) that cause data to be created and deleted. The transitions between actions are then defined. Following this, the probability of each action is specified (state probability) along with the probability of each transition between two actions (transition probability); the probabilities are used during the Markov chain simulation to generate realistic data. The computation of feasible transition probabilities given state probabilities can involve some effort, but the process has been simplified in [28].

Next, the number of subjects who perform the actions is specified (e.g., the number of individuals who share the computer). Finally, the details of each possible action are specified (e.g., what exactly happens during a download file action or a delete file action).

5.2 Model-Based Simulation
Having constructed a model of the desired real-world scenario, it is necessary to conduct a simulation based on the model. The number of actions to be performed by each user is specified and the simulation is then started. At the end of the simulation, the disk image contains synthetic data corresponding to the modeled real-world scenario.
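
The simulation loop can be illustrated with a minimal sketch. It is not the framework's implementation (which is written in Java); the function name `simulate_chain`, the two actions and their transition probabilities are hypothetical and chosen only to show how a fixed number of actions is drawn from a discrete-time Markov chain.

```python
import random
from collections import Counter


def simulate_chain(states, transition, start, steps, rng):
    """Walk a discrete-time Markov chain and return the visited actions."""
    current = start
    visited = []
    for _ in range(steps):
        visited.append(current)
        # Draw the next action using the row that belongs to the current action.
        weights = [transition[current][nxt] for nxt in states]
        current = rng.choices(states, weights=weights, k=1)[0]
    return visited


# Hypothetical two-action model (an assumption, not taken from the paper).
STATES = ["add_file", "delete_file"]
TRANSITION = {
    "add_file": {"add_file": 0.7, "delete_file": 0.3},
    "delete_file": {"add_file": 0.9, "delete_file": 0.1},
}

actions = simulate_chain(STATES, TRANSITION, "add_file", 1000, random.Random(42))
counts = Counter(actions)  # how often each action was performed
```

In a real run, each drawn action would be executed against the disk image instead of only being recorded in a list.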

5.3 Sample Scenario and Model
To demonstrate the synthetic data generation process, we consider a sample scenario. The purpose of generating the synthetic data is to test how different file carvers deal with fragmented data. The real-world scenario involves an individual who uses a USB memory stick to transfer large numbers of files, mainly photographs, between computers.

In the following, we define all the components in a model that would facilitate the creation of a synthetic disk image of a USB memory stick containing a large number of files, deleted files and file fragments. The resulting disk image would be used to test the ability of file carvers to reconstruct fragmented data.
- States: In the sample model, the following four actions are defined as Markov chain states:
  1. Add Document File: This action adds a document file (e.g., PDF or DOC) to the file system of the synthetic disk image. It is equivalent to copying a file from one hard disk to another using the Linux cp command.
  2. Add Image File: This action adds an image file (e.g., JPEG, PNG or GIF) to the file system. Again, it is equivalent to using the Linux cp command.
  3. Write Fragmented Data: This action takes a random image file, cuts it into multiple fragments and writes the fragments to the disk image, ignoring the file system. It is equivalent to using the Linux dd command for each file fragment.
  4. Delete File: This action removes a random file from the file system. It is equivalent to using the Linux rm command.
Figure 2. Markov chain used to generate a synthetic disk image (nodes labeled 1 to 4).
- Transitions: Next, the transitions between the actions are defined. Since the transitions are not really important in the scenario, the Markov chain is simply constructed as a complete digraph (Figure 2). The state numbers in the Markov chain correspond to the state numbers specified above.
- State Probabilities: Next, the probability $\pi_i$ of each action (state) $i$ to be performed during a Markov chain simulation is specified. We chose the following probabilities for the actions to ensure that a large number of files and file fragments are added to the synthetic disk image and only a maximum of about half of the added files are deleted:

  $$\pi = (\pi_1, \ldots, \pi_4) = (0.2, 0.2, 0.4, 0.2).$$

- State Transition Probabilities: Finally, the feasible probabilities for the transitions between the actions are computed. The framework is designed to compute the transition probabilities automatically. One possible result is the simple set of transition probabilities specified in the matrix:

  $$P = \begin{pmatrix} 0.2 & 0.2 & 0.4 & 0.2 \\ 0.2 & 0.2 & 0.4 & 0.2 \\ 0.2 & 0.2 & 0.4 & 0.2 \\ 0.2 & 0.2 & 0.4 & 0.2 \end{pmatrix}$$

  where $p_{ij}$ denotes the probability of a transition from action $i$ to action $j$.
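
A quick consistency check of these values can be sketched as follows. It assumes that the state probabilities are meant as the stationary distribution of the chain, i.e., that a feasible transition matrix satisfies πP = π; the text does not spell this out, so this reading is an assumption. The helper names are hypothetical and are not part of the framework.

```python
# State probabilities and transition matrix as given in the sample model.
PI = [0.2, 0.2, 0.4, 0.2]
P = [PI[:] for _ in range(4)]  # every row of P equals PI


def is_row_stochastic(matrix, tol=1e-9):
    """Each row must be non-negative and sum to 1."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) <= tol for row in matrix)


def left_multiply(vector, matrix):
    """Compute the row vector `vector * matrix`."""
    columns = len(matrix[0])
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(columns)
    ]


def is_stationary(vector, matrix, tol=1e-9):
    """Check whether `vector * matrix` reproduces `vector`."""
    product = left_multiply(vector, matrix)
    return all(abs(a - b) <= tol for a, b in zip(vector, product))


feasible = is_row_stochastic(P) and is_stationary(PI, P)
```

Because every row of P equals π, the next action is drawn independently of the current one, which fits the remark that the transitions are not important in this scenario.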

6. Corpora Generation Framework
The framework developed for generating synthetic disk images is implemented in Java 1.7. It uses a modular design with a small set of core components, a graphical user interface (GUI) and modules that provide specific functionality. The GUI provides a model building interface that allows a model to be created quickly for a specific scenario using the actions available in the framework. Additionally, an image viewer is implemented to provide detailed views of the generated synthetic disk images.

New actions in the framework can be added by implementing a small number of interfaces that require minimal programming effort. Since the framework supports the specification and execution of an abstract synthetic data generation process, new actions can be implemented independently of the scenario for which a synthetic disk image is being created. For example, it is possible to work on a completely different scenario where financial data is to be created in an enterprise relationship management system. The corresponding actions that are relevant to creating the financial data can be implemented in a straightforward manner.

The screenshot in Figure 3 shows the model builder component of the framework. The Markov chain used for generating data corresponding to the sample scenario is shown in the center of the figure (green box).

Figure 3. Screenshot of the model builder.

7. Framework Evaluation
This section evaluates the performance of the framework. The sample model described above is executed to simulate a computer user who performs write and delete actions on a USB memory stick. The evaluation setup is as follows:
- Model: Described in Section 5.3.
- Discrete Simulation Steps: 4,000 actions.
- Synthetic Disk Image Size: 2,048 MiB (USB memory stick).
- File System: FAT32 with 4,096-byte cluster size.
- Add Document File Action: A document file (e.g., DOC, PDF or TXT) is randomly copied from a local file source containing 139 document files.
- Add Image File Action: An image file (e.g., PNG, JPEG or GIF) is randomly copied from a local file source containing 752 image files.
- Delete File Action: A file is randomly chosen and deleted from the file system of the synthetic disk image without overwriting.
- Write Fragmented Data Action: An image file is randomly chosen from the local file source containing 752 image files. The file is written to the file system of the synthetic disk image using a random number of fragments between 2 and 20, a random fragment size corresponding to a multiple of the file system cluster size and randomly-selected cluster-aligned locations for fragment insertion.
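
The Write Fragmented Data action can be sketched roughly as follows. This is only an illustration under stated assumptions, not the framework's Java implementation: the names `plan_fragments` and `write_fragments` are hypothetical, the disk image is modeled as an in-memory `bytearray`, and no attempt is made to keep fragments from overlapping each other or existing data.

```python
import random

CLUSTER_SIZE = 4096  # bytes, as in the evaluation setup


def plan_fragments(file_size, image_size, rng, cluster_size=CLUSTER_SIZE):
    """Return (offset, length) pairs describing cluster-aligned fragments."""
    if file_size <= 0:
        raise ValueError("file must not be empty")
    clusters_needed = -(-file_size // cluster_size)  # ceiling division
    total_clusters = image_size // cluster_size
    if clusters_needed > total_clusters:
        raise ValueError("file does not fit into the image")
    # Between 2 and 20 fragments, but never more fragments than clusters.
    count = min(rng.randint(2, 20), clusters_needed)
    # Random cut points split the file's clusters into `count` non-empty runs.
    cuts = sorted(rng.sample(range(1, clusters_needed), count - 1))
    bounds = [0] + cuts + [clusters_needed]
    plan = []
    for start, end in zip(bounds, bounds[1:]):
        run = end - start
        # Random cluster-aligned target; overlaps are not prevented in this sketch.
        target = rng.randrange(0, total_clusters - run + 1)
        plan.append((target * cluster_size, run * cluster_size))
    return plan


def write_fragments(image, data, plan):
    """Copy consecutive slices of `data` into `image`, like `dd` with a seek offset."""
    position = 0
    for offset, length in plan:
        chunk = data[position:position + length]  # last chunk may be shorter
        image[offset:offset + len(chunk)] = chunk
        position += length


# Usage with a small 4 MiB image and a dummy 100,000-byte "file".
image = bytearray(1024 * CLUSTER_SIZE)
plan = plan_fragments(100_000, len(image), random.Random(7))
write_fragments(image, b"\xff" * 100_000, plan)
```

Recording the returned plan alongside the image is what would provide the ground truth (fragment offsets and sizes) against which a file carver's output can be compared.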

Twenty simulations of the model were executed using the setup. After each run, the time needed to completely generate the synthetic disk image was assessed, along with the amount of disk space used, number of files deleted, number of files still available in the file system and number of different file fragments written to the image.

Figure 4(a) shows the time required by the framework to run each simulation. On the average, a simulation run was completed in 2 minutes and 21 seconds. Figure 4(b) presents an overview of the numbers of files that were allocated in and deleted from the synthetic disk images. Note that the allocated (created) files are shown in light gray while the deleted files are shown in dark gray; the average value is shown as a gray line. On the average, a disk image contained 792 allocated files and 803 deleted files, which is expected given the probabilities chosen for the actions in the model.
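
The expected values behind this observation can be worked out from the state probabilities and the number of simulation steps. The short calculation below is a sketch under two simplifying assumptions that are not stated in the paper: every delete action finds a file to delete, and only the two add actions create allocated files.

```python
STEPS = 4000  # discrete simulation steps from the evaluation setup
PI = {
    "add_document": 0.2,
    "add_image": 0.2,
    "write_fragmented": 0.4,
    "delete": 0.2,
}

# Expected number of times each action is performed in one run.
expected = {action: STEPS * p for action, p in PI.items()}

expected_added = expected["add_document"] + expected["add_image"]  # about 1,600
expected_deleted = expected["delete"]                              # about 800
expected_allocated = expected_added - expected_deleted             # about 800
```

Under these assumptions, roughly 800 files remain allocated and roughly 800 are deleted per run, which is close to the reported averages of 792 and 803.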

Figure 4. Evaluation results for 20 simulation runs. (a) Time required for each simulation run; bar labels for runs 1 to 20, in seconds: 128, 137, 131, 135, 154, 165, 143, 151, 134, 131, 136, 135, 156, 123, 150, 147, 161, 118, 142, 151. (b) Numbers of allocated files and deleted files per simulation run.

Figure 5(a) shows the used disk space in the synthetic image corresponding to allocated files (light gray), deleted files (gray) and file fragments (dark gray). The used space differs considerably over the simulation runs because only the numbers of files to be written and deleted from the disk image were defined (individual file sizes were not specified). Since the files were chosen randomly during the simulation runs, the file sizes and, therefore, the disk space usage differ. On the average, 57% of the available disk space was used.

Figure 5(b) shows the average number of file fragments per file type over all 20 simulation runs. The writing of fragmented data used a dedicated file source containing only pictures; this explains the large numbers of JPEG and PNG fragments.

Figure 6 shows a screenshot of the image viewer provided by the framework. Information such as the data type, fragment size and file system status (allocated or deleted) is provided for each block.

Figure 5. Evaluation results for 20 simulation runs. (a) Used disk space corresponding to allocated files, deleted files and file fragments. (b) Average number of fragments per file type (file types: bmp, eps, gif, jpg, mov, mp4, pdf, png, svg, tif, zip).

8. Conclusions
The framework presented in this paper is well-suited to scenario-based model building and synthetic data generation. In particular, it provides a flexible and efficient approach for generating synthetic data corpora. The experimental evaluation of creating a synthetic disk image for testing the fragment recovery performance of file carvers demonstrates the utility of the framework.

Unlike real-world corpora, synthetic corpora provide ground truth data that is very important in digital forensics education and research. This enables students as well as developers and testers to acquire a detailed understanding of the capabilities and performance of digital forensic tools. The ability of the framework to generate synthetic corpora based on realistic scenarios can satisfy the need for test data in applications for which suitable real-world data corpora are not available. Moreover, the framework is generic enough to produce synthetic corpora for a variety of domains, including forensic accounting and network forensics.

Figure 6. Screenshot of the image viewer.

Acknowledgement
This research was supported by the Center for Advanced Security Research Darmstadt (CASED).

References
[1] Air Force Office of Special Investigations, Foremost (foremost.sourceforge.net), 2001.
[2] B. Carrier, The Sleuth Kit (www.sleuthkit.org/sleuthkit), 2013.
[3] W. Cohen, Enron Email Dataset, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania (www.cs.cmu.edu/~enron), 2009.
[4] S. Garfinkel, Forensic corpora, a challenge for forensic research, unpublished manuscript, 2007.
[5] S. Garfinkel, Lessons learned writing digital forensics tools and managing a 30 TB digital evidence corpus, Digital Investigation, vol. 9(S), pp. S80–S89, 2012.
[6] S. Garfinkel, Digital Corpora (digitalcorpora.org), 2013.
[7] S. Garfinkel, P. Farrell, V. Roussev and G. Dinolt, Bringing science to digital forensics with standardized forensic corpora, Digital Investigation, vol. 6(S), pp. S2–S11, 2009.
[8] M. Grgic and K. Delac, Face Recognition Homepage, Zagreb, Croatia (www.face-rec.org/databases), 2013.
[9] B. Klimt and Y. Yang, Introducing the Enron Corpus, presented at the First Conference on Email and Anti-Spam, 2004.
[10] B. Klimt and Y. Yang, The Enron Corpus: A new dataset for email classification research, Proceedings of the Fifteenth European Conference on Machine Learning, pp. 217–226, 2004.
[11] Lincoln Laboratory, Massachusetts Institute of Technology, DARPA Intrusion Detection Data Sets, Lexington, Massachusetts (www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data), 2013.
[12] R. Lippmann, D. Fried, I. Graf, J. Haines, K. Kendall, D. McClung, D. Weber, S. Webster, D. Wyschogrod, R. Cunningham and M. Zissman, Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation, Proceedings of the DARPA Information Survivability Conference and Exposition, vol. 2, pp. 12–26, 2000.
[13] R. Lippmann, J. Haines, D. Fried, J. Korba and K. Das, The 1999 DARPA off-line intrusion detection evaluation, Computer Networks, vol. 34(4), pp. 579–595, 2000.
[14] E. Lundin, H. Kvarnstrom and E. Jonsson, A synthetic fraud data generation methodology, Proceedings of the Fourth International Conference on Information and Communications Security, pp. 265–277, 2002.
[15] E. Lundin Barse, H. Kvarnstrom and E. Jonsson, Synthesizing test data for fraud detection systems, Proceedings of the Nineteenth Annual Computer Security Applications Conference, pp. 384–394, 2003.
[16] J. McHugh, Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory, ACM Transactions on Information and System Security, vol. 3(4), pp. 262–294, 2000.
[17] C. Moch and F. Freiling, The Forensic Image Generator Generator (Forensig2), Proceedings of the Fifth International Conference on IT Security Incident Management and IT Forensics, pp. 78–93, 2009.
[18] C. Moch and F. Freiling, Evaluating the Forensic Image Generator Generator, Proceedings of the Third International Conference on Digital Forensics and Cyber Crime, pp. 238–252, 2011.
[19] National Institute of Standards and Technology, The CFReDS Project, Gaithersburg, Maryland (www.cfreds.nist.gov), 2013.
[20] K. Ricanek and T. Tesafaye, Morph: A longitudinal image database of normal adult age-progression, Proceedings of the Seventh International Conference on Automatic Face and Gesture Recognition, pp. 341–345, 2006.
[21] M. Steinebach, H. Liu and Y. Yannikos, FaceHash: Face detection and robust hashing, presented at the Fifth International Conference on Digital Forensics and Cyber Crime, 2013.
[22] T. Vidas, MemCorp: An open data corpus for memory analysis, Proceedings of the Forty-Fourth Hawaii International Conference on System Sciences, 2011.
[23] Volatility, The Volatility Framework (code.google.com/p/volatility), 2014.
[24] WikiLeaks, The Global Intelligence Files (wikileaks.org/the-gifiles.html), 2013.
[25] K. Woods, C. Lee, S. Garfinkel, D. Dittrich, A. Russell and K. Kearton, Creating realistic corpora for security and forensic education, Proceedings of the ADFSL Conference on Digital Forensics, Security and Law, 2011.
[26] Y. Yannikos, F. Franke, C. Winter and M. Schneider, 3LSPG: Forensic tool evaluation by three layer stochastic process-based generation of data, Proceedings of the Fourth International Conference on Computational Forensics, pp. 200–211, 2010.
[27] Y. Yannikos and C. Winter, Model-based generation of synthetic disk images for digital forensic tool testing, Proceedings of the Eighth International Conference on Availability, Reliability and Security, pp. 498–505, 2013.
[28] Y. Yannikos, C. Winter and M. Schneider, Synthetic data creation for forensic tool testing: Improving performance of the 3LSPG Framework, Proceedings of the Seventh International Conference on Availability, Reliability and Security, pp. 613–619, 2012.
