legraph

graphsearch  时间:2021-02-11  阅读:()
StructuredDuplicateDetectioninExternal-MemoryGraphSearchRongZhouandEricA.
HansenDepartmentofComputerScienceandEngineeringMississippiStateUniversityMississippiState,MS39762{rzhou,hansen}@cse.
msstate.
eduAbstractWeconsiderhowtouseexternalmemory,suchasdiskstor-age,toimprovethescalabilityofheuristicsearchinstate-spacegraphs.
TolimitthenumberofslowdiskI/Ooper-ations,wedevelopanewapproachtoduplicatedetectioningraphsearchthatlocalizesmemoryreferencesbypartitioningthesearchgraphbasedonanabstractionofthestatespace,andexpandingthefrontiernodesofthegraphinanorderthatrespectsthispartition.
Wedemonstratetheeffectivenessofthisapproachbothanalyticallyandempirically.
IntroductionHeuristicsearchinstate-spacegraphsisawidely-usedframeworkforproblemsolvinginAI,butitsscalabilityislimitedbymemoryrequirements.
Thishasledtoexten-siveresearchonhowtouseavailablememoryefcientlyinsearchinglargestate-spacegraphs.
Theconceptof"availablememory"isambiguous,sincememoryinmostcomputersystemshasahierarchicalstruc-tureinwhichfast,random-accessinternalmemoryiscom-plementedbylow-speedexternalmemory,suchasdiskstor-age.
Althoughexternalmemoryisvastlylarger(andmuchcheaper)thaninternalmemory,mostheuristicsearchal-gorithmsaredesignedtouseinternalmemoryonly,andmostresearchonmemory-efcientheuristicsearchassumesamemorymodelinwhichaccesstoallstoreddataitemsisequallyfast.
Algorithmsthatassumethismemorymodelperformextremelypoorlyifexternalmemoryisused,sincerandomaccessofexternalmemoryrequirestime-consumingI/Ooperationsthatareseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory.
Somerecentworkbeginstoaddressthisissue.
EdelkampandSchr¨odl(2000)consideruseofvirtualmemoryinA*graphsearch,andproposetechniquesthatchangethebest-rstorderofnodeexpansionsinawaythatimprovesrefer-encelocalityandreducesthenumberofpagefaultsinvirtualmemory.
AmoredirectapproachtolimitingslowdiskI/Oistodesignasearchalgorithmthatexplicitlymanagesaccesstoexternalmemory,sincethisapproachcanexploitknowl-edgeofhowthegraphisstructured,andisnotlimitedbythesizeofvirtualmemory,whichisusuallymuchsmallerthanexternalmemory.
InthetheoreticalcomputerscienceCopyrightc2004,AmericanAssociationforArticialIntelli-gence(www.
aaai.
org).
Allrightsreserved.
community,therehasbeenextensiveresearchonexternal-memoryalgorithms,includingdevelopmentofexternal-memorybreadth-rstsearchalgorithmsforexplicitlyrep-resentedgraphs(Munagala&Ranade1999;Mehlhorn&Meyer2002).
IntheAIcommunity,Korf(2003a;2003b)recentlydescribedanexternal-memorybreadth-rstsearchalgorithmforimplicitlyrepresentedgraphs.
Akeyissueindesigninganefcientexternal-memorygraphsearchalgorithmisduplicatedetection.
Ingraphsearch,thesamenodecanbereachedalongdifferentpaths,andpreventingredundantsearcheffortrequiresstoringalready-visitednodes,sothatthesearchalgorithmcanrec-ognizewhenitencountersanodethathasbeenalreadyexplored.
Becauseduplicatedetectionpotentiallyrequirescomparingeachnewlygeneratednodetoallstorednodes,itcanleadtocripplingdiskI/Oifalready-explorednodesarestoredondisk.
Allpreviousexternal-memorybreadth-rstsearchalgorithmslimitdiskI/Obyusingatechniquecalleddelayedduplicatedetectioninwhichanentiresetofnodesisexpandedbeforeperforminganyduplicatedetection.
Inthispaper,weintroduceanewapproach.
Insteadofdelayingduplicatedetection,welimitdiskI/Obylocalizingmem-oryreferencesbasedonthestructureofthesearchgraphasrevealedbyahigh-levelabstractionofthestatespace.
Com-putationalresultsonchallenginggraph-searchproblemsin-dicatethatthisapproach,whichwecallstructuredduplicatedetection,signicantlylimitsdiskI/O.
Forgraphswithsuf-cientlocalstructure,itallowsgraph-searchalgorithmstouseexternalmemoryalmostasefcientlyasinternalmemory.
BackgroundDevelopmentofexternal-memoryalgorithmsforvariouscomputationalproblemsisanactiveareaofresearch.
Be-causediskI/Oisseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory,theconceptofI/Ocom-plexityhasbeenintroducedtoanalyzetheperformanceofexternal-memoryalgorithms(Aggarwal&Vitter1988).
Itassumesatwo-levelmodelofmemoryinwhichinternalmemoryofsizeMissupplementedbyexternalmemory,andanI/OoperationmovesdatainblocksofsizeBbetweenin-ternalmemoryandexternalmemory,where1searchhasbeenconsid-eredbyseveralresearchers.
Inthetheoreticalcomputersci-SEARCH683encecommunity,researchershavefocusedonestablishingboundsonitsI/Ocomplexity.
Agraphisassumedtobeexplicitlyrepresentedusingadjacencylistsandstoredinex-ternalmemoryduetoitslargesize.
Insearchingexplicitgraphs,worst-caseI/Ocomplexitydependsonthemethodofgeneratingsuccessornodesaswellasthemethodofdu-plicatedetection,sinceadjacencylistsmustbereadintointernalmemorytodeterminethesuccessorsofanode.
MunagalaandRanade(1999)describeanexternal-memorybreadth-rstsearchalgorithmsthatadoptsanapproachtosuccessorgenerationthathaslinearcomplexityinthenum-berofedgesinthegraph.
MehlhornandMeyer(2002)achievesublinearcomplexityforsuccessorgenerationbyamoresophisticatedapproachthatinvolvespartitioningtheadjacencylistbasedonadecompositionofthegraphintoconnectedsubgraphs.
Bothalgorithmsusethesamemethodofduplicatedetection,calleddelayedduplicatedetection,inwhichasetofnodesisexpandedwithoutperformingdupli-catedetection,andthemulti-setofgeneratedsuccessorsissortedandcheckedforduplicatesinasingle,moreefcientoperation.
Thealgorithmsalternatebetweenthesetwosteps.
1.
successorgeneration,inwhichthealgorithmgeneratessuccessorsforasetofnodesonthesearchfrontierandappendsthesesuccessorstoale(orles)intheorderinwhichtheyaregenerated,withoutperforminganydupli-catedetection,and2.
delayedduplicatedetection,inwhichthele(s)ofsucces-sornodesaresorted(usuallybyanexternal-memorysortalgorithm)basedontheirindicesorstateencodings,fol-lowedbyascanandcompactionstageinwhichduplicatenodesinthesortedle(s)areeliminated.
IntheAIcommunity,Korf(2003a;2003b)recentlyde-scribedanexternal-memorybreadth-rstsearchalgorithmthatusesthesamemethodofdelayedduplicatedetection.
Hisalgorithmdiffersintwoways.
First,heconsiderssearchinimplicitgraphs,inkeepingwiththestandardAIapproachtostate-spacesearch.
Animplicitgraphisacompactrep-resentationofagraphintheformofastartnode,anodeexpansionfunctionthatgeneratestheimmediatesuccessorsofanode,andapredicatethattestswhetheranodeisagoalnode.
Insearchingimplicitgraphs,successorgener-ationdoesnotrequirediskI/O,andduplicatedetectionistheonlysourceofI/Ocomplexity.
AseconddifferenceisthatKorfbuildshisexternal-memorybreadth-rstsearchalgorithmontopoffrontiersearch,amemory-efcientsearchalgorithmthatdoesnotneedtostoreaClosedlist(Korf&Zhang2000).
StructuredduplicatedetectionWeproposeanalternativeapproachtoduplicatedetectioninexternal-memorygraphsearchthathassomesignicantadvantagesoverdelayedduplicatedetection.
Itcanalsobecombinedwithdelayedduplicatedetectioninsomecases.
Thecentralideaofourapproachistoexploitthestruc-tureofastate-spacegraphinordertolocalizememoryref-erencesduringduplicatedetection,andsowecalltheap-proachstructuredduplicatedetection.
Asanexample,con-siderthe(n21)sliding-tilepuzzle.
Whenanewnodeisgenerated,itcanonlydifferfromitsparentnodewithre-specttothepositionofthe"blank"byeitheroneroworonecolumn.
Therefore,incheckingforduplicates,itisnotnec-essarytocheckstorednodesforwhichthepositionofthe"blank"differsfromthatoftheparentnodebymorethanoneroworonecolumn.
Ifwepartitionthesetofstorednodesinthisway,wecansignicantlylimitthenumberofstorednodesthatneedtobecheckedtoguaranteethatallduplicatesarefound.
Wecontinuetousethismotivatingex-ampleinthefollowing,tohelpexplaintheidea.
Projectionfunctionandabstractstate-spacegraphOurapproachisbasedonusingastate-spaceprojectionfunctiontodecomposeastate-spacegraph,andcreateanab-stractstate-spacegraphthatrevealsthelocalstructureoftheoriginalgraphatahigh-level.
Stateabstractioninheuris-ticsearchiswellstudied,andisusedtocreateadmissibleheuristicsandtoorganizehierarchicalsearch(Holteetal.
1996).
Weuseanapproachtostateabstractionthatisverysimilartopreviousapproaches,buthasadifferentpurpose:localizingmemoryreferencesinduplicatedetection.
Astate-spaceprojectionfunctionisamany-to-onemap-pingfromtheoriginalstatespacetoanabstractstatespace,inwhicheachabstractstatecorrespondstoasetofstatesintheoriginalstatespace.
Ifastatexismappedtoanabstractstatey,thenyiscalledtheimageofx,andxiscalledthepre-imageofy.
Therearemanywaystodeneastate-spaceprojectionfunction.
Acommonapproachistoignoresomestatevariablesintheencodingoftheproblem.
Forexam-ple,asimplestate-spaceprojectionfunctionforthe(n21)sliding-tilepuzzlecanbedenedbyignoringthepositionsofalltilesandconsideringonlythepositionofthe"blank.
"Inthiscase,anabstractstatecorrespondstoallstateswiththesamepositionofthe"blank,"andtherearen2abstractstatescomparedton2!
/2statesintheoriginalstatespace.
State-spaceprojectionfunctionscanbedenedinasimilarwayforothersearchproblems(Klein&Manning2003).
Givenastate-spacegraphandstate-spaceprojectionfunc-tion,anabstractstate-spacegraphisconstructedasfollows.
1.
Thesetofnodes,calledabstractnodes,intheabstractstate-spacegraphcorrespondstothesetofabstractstates.
2.
Anabstractnodeyisasuccessorofanabstractnodeyiffthereexisttwostatesxandx,suchthata.
xisasuccessorofx,andb.
xandxarepreimagesofyandy,respectively.
Ify=y,itmeansthattheabstractnodeyhasaselfloop.
Forexample,Figure1(a)showstheninepossiblepositionsofthe"blank"intheEight-puzzle.
Figure1(b)showstheabstractstate-spacegraphcreatedbythesimplestate-spaceprojectionfunctionthatmapsastateintoanabstractstatebasedonlyonthepositionofthe"blank.
"EachabstractnodeBiinFigure1(b)correspondstothesetofstateswiththe"blank"locatedatpositioniinFigure1(a).
Duplicate-detectionscopeAlthoughtransformingastate-spacegraphintoanabstractstate-spacegraphcanresultinexponentialreductioninthe684SEARCHFigure1:Panel(a)showsallpossiblepositionsofthe"blank"fortheEight-puzzle.
Panel(b)showsanexampleofanabstractstate-spacegraphfortheEight-puzzle.
Thisdenitionofthestate-spaceprojectionfunctionisbasedonthepositionofthe"blank"only.
sizeofthegraph,thelocalstructureoftheoriginalstate-spacegraphcanbelargelypreserved.
Forexample,abstractnodeB0inFigure1(b)hasonlytwosuccessors;abstractnodesB1andB3.
Thiscapturesthefactthatasinglemoveofatilecanonlychangethepositionofthe"blank"byeitheronecolumn(B0→B1inthiscase)oronerow(B0→B3inthiscase).
Denition1Anabstractstate-spacegraphhaslocalstruc-tureifithasboundedout-degree.
Forthe(n21)sliding-tilepuzzleusingthissimplestate-spaceprojectionfunction,theout-degreeoftheabstractstate-spacegraphisboundedby4,foreveryn.
Thelocalstructureofanabstractstate-spacegraphcanbeusedtorestrictthescopeofduplicatedetectionsothatonlyafractionofstorednodesneedstobecheckedforduplicates,whilestillguaranteeingthatallduplicatesarefound.
Forexample,whengeneratingsuccessorsfornodesthatarepre-imagesofabstractnodeB0inFigure2,thealgorithmonlyneedstocheckforduplicatesagainststorednodesthatarepre-imagesofabstractnodeB1orB3.
Letanabstractnodey=p(x)betheimageofanodexunderastate-spaceprojectionfunctionp(·)andletsuccessors(y)bethesetofsuccessorabstractnodesofyintheabstractstate-spacegraph.
Denition2Theduplicate-detectionscopeofanodexun-derastate-spaceprojectionfunctionp(·)correspondstotheunionofsetsofstorednodesthatarepre-imagesofanab-stractnodeysuchthaty∈successors(p(x)),thatis,y∈successors(p(x))p1(y)wherep1(y)isthesetofstorednodesthatarepre-imagesofy.
Theorem1Theduplicate-detectionscopeofanodecon-tainsallstoredduplicatesofthesuccessorsofthenode.
Proof:Supposethatanodexhasapreviously-generatedsuccessornodexthatisnotincludedintheduplicate-detectionscopeofnodex.
Lety=p(x)betheimageofxunderthestate-spaceprojectionfunctionp(·).
Accord-ingtoDenition2,wehavey/∈successors(p(x)).
Butaccordingtothedenitionofanabstractstate-spacegraph,ifxisasuccessorofx,thenp(x)mustbeasuccessorofFigure2:Theduplicate-detectionscopeofnodesthatarepre-imagesofabstractnodeB0includesnodesthatarepre-imagesofabstractnodeB1orB3.
Whenexpandingnodesthatarepre-imagesofabstractnodeB0,thesearchalgorithmcanuseinternalmemorytostorenodesthatarepre-imagesofabstractnodeB0,B1,orB3,anduseexternalmemorytostoretherestofthenodes.
p(x).
Inotherwords,y∈successors(p(x)).
Sincethisleadstoacontradiction,Theorem1musthold.
2Theconceptofduplicate-detectionscopecanbeveryuse-fulinexternal-memorygraphsearch,becauseitsuggeststhatasearchalgorithmuseinternalmemorytostorenodeswithintheduplicate-detectionscopeofasetofexpandingnodes,anduseexternalmemorytostoreothernodes,wheninternalmemoryisfull.
Figure2illustratesthisidea.
Weusethetermnblocktorefertoaset(or"block")ofnodesintheoriginalstatespacethatcorrespondto(i.
e.
,arepre-imagesof)anabstractnode.
Ifanabstractstate-spacegraphhaslocalstructure,thentheduplicate-detectionscopeofanynodeconsistsofaboundednumberofnblocks,andthelargestduplicatedetectionscopecanbeasmallfractionoftheoverallnumberofstorednodes.
Notethatthelargestduplicatedetectionscopeestablishesaminimuminternal-memoryrequirementforstructuredduplicatedetection.
SearchusingstructuredduplicatedetectionWenowdescribehowtointegratestructuredduplicatede-tectionintoastate-spacesearchalgorithm.
Similartodelayedduplicatedetection,structureddupli-catedetectionisusedwithasearchalgorithmthatexpandsasetofnodesatatime,suchthattheunderlyingsearchstrat-egyisconsistentwithexpandingthenodesinthissetinanyorder.
Thisisstraightforwardinbreadth-rstsearch,whereasetofnodescorrespondstoallnodesinthesamelayerofthebreadth-rstsearchgraph.
(Welaterexplainhowitcanapplytobest-rstsearch.
)Giventhissetofnodes,struc-turedduplicatedetectiondeterminesanorderofexpansionthatminimizesI/Ocomplexity,byexploitinglocalstructurerevealedbytheabstractstate-spacegraph.
Tosupportstructuredduplicatedetection,thesetofstorednodes(i.
e.
,nodesintheOpenandClosedlists)ispartitionedbasedonanabstractstate-spacegraph.
Eachnblockinthepartitionconsistsofallstorednodesthatarepre-imagesofthesameabstractnode(i.
e.
,eachnblockcorrespondstoanabstractnode).
Thislocalstructureisexploitedasfollows.
First,nodesinthesamenblockareexpandedtogether,i.
e.
,consecutively.
Thisimproveslocalityofmemoryref-erencebecauseallnodesinthesamenblockhavethesameduplicate-detectionscope.
ButitleavesundeterminedtheSEARCH685orderinwhichtoconsidernblocks.
Ingeneral,iftheduplicate-detectionscopesoftwonblocksarealmostthesame,theyshouldbeexpandedclosetoeachother,inordertominimizethenumberofpossibleI/Ooperations.
Nblockscorrespondtoabstractnodes,andabstractnodesthatareneighborsintheabstractstate-spacegraphtendtohavesim-ilarduplicate-detectionscopes.
Therefore,wechoosetoex-pandnblocksinanorderthatreectsneighborrelationsintheabstractstate-spacespacegraph.
Asimplewaytodothisistoexpandthenblocksinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph.
Wheninternalmemoryisfull,thesearchalgorithmmustremovefrominternalmemoryoneormorenblocksthatdonotbelongtotheduplicate-detectionscopeofnodescurrentlybeingexpanded.
Immediatelybeforeexpandingnodesinadifferentnblock,itmustcheckifsomenblocksintheirduplicate-detectionscopearemissingfrominternalmemory,andifso,readthemfromexternalmemory.
Be-causereadingannblockintointernalmemoryoftenresultsinwritinganothernblocktoexternalmemory,werefertothesepairsofreadandwriteoperationsasnblockreplace-ments.
Exceptfornblocksthatarepartoftheduplicate-detectionscopeofnodesbeingexpanded,anynblockcanpotentiallybemovedfrominternalmemorytoexternalmemory.
De-cidingwhichnblockstoremoveisidenticaltothepage-replacementstrategyofvirtualmemory,exceptthatnowtheunitofreplacementisannblockinsteadofapage.
Thus,wecallitannblock-replacementstrategy.
Itiswell-knownthattheoptimalreplacementstrategyalwaysremovesthepage(ornblock)thatwillnotbeusedforthelongesttime(Be-lady1966).
Implementingsuchastrategyisgenerallyim-possible,asitrequiresfutureknowledgeoftheorderinwhichpages(nblocks)willbeneeded,andleast-recently-usedisoftenthebeststrategyinpractice(Sleator&Tar-jan1985).
However,itispossibletocomputeanoptimalnblock-replacementstrategyinourcase,sincetheorderinwhichnblocksareexpandedisgivenbythebreadth-rsttraversalorderofanabstractstate-spacegraph.
Thus,givenasetofnodestoexpand,structuredduplicatedetectionexpandstheminanordersuchthat(1)nodesinthesamenblockareexpandedtogether,(2)nblocksareconsid-eredinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph,and,(3)wheninternalmemorybecomesfull,selectionofwhichnblockstowritetodiskisbasedonapre-computedoptimalnblock-replacementstrat-egy,orelse,theleast-recentlyusedstrategy.
Ititstraightforwardtousestructuredduplicatedetectionwithbreadth-rstsearch,wherethesetofnodesbeingex-pandedisalayerofthebreadth-rstsearchgraph.
Itisalsopossibletouseitwithabest-rstsearchalgorithmsuchasA*.
Insolvingasearchproblemwithmanyties,theorderinwhichA*expandsnodeswiththesamef-valueisnon-deterministic,andstructuredduplicatedetectioncandeter-mineanorderthatminimizesI/Ocomplexity.
The(n21)sliding-tilepuzzleisanexampleofadomainwithmanyties.
Forsearchproblemsinwhichtherearenotmanyties,itispossibletousestructuredduplicatedetectionaspartofaslightly-modiedversionofA*,inwhichA*selectsasetofnodestoexpandcontainingnodeswithalmostthesamef-value–forexample,thebestknodesontheOpenlist,wherekisasuitablylargenumber.
Structuredduplicatede-tectioncandeterminetheorderinwhichtoexpandthissetofnodes.
Althoughitmayexpandnodesinanorderthatisnotstrictlybest-rst,improvedperformancefromstructureddu-plicatedetectionmayoutweighanincreaseinthenumberofnodesexpanded.
Thismodicationofthebest-rstexpan-sionorderofA*toimprovelocalityofmemoryreferenceswaspreviouslyproposedbyEdelkampandSchr¨odl(2000)asawayofimprovinguseofvirtualmemory.
I/OcomplexityI/Ocomplexityistheprimaryfactorthataffectsthespeedofexternal-memoryalgorithms.
TheI/OcomplexityofsearchinanimplicitgraphdependsentirelyontheI/Ocomplexityofduplicatedetection.
Theworst-caseI/OcomplexityofdelayedduplicatedetectionisO(n+mBlogMB(n+m)),wherenisthenumberofnodesinthestate-spacegraph,misthenumberofedges,Bisthesizeofadiskblock,Misthesizeofinternalmemory,andO(xBlogMBx)istheworst-caseI/Ocomplexityofexternallysortingxelements(Munagala&Ranade1999).
Forstructuredduplicatedetection,wehavethisresult.
Theorem2Ifeveryduplicate-detectionscopetsininter-nalmemory,theworst-caseI/Ocomplexityofstructureddu-plicatedetectionisO(nB·E).
Inthisresult,Eisthenumberofdirectededgesintheab-stractstate-spacegraph.
(Tokeeptheanalysissimple,weconsideranundirectededgeastworeciprocallydirectededges.
)ThetheoremfollowsfromthefactthatEistheworst-casenumberofnblockreplacementsthatmustbedoneintraversingtheabstractstate-spacegraph,andtheworst-casenumberofI/Ooperationsneededpernblockre-placementisthesizeofthelargestnblockdividedbythediskblocksize,wherenboundsthesizeofthelargestnblock.
Thefactor1Bisthesameinbothanalyses.
Sinceanab-stractstate-spacegraphistypicallyexponentiallysmallerthantheoriginalstate-spacegraph,thefactorEforstruc-turedduplicatedetectioniscomparabletologMB(n+m)fordelayedduplicateddetection.
ThedifferenceinI/Ocom-plexityisthedifferencebetweenthefactorn+mfordelayedduplicatedetection,andthefactornforstructuredduplicatedetection.
Thepresenceofn+minthecomplexityanalysisfordelayedduplicatedetection,incontrasttonforstruc-turedduplicatedetection,canbeinterpretedasapenaltyfordelayingduplicatedetection,sinceitboundsthecardinalityofthemulti-setofsuccessorsthatisgeneratedbyexpandingasetofnodesbeforeperformingduplicatedetection.
Inahighly-connectedgraph,mismuchlargerthann,andthepenaltyfordelayingduplicatedetectioncanbese-vere.
Itfollowsthatdelayedduplicatedetectionworksbestinsparsegraphs.
However,manyimportantsearchproblemsdonothavesparsegraphs.
Anexampleismultiplesequencealignment,amotivatingproblemforfrontiersearch(Korf&Zhang2000).
Anadvantageofstructuredduplicatede-tectionisthatitworksequallywellinsparseandhighly-connectedgraphs.
686SEARCH#SolIntMemExtMemExpSecs1766564,62816,389,872279,167,41180149591,382,50420,557,691345,487,8638985364481,53312,588,843224,545,8536415655228,33412,989,362208,969,4456655957414,77513,829,299228,945,35167160661,738,02256,436,901978,819,6463,06966611,355,26420,699,891368,181,73597382621,410,29246,329,201765,608,9892,38988651,808,59177,711,2351,360,544,0934,4569257371,77912,505,737213,445,215642Table1:Performanceonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle.
Columnsshowtheinstancenumber(#);solutionlength(Sol);peaknumberofnodesstoredininternalmemory(IntMem);peaknumberofnodesstoredinexternalmemory(ExtMem),numberofnodeexpansions(Exp),andrunningtimeinCPUseconds(Secs).
Unlikedelayedduplicatedetection,structuredduplicatedetectionhasaminimuminternal-memoryrequirement.
Theorem2holdsonlyifeveryduplicatedetectionscopetsininternalmemory.
So,whetheritholdsdependspartlyonthesizeofinternalmemory,andpartlyonthelocalityofthegraph,andhowwellitiscapturedinanabstractstate-spacegraph.
Ifthelargestduplicatedetectionscopedoesnottininternalmemory,itispossibletocombinestructureddupli-catedetectionwithdelayedduplicatedetection.
Givenasetofnodestoexpand,thesetcanbepartitionedbasedontheabstractstate-spacegraph,anddelayedduplicatedetectioncanbeperformedseparatelyonthenodesineachnblock,asawayofleveraginglocalstructuretoimproveperformance.
ComputationalresultsHowefcientlytheunderlyinggraph-searchalgorithmusesinternalmemoryhasaneffectondiskI/O,sinceitaf-fectshowmuchtotalmemoryisneeded.
Forthisreason,Korf(2003a;2003b)builthisexternal-memorygraphsearchalgorithmontopoffrontiersearch,averymemory-efcientgraph-searchalgorithm.
Inourexperiments,wealsocom-binestructuredduplicatedetectionwithmemory-efcientgraph-searchalgorithms.
Adifferenceisthatwecon-siderheuristicsearch,whereaspreviousworkonexternal-memorygraphsearchconsidersbreadth-rstsearchonly.
FortheresultsinTables1and2,weranthesearchalgo-rithmsinamodethatminimizesuseofinternalmemory.
Sothepeakamountofinternalmemoryusedcorrespondstotheminimuminternal-memoryrequirementsofthealgorithms,usingstructuredduplicatedetectionandagivenabstractionofthestatespace.
Ordinarily,thealgorithmswoulduseallavailableRAMbeforeusingdisk.
Thealgorithmswererunona2.
4GHzIntelPentiumwith2gigabytesofRAManda7200RPMSeagatediskwith120gigabytesofstorage.
Fifteen-puzzleTosolvetheFifteenpuzzle,weusestructuredduplicatedetectiontogetherwithabreadth-rstheuristicsearchal-gorithmthatusesdivide-and-conquersolutionreconstruc-tion,calledBreadth-FirstIterative-DeepeningA*(Zhou&Hansen2004).
Unlikebrute-forcebreadth-rstsearch,thisNameCostIntMemExtMemExpSecs1aboA8,4838K130K4,207K451amk33,96029K555K31,704K3731csy14,1606K52K1,943K251ezm40,73317K154K6,186K731gtr58,010188K7,468K1,094,936K22,3811idy7,88819K415K10,246K1101pfc15,71812K153K6,600K711tis37,58124K383K29,778K3571wit13,97953K1,513K80,591K1,192actin52,117102K2,783K240,168K3,972Table2:Performanceinaligninggroupsof5proteinsequencesfromreferenceset1ofBAliBASE(Thompson,Plewniak,&Poch1999).
Columnsshownameofinstance,costofoptimalalignment,peaknumberofnodesstoredininternalmemory(inthousands),peaknumberofnodesstoredinexternalmemory(inthousands),numberofnodeexpansions(inthousands),andCPUseconds.
algorithmusesupperandlowerboundstoprunenodesthatcannotbepartofanoptimalsolution.
Table1showshowthealgorithmperformsonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle(Korf1985).
Weusedastate-spaceprojectionfunctionthatgroupsto-getherstatesbasedonthepositionsoftiles15and8,plusthepositionofthe"blank,"creatinganabstractstate-spacegraphwith16·15·14=3360nodes.
Basedonthispartitionofthestatespace,structuredduplicatedetectionreducestheinternal-memoryrequirementofthesearchalgorithmbyafactorofbetween16and58times.
Inexchange,itincreasesrunningtimebyonlyabout24%insolvingtheseexam-ples.
Usingbothstructuredduplicatedetectionanddivide-and-conquersolutionreconstruction,all100instancesofthe15-puzzlearesolvedusingnomorethan42megabytesofRAM,withouteverre-expandinganodeinthesameitera-tion.
Basedonthenumberofdistinctnodesexpanded,A*wouldneed28gigabytesofRAMjusttostoretheClosedlistinsolvinginstance88.
MultiplesequencealignmentThemultiplesequencealignmentproblemisanexampleofasearchproblemthatcannotbesolvedefcientlyus-ingdelayedduplicatedetectionbecauseitssearchgraphisveryhighly-connected.
Thesearchgraphforthemulti-plesequencealignmentproblemisann-dimensionalhyper-lattice,wherenisthenumberofsequencesbeingaligned.
WeimplementedstructuredduplicatedetectionontopofasearchalgorithmcalledSweepA*,whichusesdivide-and-conquersolutionreconstructiontolimituseofmem-ory(Zhou&Hansen2003).
1Wetestedtheresultingal-gorithmonaselectionofdifcultmultiplesequencealign-mentproblemsfromreferenceset1ofBAliBASE,awidely-usedbenchmark(Thompson,Plewniak,&Poch1999).
All1SweepA*isaspecializedsearchalgorithmformultiplese-quencealignmentthatexpandsnodesonalayer-by-layerbasis,whichmakesitagoodtforstructuredduplicatedetection.
Itisalsoverymemory-efcient.
Forcomparison,frontiersearch(Korf&Zhang2000)cannotsolveinstances1gtr,1wit,oractininTa-ble2within2gigabytesofRAM,andusesanaverageof25timesmorememorythanSweepA*insolvingtheotherinstances.
SEARCH687problemsinvolvedaligning5proteinsequences.
OurcostfunctionwasaDayhoffsubstitutionmatrixwithlineargappenaltyof8,andweusedapairwisealignmentadmissibleheuristic.
Tocreateanabstractstate-spacegraph,weusedastate-spaceprojectionfunctionthatignoresallbut3se-quences.
ThiscreatesanabstractstatespacewithO(l3)ab-stractnodes,wherelistheaveragelengthofasequence,comparedtoO(l5)nodesintheoriginalstate-spacegraph.
Table2showsthatstructuredduplicatedetectionreducestheinternal-memoryrequirementsofSweepA*byafactorofbetween10and40times.
Usingdisk,thealgorithmneedsonly20megabytesofRAMtosolveallinstancesinTable2,includingthespaceforthepairwiseheuristic.
Interestingly,theexternal-memoryversionofSweepA*runsfasterthananinternal-memoryversionofSweepA*thatdoesnotusestructuredduplicatedetection,byanaverageof72%!
Thereasonisthatstructuredduplicatedetectionrunsfasterinin-ternalmemorythanun-structuredduplicatedetectionduetolocalityofmemoryreferences,andthisspeedupoutweighstheextratimefordiskI/O.
Thespeedupismoreevidentinmultiplesequencealignmentthanthesliding-tilepuz-zlebecausethemultiplesequencealignmentsearchgraphismuchmorehighly-connected,whichsignicantlyincreasesthenumberofduplicatesgeneratedpernodeexpansion.
Speedingupinternal-memorysearchWeemphasizethatforboththeFifteenpuzzleandmulti-plesequencealignment,structuredduplicatedetectionim-provestheperformanceofinternal-memorygraphsearch,inadditiontousingexternalmemoryefciently.
Thisisbe-causeeachtimethesearchalgorithmchecksforduplicates,itonlyneedstocheckasmallsubsetofthestorednodes.
FortheFifteenpuzzle,thisleadstoa16%speedupininternal-memoryperformance.
Formultiplesequencealignment,itcutstherunningtimeofinternal-memorysearchinhalf.
Again,thespeedupismuchgreaterformultiplesequencealignmentbecauseinaveryhighly-connectedgraph,manymoreduplicatesaregenerated.
Thisunderscoresabenetofstructuredduplicatedetection.
Unlikedelayedduplicatedetection,itismoreeffectiveinhighly-connectedgraphs,wheretheproblemofduplicatedetectionismorecrucial.
ConclusionWehaveintroducedstructuredduplicatedetection,anovelapproachtolocalizingmemoryreferencesinduplicatede-tection,andshowedthatitcanreducetheI/Ocomplexityofexternal-memorygraphsearch,aswellasincreasethespeedofduplicatedetectionininternal-memorysearch.
Structuredduplicatedetectionhassomeadvantagesoverdelayedduplicatedetection,theonlypreviousapproachtoexternal-memorygraphsearch.
Inparticular,itismoreef-fectiveinhighly-connectedgraphs,anditcanexploitlocalgraphstructurethatisnotconsideredbydelayedduplicatedetection.
ButstructuredduplicatedetectionanddelayedduplicatedetectioncanalsobeviewedascomplementaryapproachestoreducingI/Ocomplexity,andcanbeusedto-getherfordifcultsearchproblems.
Ifasearchalgorithmcanleverageenoughlocalityinastate-spacegraphsothateveryduplicate-detectionscopetsininternalmemory,itisunnecessarytoeverdelayduplicatedetection,andstruc-turedduplicatedetectioncanmanageexternalmemorybyitself.
Butifanyduplicate-detectionscopedoesnottinin-ternalmemory(perhapsduetolackofsufcientlocalstruc-tureinthegraph),structuredduplicatedetectionbyitselfisnotsufcient.
Sincedelayedduplicatedetectiondoesnothaveaminimummemoryrequirement,itcanbeusedinthiscase.
Structuredduplicatedetectioncanbeusedtogetherwithdelayedduplicatedetectiontoimproveitsperformance,byleveraginggraphlocalityasmuchaspossible.
AcknowledgmentsWethanktheanonymousreviewersforhelpfulcomments.
ThisworkwassupportedinpartbyNSFgrantIIS-9984952andNASAgrantNAG-2-1463.
ReferencesAggarwal,A.
,andVitter,J.
1988.
Theinput/outputcomplexityofsortingandrelatedproblems.
Comm.
oftheACM31(9):1116–27.
Belady,L.
1966.
Astudyofreplacementalgorithmsforvirtualstorage.
IBMSystemsJournal5:78–101.
Edelkamp,S.
,andSchr¨odl,S.
2000.
LocalizingA*.
InProc.
ofthe17thNationalConferenceonArticialIntelligence,885–890.
Holte,R.
;Mkadmi,T.
;Zimmer,R.
;andMacDonald,A.
1996.
Speedingupproblemsolvingbyabstraction:Agraph-orientedapproach.
ArticialIntelligence85(1–2):321–361.
Klein,D.
,andManning,C.
2003.
FactoredA*searchformodelsoversequencesandtrees.
InProceedingsofthe17thInternationalJointConferenceonArticialIntelligence,1246–1251.
Korf,R.
,andZhang,W.
2000.
Divide-and-conquerfrontiersearchappliedtooptimalsequencealignment.
InProceedingsofthe17thNationalConferenceonArticialIntelligence,910–916.
Korf,R.
1985.
Depth-rstiterativedeepening:Anoptimalad-missibletreesearch.
ArticialIntelligence27:97–109.
Korf,R.
2003a.
Breadth-rstfrontiersearchwithdelayeddupli-catedetection.
InProceedingsoftheWorkshoponModelCheck-ingandArticialIntelligenceatIJCAI-03,87–92.
Korf,R.
2003b.
Delayedduplicatedetection:Extendedabstract.
InProceedingsofthe17thInternationalJointConferenceonAr-ticialIntelligence,1539–1541.
Mehlhorn,K.
,andMeyer,U.
2002.
External-memorybreadth-rstsearchwithsublinearI/O.
InProceedingsofthe10thAnnualEuropeanSymposiumonAlgorithms,723–735.
Munagala,K.
,andRanade,A.
1999.
I/O-complexityofgraphalgorithms.
InProceedingsofthe10thSymposiumondiscretealgorithms,687–694.
ACM-SIAM.
Sleator,D.
,andTarjan,R.
1985.
Amortizedefciencyoflistupdateandpagingrules.
CommunicationsoftheACM28:202–8.
Thompson,J.
;Plewniak,F.
;andPoch,O.
1999.
BAliBASE:Abenchmarkalignmentdatabasefortheevaluationofmultiplealignmentprograms.
Bioinformatics15(1):87–88.
Zhou,R.
,andHansen,E.
2003.
SweepA*:Space-efcientheuristicsearchinpartiallyorderedgraphs.
InProc.
of15thIEEEInternationalConf.
onToolswithArticialIntelligence,427–434.
Zhou,R.
,andHansen,E.
2004.
Breadth-rstheuristicsearch.
InProceedingsofthe14thInternationalConferenceonAutomatedPlanningandScheduling.
688SEARCH

RAKsmart美国VPS上市,活动期间5折抢购仅$30,$1.99/月

RAKsmart机房将于7月1日~7月31日推出“年中大促”活动,多重惊喜供您选择;爆款I3-2120仅30美金秒杀、V4新品上市,活动期间5折抢购、爆款产品持续热卖、洛杉矶+硅谷+香港+日本站群恢复销售、G口不限流量产品超低价热卖。美国VPS、日本VPS及香港VPS享全场7折优惠;爆款VPS $ 1.99/月限量秒杀,10台/天,售完即止, VPS 7折优惠码:VPS-TP-disRAKsmar...

新加坡云服务器 1核2Gg 46元/月 香港云服务器 1核2G 74元/月 LightNode

LightNode是一家成立于2002年,总部位于香港的VPS服务商。提供基于KVM虚拟化技术.支持CentOS、Ubuntu或者Windows等操作系统。公司名:厦门靠谱云股份有限公司官方网站:https://www.lightnode.com拥有高质量香港CN2 GIA与东南亚节点(河内、曼谷、迪拜等)。最低月付7.71美金,按时付费,可随时取消。灵活满足开发建站、游戏应用、外贸电商等需求。首...

TmhHost暑假活动:高端线路VPS季付8折优惠,可选洛杉矶CN2 GIA/日本软银/香港三网CN2 GIA/韩国双向CN2等

tmhhost怎么样?tmhhost正在搞暑假大促销活动,全部是高端线路VPS,现在直接季付8折优惠,活动截止时间是8月31日。可选机房及线路有美国洛杉矶cn2 gia+200G高防、洛杉矶三网CN2 GIA、洛杉矶CERA机房CN2 GIA,日本软银(100M带宽)、香港BGP直连200M带宽、香港三网CN2 GIA、韩国双向CN2。点击进入:tmhhost官方网站地址tmhhost优惠码:Tm...

graphsearch为你推荐
幼儿搜狗拼音输入法4psbAchrome中南财经政法大学知识产权研究中心水土保持ios8存在问题的应用软件名单(2020年第四批)支持ipad平台操作使用手册css3圆角用CSS3怎么实现圆角边框?ipadwifiipad的wifi打不开怎么办?iexplore.exe应用程序错误iexplore.exe - 应用程序错误怎么办阿??????
jsp虚拟主机 域名注册中心 阿云浏览器 googleapps idc测评网 sockscap 京东商城双十一活动 老左正传 183是联通还是移动 速度云 免费吧 绍兴电信 电信主机 万网主机管理 1元域名 带宽租赁 监控服务器 中国域名 万网空间 lamp的音标 更多