legraph

graphsearch  时间:2021-02-11  阅读:()
StructuredDuplicateDetectioninExternal-MemoryGraphSearchRongZhouandEricA.
HansenDepartmentofComputerScienceandEngineeringMississippiStateUniversityMississippiState,MS39762{rzhou,hansen}@cse.
msstate.
eduAbstractWeconsiderhowtouseexternalmemory,suchasdiskstor-age,toimprovethescalabilityofheuristicsearchinstate-spacegraphs.
TolimitthenumberofslowdiskI/Ooper-ations,wedevelopanewapproachtoduplicatedetectioningraphsearchthatlocalizesmemoryreferencesbypartitioningthesearchgraphbasedonanabstractionofthestatespace,andexpandingthefrontiernodesofthegraphinanorderthatrespectsthispartition.
Wedemonstratetheeffectivenessofthisapproachbothanalyticallyandempirically.
IntroductionHeuristicsearchinstate-spacegraphsisawidely-usedframeworkforproblemsolvinginAI,butitsscalabilityislimitedbymemoryrequirements.
Thishasledtoexten-siveresearchonhowtouseavailablememoryefcientlyinsearchinglargestate-spacegraphs.
Theconceptof"availablememory"isambiguous,sincememoryinmostcomputersystemshasahierarchicalstruc-tureinwhichfast,random-accessinternalmemoryiscom-plementedbylow-speedexternalmemory,suchasdiskstor-age.
Althoughexternalmemoryisvastlylarger(andmuchcheaper)thaninternalmemory,mostheuristicsearchal-gorithmsaredesignedtouseinternalmemoryonly,andmostresearchonmemory-efcientheuristicsearchassumesamemorymodelinwhichaccesstoallstoreddataitemsisequallyfast.
Algorithmsthatassumethismemorymodelperformextremelypoorlyifexternalmemoryisused,sincerandomaccessofexternalmemoryrequirestime-consumingI/Ooperationsthatareseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory.
Somerecentworkbeginstoaddressthisissue.
EdelkampandSchr¨odl(2000)consideruseofvirtualmemoryinA*graphsearch,andproposetechniquesthatchangethebest-rstorderofnodeexpansionsinawaythatimprovesrefer-encelocalityandreducesthenumberofpagefaultsinvirtualmemory.
AmoredirectapproachtolimitingslowdiskI/Oistodesignasearchalgorithmthatexplicitlymanagesaccesstoexternalmemory,sincethisapproachcanexploitknowl-edgeofhowthegraphisstructured,andisnotlimitedbythesizeofvirtualmemory,whichisusuallymuchsmallerthanexternalmemory.
InthetheoreticalcomputerscienceCopyrightc2004,AmericanAssociationforArticialIntelli-gence(www.
aaai.
org).
Allrightsreserved.
community,therehasbeenextensiveresearchonexternal-memoryalgorithms,includingdevelopmentofexternal-memorybreadth-rstsearchalgorithmsforexplicitlyrep-resentedgraphs(Munagala&Ranade1999;Mehlhorn&Meyer2002).
IntheAIcommunity,Korf(2003a;2003b)recentlydescribedanexternal-memorybreadth-rstsearchalgorithmforimplicitlyrepresentedgraphs.
Akeyissueindesigninganefcientexternal-memorygraphsearchalgorithmisduplicatedetection.
Ingraphsearch,thesamenodecanbereachedalongdifferentpaths,andpreventingredundantsearcheffortrequiresstoringalready-visitednodes,sothatthesearchalgorithmcanrec-ognizewhenitencountersanodethathasbeenalreadyexplored.
Becauseduplicatedetectionpotentiallyrequirescomparingeachnewlygeneratednodetoallstorednodes,itcanleadtocripplingdiskI/Oifalready-explorednodesarestoredondisk.
Allpreviousexternal-memorybreadth-rstsearchalgorithmslimitdiskI/Obyusingatechniquecalleddelayedduplicatedetectioninwhichanentiresetofnodesisexpandedbeforeperforminganyduplicatedetection.
Inthispaper,weintroduceanewapproach.
Insteadofdelayingduplicatedetection,welimitdiskI/Obylocalizingmem-oryreferencesbasedonthestructureofthesearchgraphasrevealedbyahigh-levelabstractionofthestatespace.
Com-putationalresultsonchallenginggraph-searchproblemsin-dicatethatthisapproach,whichwecallstructuredduplicatedetection,signicantlylimitsdiskI/O.
Forgraphswithsuf-cientlocalstructure,itallowsgraph-searchalgorithmstouseexternalmemoryalmostasefcientlyasinternalmemory.
BackgroundDevelopmentofexternal-memoryalgorithmsforvariouscomputationalproblemsisanactiveareaofresearch.
Be-causediskI/Oisseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory,theconceptofI/Ocom-plexityhasbeenintroducedtoanalyzetheperformanceofexternal-memoryalgorithms(Aggarwal&Vitter1988).
Itassumesatwo-levelmodelofmemoryinwhichinternalmemoryofsizeMissupplementedbyexternalmemory,andanI/OoperationmovesdatainblocksofsizeBbetweenin-ternalmemoryandexternalmemory,where1searchhasbeenconsid-eredbyseveralresearchers.
Inthetheoreticalcomputersci-SEARCH683encecommunity,researchershavefocusedonestablishingboundsonitsI/Ocomplexity.
Agraphisassumedtobeexplicitlyrepresentedusingadjacencylistsandstoredinex-ternalmemoryduetoitslargesize.
Insearchingexplicitgraphs,worst-caseI/Ocomplexitydependsonthemethodofgeneratingsuccessornodesaswellasthemethodofdu-plicatedetection,sinceadjacencylistsmustbereadintointernalmemorytodeterminethesuccessorsofanode.
MunagalaandRanade(1999)describeanexternal-memorybreadth-rstsearchalgorithmsthatadoptsanapproachtosuccessorgenerationthathaslinearcomplexityinthenum-berofedgesinthegraph.
MehlhornandMeyer(2002)achievesublinearcomplexityforsuccessorgenerationbyamoresophisticatedapproachthatinvolvespartitioningtheadjacencylistbasedonadecompositionofthegraphintoconnectedsubgraphs.
Bothalgorithmsusethesamemethodofduplicatedetection,calleddelayedduplicatedetection,inwhichasetofnodesisexpandedwithoutperformingdupli-catedetection,andthemulti-setofgeneratedsuccessorsissortedandcheckedforduplicatesinasingle,moreefcientoperation.
Thealgorithmsalternatebetweenthesetwosteps.
1.
successorgeneration,inwhichthealgorithmgeneratessuccessorsforasetofnodesonthesearchfrontierandappendsthesesuccessorstoale(orles)intheorderinwhichtheyaregenerated,withoutperforminganydupli-catedetection,and2.
delayedduplicatedetection,inwhichthele(s)ofsucces-sornodesaresorted(usuallybyanexternal-memorysortalgorithm)basedontheirindicesorstateencodings,fol-lowedbyascanandcompactionstageinwhichduplicatenodesinthesortedle(s)areeliminated.
IntheAIcommunity,Korf(2003a;2003b)recentlyde-scribedanexternal-memorybreadth-rstsearchalgorithmthatusesthesamemethodofdelayedduplicatedetection.
Hisalgorithmdiffersintwoways.
First,heconsiderssearchinimplicitgraphs,inkeepingwiththestandardAIapproachtostate-spacesearch.
Animplicitgraphisacompactrep-resentationofagraphintheformofastartnode,anodeexpansionfunctionthatgeneratestheimmediatesuccessorsofanode,andapredicatethattestswhetheranodeisagoalnode.
Insearchingimplicitgraphs,successorgener-ationdoesnotrequirediskI/O,andduplicatedetectionistheonlysourceofI/Ocomplexity.
AseconddifferenceisthatKorfbuildshisexternal-memorybreadth-rstsearchalgorithmontopoffrontiersearch,amemory-efcientsearchalgorithmthatdoesnotneedtostoreaClosedlist(Korf&Zhang2000).
StructuredduplicatedetectionWeproposeanalternativeapproachtoduplicatedetectioninexternal-memorygraphsearchthathassomesignicantadvantagesoverdelayedduplicatedetection.
Itcanalsobecombinedwithdelayedduplicatedetectioninsomecases.
Thecentralideaofourapproachistoexploitthestruc-tureofastate-spacegraphinordertolocalizememoryref-erencesduringduplicatedetection,andsowecalltheap-proachstructuredduplicatedetection.
Asanexample,con-siderthe(n21)sliding-tilepuzzle.
Whenanewnodeisgenerated,itcanonlydifferfromitsparentnodewithre-specttothepositionofthe"blank"byeitheroneroworonecolumn.
Therefore,incheckingforduplicates,itisnotnec-essarytocheckstorednodesforwhichthepositionofthe"blank"differsfromthatoftheparentnodebymorethanoneroworonecolumn.
Ifwepartitionthesetofstorednodesinthisway,wecansignicantlylimitthenumberofstorednodesthatneedtobecheckedtoguaranteethatallduplicatesarefound.
Wecontinuetousethismotivatingex-ampleinthefollowing,tohelpexplaintheidea.
Projectionfunctionandabstractstate-spacegraphOurapproachisbasedonusingastate-spaceprojectionfunctiontodecomposeastate-spacegraph,andcreateanab-stractstate-spacegraphthatrevealsthelocalstructureoftheoriginalgraphatahigh-level.
Stateabstractioninheuris-ticsearchiswellstudied,andisusedtocreateadmissibleheuristicsandtoorganizehierarchicalsearch(Holteetal.
1996).
Weuseanapproachtostateabstractionthatisverysimilartopreviousapproaches,buthasadifferentpurpose:localizingmemoryreferencesinduplicatedetection.
Astate-spaceprojectionfunctionisamany-to-onemap-pingfromtheoriginalstatespacetoanabstractstatespace,inwhicheachabstractstatecorrespondstoasetofstatesintheoriginalstatespace.
Ifastatexismappedtoanabstractstatey,thenyiscalledtheimageofx,andxiscalledthepre-imageofy.
Therearemanywaystodeneastate-spaceprojectionfunction.
Acommonapproachistoignoresomestatevariablesintheencodingoftheproblem.
Forexam-ple,asimplestate-spaceprojectionfunctionforthe(n21)sliding-tilepuzzlecanbedenedbyignoringthepositionsofalltilesandconsideringonlythepositionofthe"blank.
"Inthiscase,anabstractstatecorrespondstoallstateswiththesamepositionofthe"blank,"andtherearen2abstractstatescomparedton2!
/2statesintheoriginalstatespace.
State-spaceprojectionfunctionscanbedenedinasimilarwayforothersearchproblems(Klein&Manning2003).
Givenastate-spacegraphandstate-spaceprojectionfunc-tion,anabstractstate-spacegraphisconstructedasfollows.
1.
Thesetofnodes,calledabstractnodes,intheabstractstate-spacegraphcorrespondstothesetofabstractstates.
2.
Anabstractnodeyisasuccessorofanabstractnodeyiffthereexisttwostatesxandx,suchthata.
xisasuccessorofx,andb.
xandxarepreimagesofyandy,respectively.
Ify=y,itmeansthattheabstractnodeyhasaselfloop.
Forexample,Figure1(a)showstheninepossiblepositionsofthe"blank"intheEight-puzzle.
Figure1(b)showstheabstractstate-spacegraphcreatedbythesimplestate-spaceprojectionfunctionthatmapsastateintoanabstractstatebasedonlyonthepositionofthe"blank.
"EachabstractnodeBiinFigure1(b)correspondstothesetofstateswiththe"blank"locatedatpositioniinFigure1(a).
Duplicate-detectionscopeAlthoughtransformingastate-spacegraphintoanabstractstate-spacegraphcanresultinexponentialreductioninthe684SEARCHFigure1:Panel(a)showsallpossiblepositionsofthe"blank"fortheEight-puzzle.
Panel(b)showsanexampleofanabstractstate-spacegraphfortheEight-puzzle.
Thisdenitionofthestate-spaceprojectionfunctionisbasedonthepositionofthe"blank"only.
sizeofthegraph,thelocalstructureoftheoriginalstate-spacegraphcanbelargelypreserved.
Forexample,abstractnodeB0inFigure1(b)hasonlytwosuccessors;abstractnodesB1andB3.
Thiscapturesthefactthatasinglemoveofatilecanonlychangethepositionofthe"blank"byeitheronecolumn(B0→B1inthiscase)oronerow(B0→B3inthiscase).
Denition1Anabstractstate-spacegraphhaslocalstruc-tureifithasboundedout-degree.
Forthe(n21)sliding-tilepuzzleusingthissimplestate-spaceprojectionfunction,theout-degreeoftheabstractstate-spacegraphisboundedby4,foreveryn.
Thelocalstructureofanabstractstate-spacegraphcanbeusedtorestrictthescopeofduplicatedetectionsothatonlyafractionofstorednodesneedstobecheckedforduplicates,whilestillguaranteeingthatallduplicatesarefound.
Forexample,whengeneratingsuccessorsfornodesthatarepre-imagesofabstractnodeB0inFigure2,thealgorithmonlyneedstocheckforduplicatesagainststorednodesthatarepre-imagesofabstractnodeB1orB3.
Letanabstractnodey=p(x)betheimageofanodexunderastate-spaceprojectionfunctionp(·)andletsuccessors(y)bethesetofsuccessorabstractnodesofyintheabstractstate-spacegraph.
Denition2Theduplicate-detectionscopeofanodexun-derastate-spaceprojectionfunctionp(·)correspondstotheunionofsetsofstorednodesthatarepre-imagesofanab-stractnodeysuchthaty∈successors(p(x)),thatis,y∈successors(p(x))p1(y)wherep1(y)isthesetofstorednodesthatarepre-imagesofy.
Theorem1Theduplicate-detectionscopeofanodecon-tainsallstoredduplicatesofthesuccessorsofthenode.
Proof:Supposethatanodexhasapreviously-generatedsuccessornodexthatisnotincludedintheduplicate-detectionscopeofnodex.
Lety=p(x)betheimageofxunderthestate-spaceprojectionfunctionp(·).
Accord-ingtoDenition2,wehavey/∈successors(p(x)).
Butaccordingtothedenitionofanabstractstate-spacegraph,ifxisasuccessorofx,thenp(x)mustbeasuccessorofFigure2:Theduplicate-detectionscopeofnodesthatarepre-imagesofabstractnodeB0includesnodesthatarepre-imagesofabstractnodeB1orB3.
Whenexpandingnodesthatarepre-imagesofabstractnodeB0,thesearchalgorithmcanuseinternalmemorytostorenodesthatarepre-imagesofabstractnodeB0,B1,orB3,anduseexternalmemorytostoretherestofthenodes.
p(x).
Inotherwords,y∈successors(p(x)).
Sincethisleadstoacontradiction,Theorem1musthold.
2Theconceptofduplicate-detectionscopecanbeveryuse-fulinexternal-memorygraphsearch,becauseitsuggeststhatasearchalgorithmuseinternalmemorytostorenodeswithintheduplicate-detectionscopeofasetofexpandingnodes,anduseexternalmemorytostoreothernodes,wheninternalmemoryisfull.
Figure2illustratesthisidea.
Weusethetermnblocktorefertoaset(or"block")ofnodesintheoriginalstatespacethatcorrespondto(i.
e.
,arepre-imagesof)anabstractnode.
Ifanabstractstate-spacegraphhaslocalstructure,thentheduplicate-detectionscopeofanynodeconsistsofaboundednumberofnblocks,andthelargestduplicatedetectionscopecanbeasmallfractionoftheoverallnumberofstorednodes.
Notethatthelargestduplicatedetectionscopeestablishesaminimuminternal-memoryrequirementforstructuredduplicatedetection.
SearchusingstructuredduplicatedetectionWenowdescribehowtointegratestructuredduplicatede-tectionintoastate-spacesearchalgorithm.
Similartodelayedduplicatedetection,structureddupli-catedetectionisusedwithasearchalgorithmthatexpandsasetofnodesatatime,suchthattheunderlyingsearchstrat-egyisconsistentwithexpandingthenodesinthissetinanyorder.
Thisisstraightforwardinbreadth-rstsearch,whereasetofnodescorrespondstoallnodesinthesamelayerofthebreadth-rstsearchgraph.
(Welaterexplainhowitcanapplytobest-rstsearch.
)Giventhissetofnodes,struc-turedduplicatedetectiondeterminesanorderofexpansionthatminimizesI/Ocomplexity,byexploitinglocalstructurerevealedbytheabstractstate-spacegraph.
Tosupportstructuredduplicatedetection,thesetofstorednodes(i.
e.
,nodesintheOpenandClosedlists)ispartitionedbasedonanabstractstate-spacegraph.
Eachnblockinthepartitionconsistsofallstorednodesthatarepre-imagesofthesameabstractnode(i.
e.
,eachnblockcorrespondstoanabstractnode).
Thislocalstructureisexploitedasfollows.
First,nodesinthesamenblockareexpandedtogether,i.
e.
,consecutively.
Thisimproveslocalityofmemoryref-erencebecauseallnodesinthesamenblockhavethesameduplicate-detectionscope.
ButitleavesundeterminedtheSEARCH685orderinwhichtoconsidernblocks.
Ingeneral,iftheduplicate-detectionscopesoftwonblocksarealmostthesame,theyshouldbeexpandedclosetoeachother,inordertominimizethenumberofpossibleI/Ooperations.
Nblockscorrespondtoabstractnodes,andabstractnodesthatareneighborsintheabstractstate-spacegraphtendtohavesim-ilarduplicate-detectionscopes.
Therefore,wechoosetoex-pandnblocksinanorderthatreectsneighborrelationsintheabstractstate-spacespacegraph.
Asimplewaytodothisistoexpandthenblocksinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph.
Wheninternalmemoryisfull,thesearchalgorithmmustremovefrominternalmemoryoneormorenblocksthatdonotbelongtotheduplicate-detectionscopeofnodescurrentlybeingexpanded.
Immediatelybeforeexpandingnodesinadifferentnblock,itmustcheckifsomenblocksintheirduplicate-detectionscopearemissingfrominternalmemory,andifso,readthemfromexternalmemory.
Be-causereadingannblockintointernalmemoryoftenresultsinwritinganothernblocktoexternalmemory,werefertothesepairsofreadandwriteoperationsasnblockreplace-ments.
Exceptfornblocksthatarepartoftheduplicate-detectionscopeofnodesbeingexpanded,anynblockcanpotentiallybemovedfrominternalmemorytoexternalmemory.
De-cidingwhichnblockstoremoveisidenticaltothepage-replacementstrategyofvirtualmemory,exceptthatnowtheunitofreplacementisannblockinsteadofapage.
Thus,wecallitannblock-replacementstrategy.
Itiswell-knownthattheoptimalreplacementstrategyalwaysremovesthepage(ornblock)thatwillnotbeusedforthelongesttime(Be-lady1966).
Implementingsuchastrategyisgenerallyim-possible,asitrequiresfutureknowledgeoftheorderinwhichpages(nblocks)willbeneeded,andleast-recently-usedisoftenthebeststrategyinpractice(Sleator&Tar-jan1985).
However,itispossibletocomputeanoptimalnblock-replacementstrategyinourcase,sincetheorderinwhichnblocksareexpandedisgivenbythebreadth-rsttraversalorderofanabstractstate-spacegraph.
Thus,givenasetofnodestoexpand,structuredduplicatedetectionexpandstheminanordersuchthat(1)nodesinthesamenblockareexpandedtogether,(2)nblocksareconsid-eredinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph,and,(3)wheninternalmemorybecomesfull,selectionofwhichnblockstowritetodiskisbasedonapre-computedoptimalnblock-replacementstrat-egy,orelse,theleast-recentlyusedstrategy.
Ititstraightforwardtousestructuredduplicatedetectionwithbreadth-rstsearch,wherethesetofnodesbeingex-pandedisalayerofthebreadth-rstsearchgraph.
Itisalsopossibletouseitwithabest-rstsearchalgorithmsuchasA*.
Insolvingasearchproblemwithmanyties,theorderinwhichA*expandsnodeswiththesamef-valueisnon-deterministic,andstructuredduplicatedetectioncandeter-mineanorderthatminimizesI/Ocomplexity.
The(n21)sliding-tilepuzzleisanexampleofadomainwithmanyties.
Forsearchproblemsinwhichtherearenotmanyties,itispossibletousestructuredduplicatedetectionaspartofaslightly-modiedversionofA*,inwhichA*selectsasetofnodestoexpandcontainingnodeswithalmostthesamef-value–forexample,thebestknodesontheOpenlist,wherekisasuitablylargenumber.
Structuredduplicatede-tectioncandeterminetheorderinwhichtoexpandthissetofnodes.
Althoughitmayexpandnodesinanorderthatisnotstrictlybest-rst,improvedperformancefromstructureddu-plicatedetectionmayoutweighanincreaseinthenumberofnodesexpanded.
Thismodicationofthebest-rstexpan-sionorderofA*toimprovelocalityofmemoryreferenceswaspreviouslyproposedbyEdelkampandSchr¨odl(2000)asawayofimprovinguseofvirtualmemory.
I/OcomplexityI/Ocomplexityistheprimaryfactorthataffectsthespeedofexternal-memoryalgorithms.
TheI/OcomplexityofsearchinanimplicitgraphdependsentirelyontheI/Ocomplexityofduplicatedetection.
Theworst-caseI/OcomplexityofdelayedduplicatedetectionisO(n+mBlogMB(n+m)),wherenisthenumberofnodesinthestate-spacegraph,misthenumberofedges,Bisthesizeofadiskblock,Misthesizeofinternalmemory,andO(xBlogMBx)istheworst-caseI/Ocomplexityofexternallysortingxelements(Munagala&Ranade1999).
Forstructuredduplicatedetection,wehavethisresult.
Theorem2Ifeveryduplicate-detectionscopetsininter-nalmemory,theworst-caseI/Ocomplexityofstructureddu-plicatedetectionisO(nB·E).
Inthisresult,Eisthenumberofdirectededgesintheab-stractstate-spacegraph.
(Tokeeptheanalysissimple,weconsideranundirectededgeastworeciprocallydirectededges.
)ThetheoremfollowsfromthefactthatEistheworst-casenumberofnblockreplacementsthatmustbedoneintraversingtheabstractstate-spacegraph,andtheworst-casenumberofI/Ooperationsneededpernblockre-placementisthesizeofthelargestnblockdividedbythediskblocksize,wherenboundsthesizeofthelargestnblock.
Thefactor1Bisthesameinbothanalyses.
Sinceanab-stractstate-spacegraphistypicallyexponentiallysmallerthantheoriginalstate-spacegraph,thefactorEforstruc-turedduplicatedetectioniscomparabletologMB(n+m)fordelayedduplicateddetection.
ThedifferenceinI/Ocom-plexityisthedifferencebetweenthefactorn+mfordelayedduplicatedetection,andthefactornforstructuredduplicatedetection.
Thepresenceofn+minthecomplexityanalysisfordelayedduplicatedetection,incontrasttonforstruc-turedduplicatedetection,canbeinterpretedasapenaltyfordelayingduplicatedetection,sinceitboundsthecardinalityofthemulti-setofsuccessorsthatisgeneratedbyexpandingasetofnodesbeforeperformingduplicatedetection.
Inahighly-connectedgraph,mismuchlargerthann,andthepenaltyfordelayingduplicatedetectioncanbese-vere.
Itfollowsthatdelayedduplicatedetectionworksbestinsparsegraphs.
However,manyimportantsearchproblemsdonothavesparsegraphs.
Anexampleismultiplesequencealignment,amotivatingproblemforfrontiersearch(Korf&Zhang2000).
Anadvantageofstructuredduplicatede-tectionisthatitworksequallywellinsparseandhighly-connectedgraphs.
686SEARCH#SolIntMemExtMemExpSecs1766564,62816,389,872279,167,41180149591,382,50420,557,691345,487,8638985364481,53312,588,843224,545,8536415655228,33412,989,362208,969,4456655957414,77513,829,299228,945,35167160661,738,02256,436,901978,819,6463,06966611,355,26420,699,891368,181,73597382621,410,29246,329,201765,608,9892,38988651,808,59177,711,2351,360,544,0934,4569257371,77912,505,737213,445,215642Table1:Performanceonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle.
Columnsshowtheinstancenumber(#);solutionlength(Sol);peaknumberofnodesstoredininternalmemory(IntMem);peaknumberofnodesstoredinexternalmemory(ExtMem),numberofnodeexpansions(Exp),andrunningtimeinCPUseconds(Secs).
Unlikedelayedduplicatedetection,structuredduplicatedetectionhasaminimuminternal-memoryrequirement.
Theorem2holdsonlyifeveryduplicatedetectionscopetsininternalmemory.
So,whetheritholdsdependspartlyonthesizeofinternalmemory,andpartlyonthelocalityofthegraph,andhowwellitiscapturedinanabstractstate-spacegraph.
Ifthelargestduplicatedetectionscopedoesnottininternalmemory,itispossibletocombinestructureddupli-catedetectionwithdelayedduplicatedetection.
Givenasetofnodestoexpand,thesetcanbepartitionedbasedontheabstractstate-spacegraph,anddelayedduplicatedetectioncanbeperformedseparatelyonthenodesineachnblock,asawayofleveraginglocalstructuretoimproveperformance.
ComputationalresultsHowefcientlytheunderlyinggraph-searchalgorithmusesinternalmemoryhasaneffectondiskI/O,sinceitaf-fectshowmuchtotalmemoryisneeded.
Forthisreason,Korf(2003a;2003b)builthisexternal-memorygraphsearchalgorithmontopoffrontiersearch,averymemory-efcientgraph-searchalgorithm.
Inourexperiments,wealsocom-binestructuredduplicatedetectionwithmemory-efcientgraph-searchalgorithms.
Adifferenceisthatwecon-siderheuristicsearch,whereaspreviousworkonexternal-memorygraphsearchconsidersbreadth-rstsearchonly.
FortheresultsinTables1and2,weranthesearchalgo-rithmsinamodethatminimizesuseofinternalmemory.
Sothepeakamountofinternalmemoryusedcorrespondstotheminimuminternal-memoryrequirementsofthealgorithms,usingstructuredduplicatedetectionandagivenabstractionofthestatespace.
Ordinarily,thealgorithmswoulduseallavailableRAMbeforeusingdisk.
Thealgorithmswererunona2.
4GHzIntelPentiumwith2gigabytesofRAManda7200RPMSeagatediskwith120gigabytesofstorage.
Fifteen-puzzleTosolvetheFifteenpuzzle,weusestructuredduplicatedetectiontogetherwithabreadth-rstheuristicsearchal-gorithmthatusesdivide-and-conquersolutionreconstruc-tion,calledBreadth-FirstIterative-DeepeningA*(Zhou&Hansen2004).
Unlikebrute-forcebreadth-rstsearch,thisNameCostIntMemExtMemExpSecs1aboA8,4838K130K4,207K451amk33,96029K555K31,704K3731csy14,1606K52K1,943K251ezm40,73317K154K6,186K731gtr58,010188K7,468K1,094,936K22,3811idy7,88819K415K10,246K1101pfc15,71812K153K6,600K711tis37,58124K383K29,778K3571wit13,97953K1,513K80,591K1,192actin52,117102K2,783K240,168K3,972Table2:Performanceinaligninggroupsof5proteinsequencesfromreferenceset1ofBAliBASE(Thompson,Plewniak,&Poch1999).
Columnsshownameofinstance,costofoptimalalignment,peaknumberofnodesstoredininternalmemory(inthousands),peaknumberofnodesstoredinexternalmemory(inthousands),numberofnodeexpansions(inthousands),andCPUseconds.
algorithmusesupperandlowerboundstoprunenodesthatcannotbepartofanoptimalsolution.
Table1showshowthealgorithmperformsonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle(Korf1985).
Weusedastate-spaceprojectionfunctionthatgroupsto-getherstatesbasedonthepositionsoftiles15and8,plusthepositionofthe"blank,"creatinganabstractstate-spacegraphwith16·15·14=3360nodes.
Basedonthispartitionofthestatespace,structuredduplicatedetectionreducestheinternal-memoryrequirementofthesearchalgorithmbyafactorofbetween16and58times.
Inexchange,itincreasesrunningtimebyonlyabout24%insolvingtheseexam-ples.
Usingbothstructuredduplicatedetectionanddivide-and-conquersolutionreconstruction,all100instancesofthe15-puzzlearesolvedusingnomorethan42megabytesofRAM,withouteverre-expandinganodeinthesameitera-tion.
Basedonthenumberofdistinctnodesexpanded,A*wouldneed28gigabytesofRAMjusttostoretheClosedlistinsolvinginstance88.
MultiplesequencealignmentThemultiplesequencealignmentproblemisanexampleofasearchproblemthatcannotbesolvedefcientlyus-ingdelayedduplicatedetectionbecauseitssearchgraphisveryhighly-connected.
Thesearchgraphforthemulti-plesequencealignmentproblemisann-dimensionalhyper-lattice,wherenisthenumberofsequencesbeingaligned.
WeimplementedstructuredduplicatedetectionontopofasearchalgorithmcalledSweepA*,whichusesdivide-and-conquersolutionreconstructiontolimituseofmem-ory(Zhou&Hansen2003).
1Wetestedtheresultingal-gorithmonaselectionofdifcultmultiplesequencealign-mentproblemsfromreferenceset1ofBAliBASE,awidely-usedbenchmark(Thompson,Plewniak,&Poch1999).
All1SweepA*isaspecializedsearchalgorithmformultiplese-quencealignmentthatexpandsnodesonalayer-by-layerbasis,whichmakesitagoodtforstructuredduplicatedetection.
Itisalsoverymemory-efcient.
Forcomparison,frontiersearch(Korf&Zhang2000)cannotsolveinstances1gtr,1wit,oractininTa-ble2within2gigabytesofRAM,andusesanaverageof25timesmorememorythanSweepA*insolvingtheotherinstances.
SEARCH687problemsinvolvedaligning5proteinsequences.
OurcostfunctionwasaDayhoffsubstitutionmatrixwithlineargappenaltyof8,andweusedapairwisealignmentadmissibleheuristic.
Tocreateanabstractstate-spacegraph,weusedastate-spaceprojectionfunctionthatignoresallbut3se-quences.
ThiscreatesanabstractstatespacewithO(l3)ab-stractnodes,wherelistheaveragelengthofasequence,comparedtoO(l5)nodesintheoriginalstate-spacegraph.
Table2showsthatstructuredduplicatedetectionreducestheinternal-memoryrequirementsofSweepA*byafactorofbetween10and40times.
Usingdisk,thealgorithmneedsonly20megabytesofRAMtosolveallinstancesinTable2,includingthespaceforthepairwiseheuristic.
Interestingly,theexternal-memoryversionofSweepA*runsfasterthananinternal-memoryversionofSweepA*thatdoesnotusestructuredduplicatedetection,byanaverageof72%!
Thereasonisthatstructuredduplicatedetectionrunsfasterinin-ternalmemorythanun-structuredduplicatedetectionduetolocalityofmemoryreferences,andthisspeedupoutweighstheextratimefordiskI/O.
Thespeedupismoreevidentinmultiplesequencealignmentthanthesliding-tilepuz-zlebecausethemultiplesequencealignmentsearchgraphismuchmorehighly-connected,whichsignicantlyincreasesthenumberofduplicatesgeneratedpernodeexpansion.
Speedingupinternal-memorysearchWeemphasizethatforboththeFifteenpuzzleandmulti-plesequencealignment,structuredduplicatedetectionim-provestheperformanceofinternal-memorygraphsearch,inadditiontousingexternalmemoryefciently.
Thisisbe-causeeachtimethesearchalgorithmchecksforduplicates,itonlyneedstocheckasmallsubsetofthestorednodes.
FortheFifteenpuzzle,thisleadstoa16%speedupininternal-memoryperformance.
Formultiplesequencealignment,itcutstherunningtimeofinternal-memorysearchinhalf.
Again,thespeedupismuchgreaterformultiplesequencealignmentbecauseinaveryhighly-connectedgraph,manymoreduplicatesaregenerated.
Thisunderscoresabenetofstructuredduplicatedetection.
Unlikedelayedduplicatedetection,itismoreeffectiveinhighly-connectedgraphs,wheretheproblemofduplicatedetectionismorecrucial.
ConclusionWehaveintroducedstructuredduplicatedetection,anovelapproachtolocalizingmemoryreferencesinduplicatede-tection,andshowedthatitcanreducetheI/Ocomplexityofexternal-memorygraphsearch,aswellasincreasethespeedofduplicatedetectionininternal-memorysearch.
Structuredduplicatedetectionhassomeadvantagesoverdelayedduplicatedetection,theonlypreviousapproachtoexternal-memorygraphsearch.
Inparticular,itismoreef-fectiveinhighly-connectedgraphs,anditcanexploitlocalgraphstructurethatisnotconsideredbydelayedduplicatedetection.
ButstructuredduplicatedetectionanddelayedduplicatedetectioncanalsobeviewedascomplementaryapproachestoreducingI/Ocomplexity,andcanbeusedto-getherfordifcultsearchproblems.
Ifasearchalgorithmcanleverageenoughlocalityinastate-spacegraphsothateveryduplicate-detectionscopetsininternalmemory,itisunnecessarytoeverdelayduplicatedetection,andstruc-turedduplicatedetectioncanmanageexternalmemorybyitself.
Butifanyduplicate-detectionscopedoesnottinin-ternalmemory(perhapsduetolackofsufcientlocalstruc-tureinthegraph),structuredduplicatedetectionbyitselfisnotsufcient.
Sincedelayedduplicatedetectiondoesnothaveaminimummemoryrequirement,itcanbeusedinthiscase.
Structuredduplicatedetectioncanbeusedtogetherwithdelayedduplicatedetectiontoimproveitsperformance,byleveraginggraphlocalityasmuchaspossible.
AcknowledgmentsWethanktheanonymousreviewersforhelpfulcomments.
ThisworkwassupportedinpartbyNSFgrantIIS-9984952andNASAgrantNAG-2-1463.
ReferencesAggarwal,A.
,andVitter,J.
1988.
Theinput/outputcomplexityofsortingandrelatedproblems.
Comm.
oftheACM31(9):1116–27.
Belady,L.
1966.
Astudyofreplacementalgorithmsforvirtualstorage.
IBMSystemsJournal5:78–101.
Edelkamp,S.
,andSchr¨odl,S.
2000.
LocalizingA*.
InProc.
ofthe17thNationalConferenceonArticialIntelligence,885–890.
Holte,R.
;Mkadmi,T.
;Zimmer,R.
;andMacDonald,A.
1996.
Speedingupproblemsolvingbyabstraction:Agraph-orientedapproach.
ArticialIntelligence85(1–2):321–361.
Klein,D.
,andManning,C.
2003.
FactoredA*searchformodelsoversequencesandtrees.
InProceedingsofthe17thInternationalJointConferenceonArticialIntelligence,1246–1251.
Korf,R.
,andZhang,W.
2000.
Divide-and-conquerfrontiersearchappliedtooptimalsequencealignment.
InProceedingsofthe17thNationalConferenceonArticialIntelligence,910–916.
Korf,R.
1985.
Depth-rstiterativedeepening:Anoptimalad-missibletreesearch.
ArticialIntelligence27:97–109.
Korf,R.
2003a.
Breadth-rstfrontiersearchwithdelayeddupli-catedetection.
InProceedingsoftheWorkshoponModelCheck-ingandArticialIntelligenceatIJCAI-03,87–92.
Korf,R.
2003b.
Delayedduplicatedetection:Extendedabstract.
InProceedingsofthe17thInternationalJointConferenceonAr-ticialIntelligence,1539–1541.
Mehlhorn,K.
,andMeyer,U.
2002.
External-memorybreadth-rstsearchwithsublinearI/O.
InProceedingsofthe10thAnnualEuropeanSymposiumonAlgorithms,723–735.
Munagala,K.
,andRanade,A.
1999.
I/O-complexityofgraphalgorithms.
InProceedingsofthe10thSymposiumondiscretealgorithms,687–694.
ACM-SIAM.
Sleator,D.
,andTarjan,R.
1985.
Amortizedefciencyoflistupdateandpagingrules.
CommunicationsoftheACM28:202–8.
Thompson,J.
;Plewniak,F.
;andPoch,O.
1999.
BAliBASE:Abenchmarkalignmentdatabasefortheevaluationofmultiplealignmentprograms.
Bioinformatics15(1):87–88.
Zhou,R.
,andHansen,E.
2003.
SweepA*:Space-efcientheuristicsearchinpartiallyorderedgraphs.
InProc.
of15thIEEEInternationalConf.
onToolswithArticialIntelligence,427–434.
Zhou,R.
,andHansen,E.
2004.
Breadth-rstheuristicsearch.
InProceedingsofthe14thInternationalConferenceonAutomatedPlanningandScheduling.
688SEARCH

乌云数据(10/月),香港cera 1核1G 10M带宽/美国cera 8核8G10M

乌云数据主营高性价比国内外云服务器,物理机,本着机器为主服务为辅的运营理念,将客户的体验放在第一位,提供性价比最高的云服务器,帮助各位站长上云,同时我们深知新人站长的不易,特此提供永久免费虚拟主机,已提供两年之久,帮助了上万名站长从零上云官网:https://wuvps.cn迎国庆豪礼一多款机型史上最低价,续费不加价 尽在wuvps.cn香港cera机房,香港沙田机房,超低延迟CN2线路地区CPU...

RackNerd美国大硬盘服务器促销:120G SSD+192TB HDD,1Gbps大带宽,月付$599,促销美国月付$服务器促销带宽

racknerd怎么样?racknerd最近发布了一些便宜美国服务器促销,包括大硬盘服务器,提供120G SSD+192TB HDD,有AMD和Intel两个选择,默认32G内存,1Gbps带宽,每个月100TB流量,5个IP地址,月付$599。价格非常便宜,需要存储服务器的朋友可以关注一下。RackNerd主要经营美国圣何塞、洛杉矶、达拉斯、芝加哥、亚特兰大、新泽西机房基于KVM虚拟化的VPS、...

数脉科技香港自营,10Mbps CN2物理机420元/月

数脉科技怎么样?数脉科技品牌创办于2019,由一家从2012年开始从事idc行业的商家创办,目前主营产品是香港服务器,线路有阿里云线路和自营CN2线路,均为中国大陆直连带宽,适合建站及运行各种负载较高的项目,同时支持人民币、台币、美元等结算,提供支付宝、微信、PayPal付款方式。本次数脉科技给发来了新的7月促销活动,CN2+BGP线路的香港服务器,带宽10m起,配置E3-16G-30M-3IP,...

graphsearch为你推荐
三星itunes人才ipad支持ipadphotoshop技术ps是一种什么技术??????ipad上网ipad上网速度很慢怎么回事?tcpip上的netbios禁用tcp/ip上的netbios对网络应用软件的正常运行有没有影响?win7telnet怎样在win7下打开telnet 命令win7关闭135端口win7系统怎么关闭135端口?网上很多方法都不好用!win7关闭135端口win7下怎么关135和8909端口csshackcss中 *bottom是什么意思?
虚拟主机99idc 申请免费域名 老域名全部失效请记好新域名 a2hosting softlayer 精品网 香港主机 流媒体服务器 56折 174.127.195.202 150邮箱 asp免费空间申请 isp服务商 cdn加速是什么 中国网通测速 卡巴斯基免费试用版 国外视频网站有哪些 cloudlink qq金券 买空间网 更多