legraph

graphsearch  时间:2021-02-11  阅读:()
StructuredDuplicateDetectioninExternal-MemoryGraphSearchRongZhouandEricA.
HansenDepartmentofComputerScienceandEngineeringMississippiStateUniversityMississippiState,MS39762{rzhou,hansen}@cse.
msstate.
eduAbstractWeconsiderhowtouseexternalmemory,suchasdiskstor-age,toimprovethescalabilityofheuristicsearchinstate-spacegraphs.
TolimitthenumberofslowdiskI/Ooper-ations,wedevelopanewapproachtoduplicatedetectioningraphsearchthatlocalizesmemoryreferencesbypartitioningthesearchgraphbasedonanabstractionofthestatespace,andexpandingthefrontiernodesofthegraphinanorderthatrespectsthispartition.
Wedemonstratetheeffectivenessofthisapproachbothanalyticallyandempirically.
IntroductionHeuristicsearchinstate-spacegraphsisawidely-usedframeworkforproblemsolvinginAI,butitsscalabilityislimitedbymemoryrequirements.
Thishasledtoexten-siveresearchonhowtouseavailablememoryefcientlyinsearchinglargestate-spacegraphs.
Theconceptof"availablememory"isambiguous,sincememoryinmostcomputersystemshasahierarchicalstruc-tureinwhichfast,random-accessinternalmemoryiscom-plementedbylow-speedexternalmemory,suchasdiskstor-age.
Althoughexternalmemoryisvastlylarger(andmuchcheaper)thaninternalmemory,mostheuristicsearchal-gorithmsaredesignedtouseinternalmemoryonly,andmostresearchonmemory-efcientheuristicsearchassumesamemorymodelinwhichaccesstoallstoreddataitemsisequallyfast.
Algorithmsthatassumethismemorymodelperformextremelypoorlyifexternalmemoryisused,sincerandomaccessofexternalmemoryrequirestime-consumingI/Ooperationsthatareseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory.
Somerecentworkbeginstoaddressthisissue.
EdelkampandSchr¨odl(2000)consideruseofvirtualmemoryinA*graphsearch,andproposetechniquesthatchangethebest-rstorderofnodeexpansionsinawaythatimprovesrefer-encelocalityandreducesthenumberofpagefaultsinvirtualmemory.
AmoredirectapproachtolimitingslowdiskI/Oistodesignasearchalgorithmthatexplicitlymanagesaccesstoexternalmemory,sincethisapproachcanexploitknowl-edgeofhowthegraphisstructured,andisnotlimitedbythesizeofvirtualmemory,whichisusuallymuchsmallerthanexternalmemory.
InthetheoreticalcomputerscienceCopyrightc2004,AmericanAssociationforArticialIntelli-gence(www.
aaai.
org).
Allrightsreserved.
community,therehasbeenextensiveresearchonexternal-memoryalgorithms,includingdevelopmentofexternal-memorybreadth-rstsearchalgorithmsforexplicitlyrep-resentedgraphs(Munagala&Ranade1999;Mehlhorn&Meyer2002).
IntheAIcommunity,Korf(2003a;2003b)recentlydescribedanexternal-memorybreadth-rstsearchalgorithmforimplicitlyrepresentedgraphs.
Akeyissueindesigninganefcientexternal-memorygraphsearchalgorithmisduplicatedetection.
Ingraphsearch,thesamenodecanbereachedalongdifferentpaths,andpreventingredundantsearcheffortrequiresstoringalready-visitednodes,sothatthesearchalgorithmcanrec-ognizewhenitencountersanodethathasbeenalreadyexplored.
Becauseduplicatedetectionpotentiallyrequirescomparingeachnewlygeneratednodetoallstorednodes,itcanleadtocripplingdiskI/Oifalready-explorednodesarestoredondisk.
Allpreviousexternal-memorybreadth-rstsearchalgorithmslimitdiskI/Obyusingatechniquecalleddelayedduplicatedetectioninwhichanentiresetofnodesisexpandedbeforeperforminganyduplicatedetection.
Inthispaper,weintroduceanewapproach.
Insteadofdelayingduplicatedetection,welimitdiskI/Obylocalizingmem-oryreferencesbasedonthestructureofthesearchgraphasrevealedbyahigh-levelabstractionofthestatespace.
Com-putationalresultsonchallenginggraph-searchproblemsin-dicatethatthisapproach,whichwecallstructuredduplicatedetection,signicantlylimitsdiskI/O.
Forgraphswithsuf-cientlocalstructure,itallowsgraph-searchalgorithmstouseexternalmemoryalmostasefcientlyasinternalmemory.
BackgroundDevelopmentofexternal-memoryalgorithmsforvariouscomputationalproblemsisanactiveareaofresearch.
Be-causediskI/Oisseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory,theconceptofI/Ocom-plexityhasbeenintroducedtoanalyzetheperformanceofexternal-memoryalgorithms(Aggarwal&Vitter1988).
Itassumesatwo-levelmodelofmemoryinwhichinternalmemoryofsizeMissupplementedbyexternalmemory,andanI/OoperationmovesdatainblocksofsizeBbetweenin-ternalmemoryandexternalmemory,where1searchhasbeenconsid-eredbyseveralresearchers.
Inthetheoreticalcomputersci-SEARCH683encecommunity,researchershavefocusedonestablishingboundsonitsI/Ocomplexity.
Agraphisassumedtobeexplicitlyrepresentedusingadjacencylistsandstoredinex-ternalmemoryduetoitslargesize.
Insearchingexplicitgraphs,worst-caseI/Ocomplexitydependsonthemethodofgeneratingsuccessornodesaswellasthemethodofdu-plicatedetection,sinceadjacencylistsmustbereadintointernalmemorytodeterminethesuccessorsofanode.
MunagalaandRanade(1999)describeanexternal-memorybreadth-rstsearchalgorithmsthatadoptsanapproachtosuccessorgenerationthathaslinearcomplexityinthenum-berofedgesinthegraph.
MehlhornandMeyer(2002)achievesublinearcomplexityforsuccessorgenerationbyamoresophisticatedapproachthatinvolvespartitioningtheadjacencylistbasedonadecompositionofthegraphintoconnectedsubgraphs.
Bothalgorithmsusethesamemethodofduplicatedetection,calleddelayedduplicatedetection,inwhichasetofnodesisexpandedwithoutperformingdupli-catedetection,andthemulti-setofgeneratedsuccessorsissortedandcheckedforduplicatesinasingle,moreefcientoperation.
Thealgorithmsalternatebetweenthesetwosteps.
1.
successorgeneration,inwhichthealgorithmgeneratessuccessorsforasetofnodesonthesearchfrontierandappendsthesesuccessorstoale(orles)intheorderinwhichtheyaregenerated,withoutperforminganydupli-catedetection,and2.
delayedduplicatedetection,inwhichthele(s)ofsucces-sornodesaresorted(usuallybyanexternal-memorysortalgorithm)basedontheirindicesorstateencodings,fol-lowedbyascanandcompactionstageinwhichduplicatenodesinthesortedle(s)areeliminated.
IntheAIcommunity,Korf(2003a;2003b)recentlyde-scribedanexternal-memorybreadth-rstsearchalgorithmthatusesthesamemethodofdelayedduplicatedetection.
Hisalgorithmdiffersintwoways.
First,heconsiderssearchinimplicitgraphs,inkeepingwiththestandardAIapproachtostate-spacesearch.
Animplicitgraphisacompactrep-resentationofagraphintheformofastartnode,anodeexpansionfunctionthatgeneratestheimmediatesuccessorsofanode,andapredicatethattestswhetheranodeisagoalnode.
Insearchingimplicitgraphs,successorgener-ationdoesnotrequirediskI/O,andduplicatedetectionistheonlysourceofI/Ocomplexity.
AseconddifferenceisthatKorfbuildshisexternal-memorybreadth-rstsearchalgorithmontopoffrontiersearch,amemory-efcientsearchalgorithmthatdoesnotneedtostoreaClosedlist(Korf&Zhang2000).
StructuredduplicatedetectionWeproposeanalternativeapproachtoduplicatedetectioninexternal-memorygraphsearchthathassomesignicantadvantagesoverdelayedduplicatedetection.
Itcanalsobecombinedwithdelayedduplicatedetectioninsomecases.
Thecentralideaofourapproachistoexploitthestruc-tureofastate-spacegraphinordertolocalizememoryref-erencesduringduplicatedetection,andsowecalltheap-proachstructuredduplicatedetection.
Asanexample,con-siderthe(n21)sliding-tilepuzzle.
Whenanewnodeisgenerated,itcanonlydifferfromitsparentnodewithre-specttothepositionofthe"blank"byeitheroneroworonecolumn.
Therefore,incheckingforduplicates,itisnotnec-essarytocheckstorednodesforwhichthepositionofthe"blank"differsfromthatoftheparentnodebymorethanoneroworonecolumn.
Ifwepartitionthesetofstorednodesinthisway,wecansignicantlylimitthenumberofstorednodesthatneedtobecheckedtoguaranteethatallduplicatesarefound.
Wecontinuetousethismotivatingex-ampleinthefollowing,tohelpexplaintheidea.
Projectionfunctionandabstractstate-spacegraphOurapproachisbasedonusingastate-spaceprojectionfunctiontodecomposeastate-spacegraph,andcreateanab-stractstate-spacegraphthatrevealsthelocalstructureoftheoriginalgraphatahigh-level.
Stateabstractioninheuris-ticsearchiswellstudied,andisusedtocreateadmissibleheuristicsandtoorganizehierarchicalsearch(Holteetal.
1996).
Weuseanapproachtostateabstractionthatisverysimilartopreviousapproaches,buthasadifferentpurpose:localizingmemoryreferencesinduplicatedetection.
Astate-spaceprojectionfunctionisamany-to-onemap-pingfromtheoriginalstatespacetoanabstractstatespace,inwhicheachabstractstatecorrespondstoasetofstatesintheoriginalstatespace.
Ifastatexismappedtoanabstractstatey,thenyiscalledtheimageofx,andxiscalledthepre-imageofy.
Therearemanywaystodeneastate-spaceprojectionfunction.
Acommonapproachistoignoresomestatevariablesintheencodingoftheproblem.
Forexam-ple,asimplestate-spaceprojectionfunctionforthe(n21)sliding-tilepuzzlecanbedenedbyignoringthepositionsofalltilesandconsideringonlythepositionofthe"blank.
"Inthiscase,anabstractstatecorrespondstoallstateswiththesamepositionofthe"blank,"andtherearen2abstractstatescomparedton2!
/2statesintheoriginalstatespace.
State-spaceprojectionfunctionscanbedenedinasimilarwayforothersearchproblems(Klein&Manning2003).
Givenastate-spacegraphandstate-spaceprojectionfunc-tion,anabstractstate-spacegraphisconstructedasfollows.
1.
Thesetofnodes,calledabstractnodes,intheabstractstate-spacegraphcorrespondstothesetofabstractstates.
2.
Anabstractnodeyisasuccessorofanabstractnodeyiffthereexisttwostatesxandx,suchthata.
xisasuccessorofx,andb.
xandxarepreimagesofyandy,respectively.
Ify=y,itmeansthattheabstractnodeyhasaselfloop.
Forexample,Figure1(a)showstheninepossiblepositionsofthe"blank"intheEight-puzzle.
Figure1(b)showstheabstractstate-spacegraphcreatedbythesimplestate-spaceprojectionfunctionthatmapsastateintoanabstractstatebasedonlyonthepositionofthe"blank.
"EachabstractnodeBiinFigure1(b)correspondstothesetofstateswiththe"blank"locatedatpositioniinFigure1(a).
Duplicate-detectionscopeAlthoughtransformingastate-spacegraphintoanabstractstate-spacegraphcanresultinexponentialreductioninthe684SEARCHFigure1:Panel(a)showsallpossiblepositionsofthe"blank"fortheEight-puzzle.
Panel(b)showsanexampleofanabstractstate-spacegraphfortheEight-puzzle.
Thisdenitionofthestate-spaceprojectionfunctionisbasedonthepositionofthe"blank"only.
sizeofthegraph,thelocalstructureoftheoriginalstate-spacegraphcanbelargelypreserved.
Forexample,abstractnodeB0inFigure1(b)hasonlytwosuccessors;abstractnodesB1andB3.
Thiscapturesthefactthatasinglemoveofatilecanonlychangethepositionofthe"blank"byeitheronecolumn(B0→B1inthiscase)oronerow(B0→B3inthiscase).
Denition1Anabstractstate-spacegraphhaslocalstruc-tureifithasboundedout-degree.
Forthe(n21)sliding-tilepuzzleusingthissimplestate-spaceprojectionfunction,theout-degreeoftheabstractstate-spacegraphisboundedby4,foreveryn.
Thelocalstructureofanabstractstate-spacegraphcanbeusedtorestrictthescopeofduplicatedetectionsothatonlyafractionofstorednodesneedstobecheckedforduplicates,whilestillguaranteeingthatallduplicatesarefound.
Forexample,whengeneratingsuccessorsfornodesthatarepre-imagesofabstractnodeB0inFigure2,thealgorithmonlyneedstocheckforduplicatesagainststorednodesthatarepre-imagesofabstractnodeB1orB3.
Letanabstractnodey=p(x)betheimageofanodexunderastate-spaceprojectionfunctionp(·)andletsuccessors(y)bethesetofsuccessorabstractnodesofyintheabstractstate-spacegraph.
Denition2Theduplicate-detectionscopeofanodexun-derastate-spaceprojectionfunctionp(·)correspondstotheunionofsetsofstorednodesthatarepre-imagesofanab-stractnodeysuchthaty∈successors(p(x)),thatis,y∈successors(p(x))p1(y)wherep1(y)isthesetofstorednodesthatarepre-imagesofy.
Theorem1Theduplicate-detectionscopeofanodecon-tainsallstoredduplicatesofthesuccessorsofthenode.
Proof:Supposethatanodexhasapreviously-generatedsuccessornodexthatisnotincludedintheduplicate-detectionscopeofnodex.
Lety=p(x)betheimageofxunderthestate-spaceprojectionfunctionp(·).
Accord-ingtoDenition2,wehavey/∈successors(p(x)).
Butaccordingtothedenitionofanabstractstate-spacegraph,ifxisasuccessorofx,thenp(x)mustbeasuccessorofFigure2:Theduplicate-detectionscopeofnodesthatarepre-imagesofabstractnodeB0includesnodesthatarepre-imagesofabstractnodeB1orB3.
Whenexpandingnodesthatarepre-imagesofabstractnodeB0,thesearchalgorithmcanuseinternalmemorytostorenodesthatarepre-imagesofabstractnodeB0,B1,orB3,anduseexternalmemorytostoretherestofthenodes.
p(x).
Inotherwords,y∈successors(p(x)).
Sincethisleadstoacontradiction,Theorem1musthold.
2Theconceptofduplicate-detectionscopecanbeveryuse-fulinexternal-memorygraphsearch,becauseitsuggeststhatasearchalgorithmuseinternalmemorytostorenodeswithintheduplicate-detectionscopeofasetofexpandingnodes,anduseexternalmemorytostoreothernodes,wheninternalmemoryisfull.
Figure2illustratesthisidea.
Weusethetermnblocktorefertoaset(or"block")ofnodesintheoriginalstatespacethatcorrespondto(i.
e.
,arepre-imagesof)anabstractnode.
Ifanabstractstate-spacegraphhaslocalstructure,thentheduplicate-detectionscopeofanynodeconsistsofaboundednumberofnblocks,andthelargestduplicatedetectionscopecanbeasmallfractionoftheoverallnumberofstorednodes.
Notethatthelargestduplicatedetectionscopeestablishesaminimuminternal-memoryrequirementforstructuredduplicatedetection.
SearchusingstructuredduplicatedetectionWenowdescribehowtointegratestructuredduplicatede-tectionintoastate-spacesearchalgorithm.
Similartodelayedduplicatedetection,structureddupli-catedetectionisusedwithasearchalgorithmthatexpandsasetofnodesatatime,suchthattheunderlyingsearchstrat-egyisconsistentwithexpandingthenodesinthissetinanyorder.
Thisisstraightforwardinbreadth-rstsearch,whereasetofnodescorrespondstoallnodesinthesamelayerofthebreadth-rstsearchgraph.
(Welaterexplainhowitcanapplytobest-rstsearch.
)Giventhissetofnodes,struc-turedduplicatedetectiondeterminesanorderofexpansionthatminimizesI/Ocomplexity,byexploitinglocalstructurerevealedbytheabstractstate-spacegraph.
Tosupportstructuredduplicatedetection,thesetofstorednodes(i.
e.
,nodesintheOpenandClosedlists)ispartitionedbasedonanabstractstate-spacegraph.
Eachnblockinthepartitionconsistsofallstorednodesthatarepre-imagesofthesameabstractnode(i.
e.
,eachnblockcorrespondstoanabstractnode).
Thislocalstructureisexploitedasfollows.
First,nodesinthesamenblockareexpandedtogether,i.
e.
,consecutively.
Thisimproveslocalityofmemoryref-erencebecauseallnodesinthesamenblockhavethesameduplicate-detectionscope.
ButitleavesundeterminedtheSEARCH685orderinwhichtoconsidernblocks.
Ingeneral,iftheduplicate-detectionscopesoftwonblocksarealmostthesame,theyshouldbeexpandedclosetoeachother,inordertominimizethenumberofpossibleI/Ooperations.
Nblockscorrespondtoabstractnodes,andabstractnodesthatareneighborsintheabstractstate-spacegraphtendtohavesim-ilarduplicate-detectionscopes.
Therefore,wechoosetoex-pandnblocksinanorderthatreectsneighborrelationsintheabstractstate-spacespacegraph.
Asimplewaytodothisistoexpandthenblocksinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph.
Wheninternalmemoryisfull,thesearchalgorithmmustremovefrominternalmemoryoneormorenblocksthatdonotbelongtotheduplicate-detectionscopeofnodescurrentlybeingexpanded.
Immediatelybeforeexpandingnodesinadifferentnblock,itmustcheckifsomenblocksintheirduplicate-detectionscopearemissingfrominternalmemory,andifso,readthemfromexternalmemory.
Be-causereadingannblockintointernalmemoryoftenresultsinwritinganothernblocktoexternalmemory,werefertothesepairsofreadandwriteoperationsasnblockreplace-ments.
Exceptfornblocksthatarepartoftheduplicate-detectionscopeofnodesbeingexpanded,anynblockcanpotentiallybemovedfrominternalmemorytoexternalmemory.
De-cidingwhichnblockstoremoveisidenticaltothepage-replacementstrategyofvirtualmemory,exceptthatnowtheunitofreplacementisannblockinsteadofapage.
Thus,wecallitannblock-replacementstrategy.
Itiswell-knownthattheoptimalreplacementstrategyalwaysremovesthepage(ornblock)thatwillnotbeusedforthelongesttime(Be-lady1966).
Implementingsuchastrategyisgenerallyim-possible,asitrequiresfutureknowledgeoftheorderinwhichpages(nblocks)willbeneeded,andleast-recently-usedisoftenthebeststrategyinpractice(Sleator&Tar-jan1985).
However,itispossibletocomputeanoptimalnblock-replacementstrategyinourcase,sincetheorderinwhichnblocksareexpandedisgivenbythebreadth-rsttraversalorderofanabstractstate-spacegraph.
Thus,givenasetofnodestoexpand,structuredduplicatedetectionexpandstheminanordersuchthat(1)nodesinthesamenblockareexpandedtogether,(2)nblocksareconsid-eredinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph,and,(3)wheninternalmemorybecomesfull,selectionofwhichnblockstowritetodiskisbasedonapre-computedoptimalnblock-replacementstrat-egy,orelse,theleast-recentlyusedstrategy.
Ititstraightforwardtousestructuredduplicatedetectionwithbreadth-rstsearch,wherethesetofnodesbeingex-pandedisalayerofthebreadth-rstsearchgraph.
Itisalsopossibletouseitwithabest-rstsearchalgorithmsuchasA*.
Insolvingasearchproblemwithmanyties,theorderinwhichA*expandsnodeswiththesamef-valueisnon-deterministic,andstructuredduplicatedetectioncandeter-mineanorderthatminimizesI/Ocomplexity.
The(n21)sliding-tilepuzzleisanexampleofadomainwithmanyties.
Forsearchproblemsinwhichtherearenotmanyties,itispossibletousestructuredduplicatedetectionaspartofaslightly-modiedversionofA*,inwhichA*selectsasetofnodestoexpandcontainingnodeswithalmostthesamef-value–forexample,thebestknodesontheOpenlist,wherekisasuitablylargenumber.
Structuredduplicatede-tectioncandeterminetheorderinwhichtoexpandthissetofnodes.
Althoughitmayexpandnodesinanorderthatisnotstrictlybest-rst,improvedperformancefromstructureddu-plicatedetectionmayoutweighanincreaseinthenumberofnodesexpanded.
Thismodicationofthebest-rstexpan-sionorderofA*toimprovelocalityofmemoryreferenceswaspreviouslyproposedbyEdelkampandSchr¨odl(2000)asawayofimprovinguseofvirtualmemory.
I/OcomplexityI/Ocomplexityistheprimaryfactorthataffectsthespeedofexternal-memoryalgorithms.
TheI/OcomplexityofsearchinanimplicitgraphdependsentirelyontheI/Ocomplexityofduplicatedetection.
Theworst-caseI/OcomplexityofdelayedduplicatedetectionisO(n+mBlogMB(n+m)),wherenisthenumberofnodesinthestate-spacegraph,misthenumberofedges,Bisthesizeofadiskblock,Misthesizeofinternalmemory,andO(xBlogMBx)istheworst-caseI/Ocomplexityofexternallysortingxelements(Munagala&Ranade1999).
Forstructuredduplicatedetection,wehavethisresult.
Theorem2Ifeveryduplicate-detectionscopetsininter-nalmemory,theworst-caseI/Ocomplexityofstructureddu-plicatedetectionisO(nB·E).
Inthisresult,Eisthenumberofdirectededgesintheab-stractstate-spacegraph.
(Tokeeptheanalysissimple,weconsideranundirectededgeastworeciprocallydirectededges.
)ThetheoremfollowsfromthefactthatEistheworst-casenumberofnblockreplacementsthatmustbedoneintraversingtheabstractstate-spacegraph,andtheworst-casenumberofI/Ooperationsneededpernblockre-placementisthesizeofthelargestnblockdividedbythediskblocksize,wherenboundsthesizeofthelargestnblock.
Thefactor1Bisthesameinbothanalyses.
Sinceanab-stractstate-spacegraphistypicallyexponentiallysmallerthantheoriginalstate-spacegraph,thefactorEforstruc-turedduplicatedetectioniscomparabletologMB(n+m)fordelayedduplicateddetection.
ThedifferenceinI/Ocom-plexityisthedifferencebetweenthefactorn+mfordelayedduplicatedetection,andthefactornforstructuredduplicatedetection.
Thepresenceofn+minthecomplexityanalysisfordelayedduplicatedetection,incontrasttonforstruc-turedduplicatedetection,canbeinterpretedasapenaltyfordelayingduplicatedetection,sinceitboundsthecardinalityofthemulti-setofsuccessorsthatisgeneratedbyexpandingasetofnodesbeforeperformingduplicatedetection.
Inahighly-connectedgraph,mismuchlargerthann,andthepenaltyfordelayingduplicatedetectioncanbese-vere.
Itfollowsthatdelayedduplicatedetectionworksbestinsparsegraphs.
However,manyimportantsearchproblemsdonothavesparsegraphs.
Anexampleismultiplesequencealignment,amotivatingproblemforfrontiersearch(Korf&Zhang2000).
Anadvantageofstructuredduplicatede-tectionisthatitworksequallywellinsparseandhighly-connectedgraphs.
686SEARCH#SolIntMemExtMemExpSecs1766564,62816,389,872279,167,41180149591,382,50420,557,691345,487,8638985364481,53312,588,843224,545,8536415655228,33412,989,362208,969,4456655957414,77513,829,299228,945,35167160661,738,02256,436,901978,819,6463,06966611,355,26420,699,891368,181,73597382621,410,29246,329,201765,608,9892,38988651,808,59177,711,2351,360,544,0934,4569257371,77912,505,737213,445,215642Table1:Performanceonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle.
Columnsshowtheinstancenumber(#);solutionlength(Sol);peaknumberofnodesstoredininternalmemory(IntMem);peaknumberofnodesstoredinexternalmemory(ExtMem),numberofnodeexpansions(Exp),andrunningtimeinCPUseconds(Secs).
Unlikedelayedduplicatedetection,structuredduplicatedetectionhasaminimuminternal-memoryrequirement.
Theorem2holdsonlyifeveryduplicatedetectionscopetsininternalmemory.
So,whetheritholdsdependspartlyonthesizeofinternalmemory,andpartlyonthelocalityofthegraph,andhowwellitiscapturedinanabstractstate-spacegraph.
Ifthelargestduplicatedetectionscopedoesnottininternalmemory,itispossibletocombinestructureddupli-catedetectionwithdelayedduplicatedetection.
Givenasetofnodestoexpand,thesetcanbepartitionedbasedontheabstractstate-spacegraph,anddelayedduplicatedetectioncanbeperformedseparatelyonthenodesineachnblock,asawayofleveraginglocalstructuretoimproveperformance.
ComputationalresultsHowefcientlytheunderlyinggraph-searchalgorithmusesinternalmemoryhasaneffectondiskI/O,sinceitaf-fectshowmuchtotalmemoryisneeded.
Forthisreason,Korf(2003a;2003b)builthisexternal-memorygraphsearchalgorithmontopoffrontiersearch,averymemory-efcientgraph-searchalgorithm.
Inourexperiments,wealsocom-binestructuredduplicatedetectionwithmemory-efcientgraph-searchalgorithms.
Adifferenceisthatwecon-siderheuristicsearch,whereaspreviousworkonexternal-memorygraphsearchconsidersbreadth-rstsearchonly.
FortheresultsinTables1and2,weranthesearchalgo-rithmsinamodethatminimizesuseofinternalmemory.
Sothepeakamountofinternalmemoryusedcorrespondstotheminimuminternal-memoryrequirementsofthealgorithms,usingstructuredduplicatedetectionandagivenabstractionofthestatespace.
Ordinarily,thealgorithmswoulduseallavailableRAMbeforeusingdisk.
Thealgorithmswererunona2.
4GHzIntelPentiumwith2gigabytesofRAManda7200RPMSeagatediskwith120gigabytesofstorage.
Fifteen-puzzleTosolvetheFifteenpuzzle,weusestructuredduplicatedetectiontogetherwithabreadth-rstheuristicsearchal-gorithmthatusesdivide-and-conquersolutionreconstruc-tion,calledBreadth-FirstIterative-DeepeningA*(Zhou&Hansen2004).
Unlikebrute-forcebreadth-rstsearch,thisNameCostIntMemExtMemExpSecs1aboA8,4838K130K4,207K451amk33,96029K555K31,704K3731csy14,1606K52K1,943K251ezm40,73317K154K6,186K731gtr58,010188K7,468K1,094,936K22,3811idy7,88819K415K10,246K1101pfc15,71812K153K6,600K711tis37,58124K383K29,778K3571wit13,97953K1,513K80,591K1,192actin52,117102K2,783K240,168K3,972Table2:Performanceinaligninggroupsof5proteinsequencesfromreferenceset1ofBAliBASE(Thompson,Plewniak,&Poch1999).
Columnsshownameofinstance,costofoptimalalignment,peaknumberofnodesstoredininternalmemory(inthousands),peaknumberofnodesstoredinexternalmemory(inthousands),numberofnodeexpansions(inthousands),andCPUseconds.
algorithmusesupperandlowerboundstoprunenodesthatcannotbepartofanoptimalsolution.
Table1showshowthealgorithmperformsonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle(Korf1985).
Weusedastate-spaceprojectionfunctionthatgroupsto-getherstatesbasedonthepositionsoftiles15and8,plusthepositionofthe"blank,"creatinganabstractstate-spacegraphwith16·15·14=3360nodes.
Basedonthispartitionofthestatespace,structuredduplicatedetectionreducestheinternal-memoryrequirementofthesearchalgorithmbyafactorofbetween16and58times.
Inexchange,itincreasesrunningtimebyonlyabout24%insolvingtheseexam-ples.
Usingbothstructuredduplicatedetectionanddivide-and-conquersolutionreconstruction,all100instancesofthe15-puzzlearesolvedusingnomorethan42megabytesofRAM,withouteverre-expandinganodeinthesameitera-tion.
Basedonthenumberofdistinctnodesexpanded,A*wouldneed28gigabytesofRAMjusttostoretheClosedlistinsolvinginstance88.
MultiplesequencealignmentThemultiplesequencealignmentproblemisanexampleofasearchproblemthatcannotbesolvedefcientlyus-ingdelayedduplicatedetectionbecauseitssearchgraphisveryhighly-connected.
Thesearchgraphforthemulti-plesequencealignmentproblemisann-dimensionalhyper-lattice,wherenisthenumberofsequencesbeingaligned.
WeimplementedstructuredduplicatedetectionontopofasearchalgorithmcalledSweepA*,whichusesdivide-and-conquersolutionreconstructiontolimituseofmem-ory(Zhou&Hansen2003).
1Wetestedtheresultingal-gorithmonaselectionofdifcultmultiplesequencealign-mentproblemsfromreferenceset1ofBAliBASE,awidely-usedbenchmark(Thompson,Plewniak,&Poch1999).
All1SweepA*isaspecializedsearchalgorithmformultiplese-quencealignmentthatexpandsnodesonalayer-by-layerbasis,whichmakesitagoodtforstructuredduplicatedetection.
Itisalsoverymemory-efcient.
Forcomparison,frontiersearch(Korf&Zhang2000)cannotsolveinstances1gtr,1wit,oractininTa-ble2within2gigabytesofRAM,andusesanaverageof25timesmorememorythanSweepA*insolvingtheotherinstances.
SEARCH687problemsinvolvedaligning5proteinsequences.
OurcostfunctionwasaDayhoffsubstitutionmatrixwithlineargappenaltyof8,andweusedapairwisealignmentadmissibleheuristic.
Tocreateanabstractstate-spacegraph,weusedastate-spaceprojectionfunctionthatignoresallbut3se-quences.
ThiscreatesanabstractstatespacewithO(l3)ab-stractnodes,wherelistheaveragelengthofasequence,comparedtoO(l5)nodesintheoriginalstate-spacegraph.
Table2showsthatstructuredduplicatedetectionreducestheinternal-memoryrequirementsofSweepA*byafactorofbetween10and40times.
Usingdisk,thealgorithmneedsonly20megabytesofRAMtosolveallinstancesinTable2,includingthespaceforthepairwiseheuristic.
Interestingly,theexternal-memoryversionofSweepA*runsfasterthananinternal-memoryversionofSweepA*thatdoesnotusestructuredduplicatedetection,byanaverageof72%!
Thereasonisthatstructuredduplicatedetectionrunsfasterinin-ternalmemorythanun-structuredduplicatedetectionduetolocalityofmemoryreferences,andthisspeedupoutweighstheextratimefordiskI/O.
Thespeedupismoreevidentinmultiplesequencealignmentthanthesliding-tilepuz-zlebecausethemultiplesequencealignmentsearchgraphismuchmorehighly-connected,whichsignicantlyincreasesthenumberofduplicatesgeneratedpernodeexpansion.
Speedingupinternal-memorysearchWeemphasizethatforboththeFifteenpuzzleandmulti-plesequencealignment,structuredduplicatedetectionim-provestheperformanceofinternal-memorygraphsearch,inadditiontousingexternalmemoryefciently.
Thisisbe-causeeachtimethesearchalgorithmchecksforduplicates,itonlyneedstocheckasmallsubsetofthestorednodes.
FortheFifteenpuzzle,thisleadstoa16%speedupininternal-memoryperformance.
Formultiplesequencealignment,itcutstherunningtimeofinternal-memorysearchinhalf.
Again,thespeedupismuchgreaterformultiplesequencealignmentbecauseinaveryhighly-connectedgraph,manymoreduplicatesaregenerated.
Thisunderscoresabenetofstructuredduplicatedetection.
Unlikedelayedduplicatedetection,itismoreeffectiveinhighly-connectedgraphs,wheretheproblemofduplicatedetectionismorecrucial.
ConclusionWehaveintroducedstructuredduplicatedetection,anovelapproachtolocalizingmemoryreferencesinduplicatede-tection,andshowedthatitcanreducetheI/Ocomplexityofexternal-memorygraphsearch,aswellasincreasethespeedofduplicatedetectionininternal-memorysearch.
Structuredduplicatedetectionhassomeadvantagesoverdelayedduplicatedetection,theonlypreviousapproachtoexternal-memorygraphsearch.
Inparticular,itismoreef-fectiveinhighly-connectedgraphs,anditcanexploitlocalgraphstructurethatisnotconsideredbydelayedduplicatedetection.
ButstructuredduplicatedetectionanddelayedduplicatedetectioncanalsobeviewedascomplementaryapproachestoreducingI/Ocomplexity,andcanbeusedto-getherfordifcultsearchproblems.
Ifasearchalgorithmcanleverageenoughlocalityinastate-spacegraphsothateveryduplicate-detectionscopetsininternalmemory,itisunnecessarytoeverdelayduplicatedetection,andstruc-turedduplicatedetectioncanmanageexternalmemorybyitself.
Butifanyduplicate-detectionscopedoesnottinin-ternalmemory(perhapsduetolackofsufcientlocalstruc-tureinthegraph),structuredduplicatedetectionbyitselfisnotsufcient.
Sincedelayedduplicatedetectiondoesnothaveaminimummemoryrequirement,itcanbeusedinthiscase.
Structuredduplicatedetectioncanbeusedtogetherwithdelayedduplicatedetectiontoimproveitsperformance,byleveraginggraphlocalityasmuchaspossible.
AcknowledgmentsWethanktheanonymousreviewersforhelpfulcomments.
ThisworkwassupportedinpartbyNSFgrantIIS-9984952andNASAgrantNAG-2-1463.
ReferencesAggarwal,A.
,andVitter,J.
1988.
Theinput/outputcomplexityofsortingandrelatedproblems.
Comm.
oftheACM31(9):1116–27.
Belady,L.
1966.
Astudyofreplacementalgorithmsforvirtualstorage.
IBMSystemsJournal5:78–101.
Edelkamp,S.
,andSchr¨odl,S.
2000.
LocalizingA*.
InProc.
ofthe17thNationalConferenceonArticialIntelligence,885–890.
Holte,R.
;Mkadmi,T.
;Zimmer,R.
;andMacDonald,A.
1996.
Speedingupproblemsolvingbyabstraction:Agraph-orientedapproach.
ArticialIntelligence85(1–2):321–361.
Klein,D.
,andManning,C.
2003.
FactoredA*searchformodelsoversequencesandtrees.
InProceedingsofthe17thInternationalJointConferenceonArticialIntelligence,1246–1251.
Korf,R.
,andZhang,W.
2000.
Divide-and-conquerfrontiersearchappliedtooptimalsequencealignment.
InProceedingsofthe17thNationalConferenceonArticialIntelligence,910–916.
Korf,R.
1985.
Depth-rstiterativedeepening:Anoptimalad-missibletreesearch.
ArticialIntelligence27:97–109.
Korf,R.
2003a.
Breadth-rstfrontiersearchwithdelayeddupli-catedetection.
InProceedingsoftheWorkshoponModelCheck-ingandArticialIntelligenceatIJCAI-03,87–92.
Korf,R.
2003b.
Delayedduplicatedetection:Extendedabstract.
InProceedingsofthe17thInternationalJointConferenceonAr-ticialIntelligence,1539–1541.
Mehlhorn,K.
,andMeyer,U.
2002.
External-memorybreadth-rstsearchwithsublinearI/O.
InProceedingsofthe10thAnnualEuropeanSymposiumonAlgorithms,723–735.
Munagala,K.
,andRanade,A.
1999.
I/O-complexityofgraphalgorithms.
InProceedingsofthe10thSymposiumondiscretealgorithms,687–694.
ACM-SIAM.
Sleator,D.
,andTarjan,R.
1985.
Amortizedefciencyoflistupdateandpagingrules.
CommunicationsoftheACM28:202–8.
Thompson,J.
;Plewniak,F.
;andPoch,O.
1999.
BAliBASE:Abenchmarkalignmentdatabasefortheevaluationofmultiplealignmentprograms.
Bioinformatics15(1):87–88.
Zhou,R.
,andHansen,E.
2003.
SweepA*:Space-efcientheuristicsearchinpartiallyorderedgraphs.
InProc.
of15thIEEEInternationalConf.
onToolswithArticialIntelligence,427–434.
Zhou,R.
,andHansen,E.
2004.
Breadth-rstheuristicsearch.
InProceedingsofthe14thInternationalConferenceonAutomatedPlanningandScheduling.
688SEARCH

麻花云-香港CN2云服务器,安徽BGP线路,安徽移动大带宽!全系6折!

一、麻花云官网点击直达麻花云官方网站二、活动方案优惠码:专属优惠码:F1B07B 享受85折优惠。点击访问活动链接最新活动 :五一狂欢 惠战到底 香港云主机 1.9折起香港特价体验云主机CN2 云服务器最新上线KVM架构,,默认40G SSD,+10G自带一个IPv4,免费10Gbps防御,CPU内存带宽价格购买1核1G1M19元首月链接2核2G 2M92元/3个月链接2核4G3M112元/3个月...

无忧云(25元/月),国内BGP高防云服务器 2核2G5M

无忧云官网无忧云怎么样 无忧云服务器好不好 无忧云值不值得购买 无忧云,无忧云是一家成立于2017年的老牌商家旗下的服务器销售品牌,现由深圳市云上无忧网络科技有限公司运营,是正规持证IDC/ISP/IRCS商家,主要销售国内、中国香港、国外服务器产品,线路有腾讯云国外线路、自营香港CN2线路等,都是中国大陆直连线路,非常适合免北岸建站业务需求和各种负载较高的项目,同时国内服务器也有多个BGP以及高...

IMIDC彩虹数据:日本站群多ip服务器促销;30Mbps带宽直连不限流量,$88/月

imidc怎么样?imidc彩虹数据或彩虹网络现在促销旗下日本多IP站群独立服务器,原价159美元的机器现在只需要88美元,而且给13个独立IPv4,30Mbps直连带宽,不限制月流量!IMIDC又名为彩虹数据,rainbow cloud,香港本土运营商,全线产品都是商家自营的,自有IP网络资源等,提供的产品包括VPS主机、独立服务器、站群独立服务器等,数据中心区域包括香港、日本、台湾、美国和南非...

graphsearch为你推荐
中国微信5思科ipad支持ipad支持ipadCTios特斯拉苹果5平台操作使用手册tcpip上的netbiostcpip上的netbios是什么用的,有安全隐患吗?开启还是关上win7如何关闭445端口如何关闭WIN7自动配置 IPV4 地址 169.254360chrome使用360急速浏览器,360chrome进程结束不了
子域名查询 备案未注册域名 新加坡主机 directspace 息壤备案 网站保姆 2017年万圣节 私有云存储 dux admit的用法 什么是服务器托管 佛山高防服务器 最好的qq空间 hdd 湖南idc 免费个人主页 摩尔庄园注册 睿云 七十九刀 .htaccess 更多