namesgraph

graphsearch 时间:2021-02-11 阅读:()

Front.
Comput.
Sci.
DOIRESEARCHARTICLEBigGraphSearch:ChallengesandTechniquesShuaiMA,JiaLI,ChunmingHU,XuelianLIN,JinpengHUAIStateKeyLaboratoryofSoftwareDevelopmentEnvironmentSchoolofComputerScienceandEngineeringBeihangUniversity,Beijing100191,ChinacHigherEducationPressandSpringer-VerlagBerlinHeidelberg2012AbstractOnonehand,comparedwithtraditionalrelationalandXMLmodels,graphshavemoreexpressivepowerandarewidelyusedtoday.
Ontheotherhand,variousapplicationsofsocialcomputingtriggerthepressingneedofanewsearchparadigm.
Inthisarticle,wearguethatbiggraphsearchistheonellingthisgap.
Werstintroducetheapplicationofgraphsearchinvariousscenarios.
Wethenformalizethegraphsearchproblem,andgiveananalysisofgraphsearchfromanevolutionarypointofview,followedbytheevidencesfromboththeindustryandacademia.
Afterthat,weanalyzethedicultiesandchallengesofbiggraphsearch.
Finally,wepresentthreeclassesoftechniquestowardsbiggraphsearch:querytechniques,datatechniquesanddistributedcomputingtechniques.
Keywordsgraphsearch,bigdata,querytechniques,datatechniques,distributedcomputing1IntroductionWiththerapiddevelopmentofsocialcomputing,Internetandvariousapplicationshavebroughtaboutexponentiallygrowingdata.
AccordingtotherecentreportoftheUN'sInternationalTelecommunicationsUnion(ITU),Internetuserswillhit3billiongloballybytheendof2014[1];ThetotalnumberofmonthlyactiveFacebookusershasreachedover1.
3billion,andtheincrementofitsusersfrom2012to2013isabout22%[2].
Alltheseindicatethecomingofaneraofbigdata.
Indeed,"dataarebecomingthenewrawReceivedNovember,2014;acceptedJanuary,2015E-mail:hucm@buaa.
edu.
cnmaterialofbusiness:aneconomicinputalmostonaparwithcapitalandlabour.
"[3].
HowtolterunnecessarydataandndthedesiredinformationsothatonecouldeasilymaketimelyandaccuratedecisionsThishasbecomeoneofthemostpressingneedsinsuchabigdataera.
ComparedwithtraditionalrelationalandXMLmodels,graphshavemoreexpressivepower,andplayanimportantroleinmanyapplications,suchassocialnetworks,biologicaldataanalyses,recommendersystems,complexobjectidenticationandsoftwareplagiarismdetection.
Essentially,thisisbecausethecoredatainvolvedintheseapplicationscanbeconvenientlyrepresentedasgraphs.
Forinstance,asocialnetwork(e.
g.
,Facebook[4],Twitter[5]andWeibo[6])constitutesallkindsofsocialusers/activities,whichisessentiallyagraph,whosenodesdenoteusers/activitiesandedgesdenotetheirrelationships,suchasfriendships,respectively.
Thewideuseofgraphshasbroughtabouttheemergenceofbiggraphsearch,i.
e.
,retrievinginformationfrombiggraphsinatimelyandaccuratemanner,whichhasdrawnmoreandmoreattentionfromboththeindustryandacademia[7–9].
Werstgiveanoverviewoftheapplicationscenariosofgraphsearch.
(1)SocialnetworksandtheWeb.
Nowadays,therapiddevelopmentoftheWebandsocialnetworkshasmadesignicantinuencesonpeople'ssocialandpersonalbehaviors.
Takeforinstance,Facebook:(a)thetotalnumberofitsusersisverylarge:therearemorethan1.
3billionmonthlyactiveusersand0.
68billionmobileuserstillJune2014;(b)therelationsamongusersandotherobjectsaretight:auserhas130friendsandlikes80pagesonaverage;(c)thereisalargeamountofinformationdisseminationon2ShuaiMaetal.
BigGraphSearch:ChallengesandTechniquesFigure1ArecommendationnetworkFacebook:morethan4.
75billionpiecesofcontentareshareddaily;(d)thesitevisitofFacebookisquitefrequent:23%ofuserscheckFacebook5timesormoredaily,andauserspends20minutesonthesitepervisitonaverage[2].
Asmentionedearlier,socialnetworkscanbeeasilyrepresentedbygraphs,whichcomeswithallkindsofgraphsearchtechniques[10–13],includingneighborqueryandsocialnetworkcompression[14].
Similartosocialnetworks,theWebcanbeexpressedasabiggraphaswell,whosenodesdenoteWebpages,andedgesindicatehyperlinkrelationshipsbetweenWebpages.
Infact,theWebsiteclassicationandWebmirrordetectionproblemscanbetreatedasthegraphclassication[15]andgraphmatchingproblems[16],respectively.
(2)Recommendersystems.
Recommendationhasfounditsusageinmanyapplications,suchassocialmatchingsystems,andgraphsearchisausefultoolforrecommendation[17].
Considertheexamplethataheadhunterwantstondabiologist(Bio)tohelpagroupofsoftwareengineers(SEs)analyzegeneticdata[18,19].
Todothis,sheusesanexpertiserecommendationnetworkG,asdepictedinFig.
1,inwhichnodesdenotepersonslabeledwiththeirexpertise,andedgesindicaterecommendations,e.
g.
HR1recommendsBio1,andAI1recommendsDM1.
ThebiologistBioneededisspeciedwithapatterngraphQ,alsoshowninFig.
1.
WecouldndthatBiohastoberecommendedby:(a)anHR,anSEandadataminingexpert(DM)together,asdataminingknowledgeisrequiredforthejob,(b)theSEisalsorecommendedbytheHR,and(c)thereisanarticialintelligenceexpert(AI)whorecommendstheDMandisrecommendedbytheDM.
BasedonthepatterngraphQanddatagraphG,theheadhuntercouldndthesuitablebiologistinGwhomeetstherequirements,byutilizinggraphsearchtechniquesdevelopedin[18,19].
(3)Complexobjectidentication.
DataqualityproblemcostsU.
S.
businessmorethan$600billionayear[20],andFigure2Arouteplanningexampledatacleaningtechniquescanhelpmitigatethelossestoalargeextent,e.
g.
,itdeliversanoverallbusinessvalueofmorethan"600millionGBP"eachyearatBTbyadoptingdatacleaningtools[21].
Datacleaningtypicallycontainstwocentralissues:recordmatchinganddatarepairing[22].
Complexobjectidenticationisthemostdicultissueinrecordmatching,whichistoidentifycomplexobjectsreferringtothesameentityinaphysicalworld.
Onepossiblesolutionistorepresentcomplexobjectsasgraphs,andthentoidentifythesameonesbyutilizinggraphsearchtechniques,suchassubgraphisomorphismandgraphhomomorphism[16,23].
(4)Softwareplagiarismdetection.
Withthepopularityofopen-sourcesoftware,itgetsmucheasierforalessself-disciplineddevelopertouse(partof)othersoftwarewithoutgivingpropercredits.
Traditionalplagiarismdetectiontoolsarenotadequateforndingserioussoftwareplagiarismcases.
Anovelplagiarismdetectiontoolhasbeendevelopedbasedongraphsearchtechniques[24].
Firstly,ittransformsthesourceandtargetprogramsintoprogramdependencegraphs[25].
Secondly,itteststhesimilarityofthetwoprogramdependencegraphswithsubgraphisomorphism[23].
Finally,ifthegraphsimilarityishighenough,itconcludestheplagiarism.
Therationalbehindthisisthatthecorecontrolowofprograms,reectedbytheirprogramdependencegraphs,arehardlytobemodied.
(5)Tracrouteplanning.
Graphsearchisacommonpracticeintransportationnetworks,duetothewideapplicationoflocation-basedservices.
Consideranexampletakenfrom[26].
MarkisadriverintheU.
S.
whowantstotravelfromIrvinetoRiversideinCalifornia.
(a)IfMarkwantstoreachRiversidebyhiscarintheshortesttime,thiscanbetreatedastheclassicalshortestpathproblem[27],Front.
Comput.
Sci.
3Figure3TheevolutionofinformationsearchbasedonwhichMarkcangureouthisbestsolutionfromIrvinetoRiversideisbytravelingalongStateRoute261,asillustratedbyFig.
2.
(b)However,ifMarkdrivesatruckcarryingwithhazardousmaterials,whichmaynotbeallowedtocrossoversomebridgesorrailroadcrossings,thenapatterngraphapproachspecifyingrouteconstraintswithregularexpressionsmaybeneededtondanoptimaltransportroute[28].
Inaddition,graphsearchtechniqueshavealsobeenadoptedinvirtualnetworks[29],patternrecognition[30]andVLSIdesign[31],amongotherthings.
Organization.
Intherestofthisarticle,werstgiveaformaldenitionofgraphsearchandexplainwhyitisimportantinSection2.
ThenweintroducethechallengesofbiggraphsearchinSection3,followedbytechniquestowardsbiggraphsearchinSection4.
Finally,weconcludeinSection5.
2GraphSearch,WhyBotherInthissection,werstformalizetheconceptofgraphsearch.
Thenwegiveananalysisofgraphsearchfromanevolutionarypointofviewandpointoutitsurgentneed,followedbytheevidencesfromtheindustryandacademia.
2.
1WhatisGraphSearchWerstformalizetheconceptofgraphsearch.
Graphsearch.
GiventwographsGp,alsoreferredtoasthepatterngraphandGd,alsoreferredtoasthedatagraph,graphsearchis(1)todecidewhetherGp"matches"Gd,or(2)toidentifythesubgraphsofGdthatGp"matches".
Heregraphsconsistofnodesandedges,bothofwhichareoftenattachedwithlabelsindicatingallkindsofinformation.
Patterngraphsareusuallysmall,e.
g.
,withseveralordozensofnodes/edges,whiledatagraphsareoftenbig,e.
g.
,withbillionsofnodes/edges.
Graphsearchcoverstwoclassesofqueries:(1)therstclassisbooleanqueries,i.
e.
,toanswer"yes"or"no",and(2)thesecondoneisfunctionalqueries,i.
e.
,toidentifyandreturnthematchingsubgraphs.
Itisobviousthatfunctionalqueriesmayneedtheaidofbooleanqueries.
Remarks.
Theabovedenitionofgraphsearchisquitegeneral,asdierentsemanticsof"match"leadtodierentgraphsearchqueries[7,8].
Most,ifnotall,commongraphqueriesbelongtographsearchqueries,suchasnodequeries(e.
g.
,neighborquery[14]),pathqueries(e.
g.
,reachability[32]andshortestpath[27])andsubgraphqueries(e.
g.
,graphhomomorphism[16],subgraphisomorphism[23],graphsimulation[11]anditsextensionstrongsimulation[18]).
2.
2AnEvolutionaryPointofViewAseriousquestionarisesnaturally:whydoweneedanothersearchparadigm–graphsearchWenextanswerthisquestionfromanevolutionarypointofview.
ConsidertheevolutionroadmapofinformationsearchshowninFig.
3.
Theemphasisofinformationsearchhasundergoneaseriousshift,i.
e.
,fromlesystems,todatabasesystems,totheWorldWideWeb,andtothemostrecentsocialnetworks:Filesystems.
Sincethe1960s,computershavebeenequippedwithmodernoperatingsystems[33].
Thelesysteminanoperatingsystemisanabstractiontostoreandorganizeasetofcomputerles,anditusuallysupportsuserstolookforspecicles,i.
e.
,simplesearchingfunctionalities.
Databasesystems.
Inthemid-1960s,databasesystemsbeganbeingappliedinbusiness,and,subsequently,relationaldatabasesplayedadominantrole.
Sincethelate1970s,theinventionofstructuredquerylanguage(SQL)hassignicantlypromotedtheuseofdatabases[34].
TheWeb.
Inthe1990s,searchengines,suchasGoogle,BingandYahoo!
,arewidelyusedduetotheblossomoftheWorldWideWeb.
Thesesearchenginesunanimouslyadoptedthesimplebutveryusefulapproach–keywordsearch,whichprovidespeoplewithaconvenientandeasywaytosearchspecicinformationontheWeb.
4ShuaiMaetal.
BigGraphSearch:ChallengesandTechniquesFigure4RelationalrepresentationSocialnetworks.
Fromtheendoflastcentury,withtherisingofWeb2.
0,socialnetworkshavemadesignicantinuencesonthesociety.
However,adominantsearchparadigmseemsmissinginsuchaneraofsocialcomputingandbigdata.
Astheaboveanalysisshows,animportantITinvention,e.
g.
,lesystems,databasesystemsandtheWeb,usuallytriggerstheemergenceofanovelsearchparadigm.
Weareessentiallyinasituationtolookforoneforsocialcomputingandsocialnetworks,andwebelievethatgraphsearchistheonellingthegap.
The"graphsearch"[35]and"knowledgegraph"[36]releasedbyFacebookandGoogle,respectively,shedlightonthis.
However,anotherquestionarises:whycouldnotwesimplyuseSQLorkeywordbasedsearch(1)Graphsearchvs.
SQLsearch.
SQLsearchisaverystrongsupportingtoolforsearchinginformationfromrelationaldatabasesystems.
However,itisnotappropriateforsearchinginformationfromgraphseventhoughgraphscouldbestoredusingrelations,duetoitsdisabilityandinconvenienceforansweringrecursivequeriessuchasgraphreachabilityandshortestpaths[37].
Indeed,forsimplegraphqueriesthatSQLsearchwoulddo,graphsearchcoulddoevenbetter.
Wenextillustratethiswithanexampletakenfrom[38],asimplesearchingcaseof"ndingthenamesofallofAlbertoPepe'sfriendsinasocialnetwork".
Case1:Socialnetworksarestoredusingrelations.
Therearetworelations:person(identier,name)forstoringaperson'suniedidentieranditsname,andfriend(person_a,person_b)forstoringthefriendshipofpersonswithidentiersperson_aandperson_b.
Inaddition,twoB+–treeindexesarebuiltoneachcolumnofthepersonrelation:theperson.
identierandperson.
nameindexes,andoneindexisbuiltontheperson_acolumnofthefriendrelation:thefriend.
person_aindex.
Weassumethatthereareintotalnpersonsandmfriendships.
TherelationalrepresentationispresentedinFig.
4.
TogetthenamesofAlbertoPepe'sfriends,threestepsareFigure5Graphrepresentationnecessary,asshowninthefollowing.
(a)Findtheuniqueidentiferof"AlbertoPepe"fromrelationperson,whichtakesO(log2n)timeusingtheperson.
nameindex.
(b)Findallthekidentiersofthefriendsof"AlbertoPepe"fromrelationfriendwiththeidentiferfoundin(a),whichtakesO(log2n+k)timeusingthefriend.
person_aindex.
(c)Findthekfriends'namesfromrelationpersonwiththekidentiersfoundin(b),whichtakesO(klog2n)timeusingtheperson.
identierindex.
Case2:Socialnetworksarestoredusingnaivegraphs.
ThepersonandfriendshipinformationcanbestoredasagraphasshowninFig.
5.
Eachpersoncanberepresentedasanodelabeledwiththeperson'snameanduniedidentier,andthefriendshipbetweentwopersonscanberepresentedasanedgebetweenthetwocorrespondingnodes.
AB+–treeindexisbuiltonthegraph,vertex.
nameindex,toquicklylocatethepositionofanodeinthegraphwithaperson'sname.
TogetthenamesofAlbertoPepe'sfriends,twostepsareneeded,asshownbelow.
(a)Identifythenodewithname"AlbertoPepe",whichtakesO(log2n)timeusingthevertex.
nameindex.
(b)Findthekfriendnodesofthenodefoundin(a)bytraversingitsadjacentneighboringnodesandgetthefriendnamesdirectlyintheknodelabels,whichtakesO(k+y)timesuchthatk+yisthetotalnumberoftheneighboringnodes.
ItisobviousthatthesearchingspeedisimprovedfromO((k+2)log2n)toO(log2n)whenusingthegraphrepresentation,insteadoftherelationalrepresentation.
Theimprovementiscrucialwhennisreallylarge,e.
g.
,whentherearebillionsofusers.
Ofcourse,onecouldaddredundantinformationtospeedupitseciency,whichresultsinextraspacecostinturn.
Hence,forbiggraphsearch,thegraphsearchapproachismuchsuperiortotheSQLsearchapproach.
Front.
Comput.
Sci.
5Figure6Statisticsofpapersongraphs(2)Graphsearchvs.
keywordsearch.
ThetraditionalkeywordbasedsearchingapproachismainlyforretrievinginformationfromtheWeb,whichisnotappropriateforsearchinginformationfromsocialnetworks.
TheinformationontheWebisusuallyisolatedandobject–objectweaktiedfromeachother,andmainlyabout"historicalandexisting"information,i.
e.
,whathappenedandhappening.
Socialcomputinggenerallytakesthesocialfactorsintoconsideration,suchasthesocialstructure,organizationandactivity,whichmakesrelationsadominantroleinsocialsearch.
Besides,socialdataareusuallyperson–personstrongrelatedorperson–objectstrongrelated.
Thismakesthefutureandrelationinformationparticularlyimportantforsocialsearch.
Underthesecircumstances,thekeywordbasedsearchingapproachescannotmeettherequirementsraisedbysocialcomputingandsocialnetworksnowadays.
Hencewearguethatgraphsearchisanewsearchingparadigmforsocialcomputinginthebigdataera.
Indeed,Facebookhasprovidedanewsearchingtechniquenamed"GraphSearch"[35],whichallowsuserstosearchforinformationusingsimplenaturallanguagesentences,e.
g.
,"RestaurantsinNewYorkthatmyfriendslike","PhotostakeninHawaiiofmyfriends"and"Nationalparkswheremyfriendshavebeento".
Besides,thedevelopmentofsocialnetworkshasalsopromotedtheurgentneedofanewsearchengineinturn.
2.
3JointEortsoftheIndustryandAcademiaRecently,wehaveconductedasurveyonthenumberofpapersongraphspublishedinthetopthreeinuentialdatabasesystemconferences(SIGMOD,VLDBandICDE)eversince2000.
TheresultisshowninFig.
6,fromwhichwehavefoundthat:(a)fromaround2000(theemergenceofWeb2.
0),researchersbegantofocusonthestudyofgraphs,(b)thenumberofpapersongraphshasbeenincreasingcontinuouslysincethen,(c)from2008,graphshavebeenaFigure7TheFAErulehottopicintheeldofdatabaseresearch,and(d)thereisaburstofthenumberofpapersongraphsin2014.
Manywell-knownresearchinstitutionsandcompanieshavebeenconcentratingontheresearchandapplicationsofgraphs.
Forexample,Microsoft'sTrinity[39]projectand"Horton-QueryingLargeDistributedGraphs"[40]projectfordatacenter;large-scalegraphprocessingsystemPregel[41]ofGoogle;"KnowledgeAcquisitionandManagement"[42]projectofYahoo!
;Neo4j'sopen-sourcegraphdatabase[43];"GraphSearch"ofFacebook[35];andtheresearchteamsfromacademiasuchastheUniversityofCaliforniaSantaBarbara,UniversityofEdinburgh,UniversityofNewSouthWales,ChineseUniversityofHongKong,andBeihangUniversity.
Thejointinterestsandeortsfromboththeindustryandacademiaprovidemoreevidencesonthepowerandimportanceofgraphsearch.
3ChallengesofBigGraphSearchInthissection,werstintroducetheFAErulethatisimpor-tantforasearchengine,andwethenpointoutitsdicultiesandchallengesforbiggraphsearch.
3.
1TheFAERuleTheFAErulesaysthatthequalityofsearchenginesinvolveswiththreekeyfactors:friendliness,accuracyandeciency,asillustratedinFig.
7,andthatagoodsearchenginemustprovidetheuserswithafriendlyqueryinterfaceandhighlyaccurateanswersinafastway.
(1)Friendliness.
Itisnecessaryforasearchenginetoprovidetheuserswithafriendlyqueryinterfacesuchthattheuserscouldconvenientlyspecifytheirsearchingconditionswithsmalleorts.
Generallyspeaking,thekeywordsearchontheWebonlyrequiresuserstoenterseveralkeywords,whichisveryuser-friendly.
However,itcannotallowuserstospecify6ShuaiMaetal.
BigGraphSearch:ChallengesandTechniquescomplexsearchconditionslikegraphs(suchasrelationshipsamongkeywords),anditonlyreturnstheWebhyperlinksthatmightcontainanswerstousers.
Hence,thissimplenessalsobringsthegapbetweenwhattheuserswantandwhattheusersget.
Incontrast,theresultsofgraphsearcharemuchmoreaccurateasitallowsuserstofurtherspecifystructuralconstraintsbydesigningvariouspatterngraphs.
However,itisdenitelyinconvenientforuserstoenterpatterngraphsasinputsevenforsmallpatterngraphs,asitishardfornon-professionaluserswhoarenotfamiliarwiththecomplexdatagraphstospecifyprecisepatterngraphs.
Peoplearealreadymakinganeortfordesigningfriendlygraphsearchinterfaces.
ThetechniquedevelopedbyFacebookallowsuserstospecifypatterngraphswithsimplenaturallanguagesentences,aswementionedearlier.
AndYangetal.
[44]haverecentlyproposedanovelgraphsearchsystemenablingschemalessandstructurelessgraphquerying,which(a)providesauser-friendlyinterfacewhereuserscangiveroughdescriptivepatterngraphsasqueries,and(b)supportsvariouskindsoftransformationssuchassynonym,abbreviation,andontology.
However,acompletelyfriendlyinterfacemeetingtherequirementsofpracticalapplicationsisstillonitsroadforbiggraphsearch.
(2)Accuracy.
Itisnecessaryforasearchenginetoprovidetheuserswithaccurateanswers.
Whenausersubmitsaquerytoasearchengine,whichrepresentstheuser'ssearchinggoal,thesearchengineanalyzestheuser'sinputandtriestounderstandwhattheuserwants.
Hence,toreachhighsearchingaccuracy,itisindispensabletounderstandtheusers'realintentsforsearchengines.
However,itisprettycommonthatthereisagapbetweenwhatauserwantsandwhatshe/hegetsbackfromasearchengine.
Thisisbecauseitisaverychallengingtasktounderstandandspecifytheusers'intentsinawaysuchthatamachinecouldeasilyunderstand.
Forexample,whenausersubmits"apple"toasearchengine,itishardtodistinguishthefruitapplefromtheproductsofAppleInc.
.
Commonapproaches[45,46]focusonqueryclassication.
Givenaquery,theseapproachestrytoclassifythequerytosomepredenedclasses.
Recently,someresearcherstakeintoaccountofthedierenceofindividualsandattempttoanalyzetheintentsofusersbyincorporatingtheirsearchbehaviorsandpreferences[47,48].
Knowledgealsoplaysanimportantroletounderstandtheuserintentandtoimprovethesearchingaccuracy.
Forexample,knowledgegraphmakesGooglesearchenginemoreintelligenttounderstandthesearchingintentsofusers.
Whenhavingthekeyword"apple"intoGooglesearchengine,itwillprovidetwoextrapanelsinadditiontoalistofWebhyperlinks,oneforAppleInc.
andtheotherfortheapplefruit.
Thenuserscanclickonetoenlargeandgetdetailedinformationbasedontheirintents,whichallowsuserstogetmorerelevantresultswithouthavingtovisitotherWebsitestojudgewhethertheinformationarerelevantbythemselves.
ThisisbecauseGooglenowisabletounderstandthedierenceamongtheseentities,andthenuanceintheirmeanings,withtheaidofKnowledgeGraph[49].
(3)Eciency.
Howtosearchinformationinafastwayisakeyforthesuccessofasearchengine.
Itisalsoafundamentalproblemindatabaseandinformationretrievalareas,especiallywhenwearedealingwithbiggraphstoday.
WewillintroduceseveralsearchingtechniquesforbiggraphsindetailinthecomingSection4.
3.
2TheChallengesTheexpressivenessofgraphsnaturallycomeswithmorediculties,andtheemergingsocialapplicationsraisemorechallengestosearchandmanagebiggraphs.
Accordingtostatistics,forFacebook,thereareover1.
3billionmonthlyactiveusers;forevery20minutes,thereare1millionlinksshared,2millionfriendrequestsgenerated,and3millionmessagessent[2];similarlyforTwitter,thereareover0.
6billionusers;everysecondthereare9100tweetshappened;andpeoplequerytwittersearchengine2.
1billiontimeseveryday[50].
Thesestatisticsindicatethefollowing.
(a)Graphdatahavereachedhundredmillionsordersofmagnitude[51];(b)Graphdataareupdatedallthetime,andtheupdateamountdailyreacheshundredthousandsordersofmagnitude[52];(c)Similartotraditionalrelationaldata[53,54],graphdatahavethedatauncertaintyproblemduetotheexternalreasoncausedbydatasamplinganddatamissingandtheinternalreasoncausedbythedynamicchangesingraphdata;And,evenworse,(d)graphdataaremuchmorecomplexthanrelationalandXMLdata.
Insummary,graphdatahavefourkeyfeatures:big,dynamic,uncertainandcomplex[8].
Therstfeaturerequiresthatgraphsearchneedstostrikeabalancebetweenitstimeandspacecost.
Whenthegraphistoolargetobeprocessedonsinglemachines,itisalsonecessarytodesignecientandeectivedistributedalgorithms.
Thesecondfeaturerequiresthatgraphsearchshouldtakedynamicchangesandtemporalfactorsintoconsideration.
ThelasttwofeaturesrequirethatgraphFront.
Comput.
Sci.
7searchshoulddesignreasonablemodelstocaptureuncertaintiesingraphdata,andhighlyecientalgorithmstoanswergraphsearchqueriesonuncertaingraphs.
Thesetogethermakeitanextremelychallengingtasktodevelopabiggraphsearchenginewithafriendlyqueryinterface,accurateanswersandhigheciency.
4TechniquesTowardsBigGraphSearchAfundamentalissueinthebigdataeraistheeciency.
Inthissection,wepresentthreeclassesoftechniquesforbiggraphsearch:querytechniques,datatechniquesanddistributedcomputingtechniques.
4.
1QueryTechniquesWerstintroducetwoquerytechniques:queryapproximationandincrementalcomputation.
(1)Queryapproximation.
ThecoreideaofqueryapproximationistotransformaclassofqueriesQwithhighercomputationalcomplexityintoanotherclassofqueriesQ′withlowercomputationalcomplexityandsatisableapproximateanswers,asdepictedinFig.
8inwhichQ,Q′andDdenotetheoriginalquery,approximatequeryanddata,respectively.
Themajorchallengecomesfromtheneedofabalancebetweenthequeryeciencyandansweraccuracy.
Figure8QueryapproximationWenextexplainthequeryapproximationtechniqueusingstrongsimulation,anewgraphpatternmatchingmodelproposedin[18,19].
Graphpatternmatchingistondallmatchedsubgraphsinadatagraphforagivenpatterngraph,anditisoftendenedintermsofsubgraphisomorphism.
Thegoodnessofsubgraphisomorphismisthatallmatchedsubgraphsareexactlythesameasthepatterngraph,i.
e.
,completelypreservingthetopologystructurebetweenthepatterngraphanddatagraph.
Subgraphisomorphismis,however,np-complete[23],andmayreturnexponentialmanymatchedsubgraphs.
Recentevidenceshaveshownthatsubgraphisomorphismistoorestrictivetondsensiblematchesincertainscenarios[11].
Thesehindertheusabilityofgraphpatternmatchinginemergingapplications.
Tolowerthehighcomplexityofsubgraphisomorphism,variousextensionsofgraphsimulation[55]havebeenconsideredinsteadin[11,32].
Theseextensionsallowgraphpatternmatchingtobeconductedincubic-time.
However,theyfallshortofcapturingthetopologyofdatagraphs,i.
e.
,graphsmayhaveastructuredrasticallydierentfrompatterngraphstheymatch,andthematchesfoundareoftentoolargetoanalyze.
Torectifytheseproblems,strongsimulation,arevisionofgraphsimulation,wasproposedforgraphpatternmatching,suchthatstrongsimulation(a)preservesthetopologyofpatterngraphsandndsaboundednumberofmatches,(b)retainsthesamecomplexityasearlierextensionsofgraphsimulation[11,32],byprovidingacubic-timealgorithmforcomputingstrongsimulation,and(c)hasthelocalitypropertythatallowsustodevelopaneectivedistributedalgorithmtoconductgraphpatternmatchingondistributedgraphs[18,19].
(2)Incrementalcomputation.
Whentherearedataupdates,queryanswerstypicallyneedtobere-computedtoreectthechanges.
Inpractice,bigdatagraphsarefrequentlymodied,aswepointedoutinSection2,anditistoocostlytorecomputematchesfromscratcheverytimewhenthedatagraphsareupdated.
Incrementalcomputationisatechniquethatattemptstoreducetimebyreusingpreviouscomputingeortsandonlycomputingthoseanswersthat"dependon"thechangeddata,anditisdepictedinFig.
9,inwhichQ,Danddenotethequery,originaldataanditsupdates,respectively.
Figure9IncrementalcomputationItisworthmentioningthatincrementalalgorithmshavebeendevelopedforvariousapplications(see[56]forasurvey).
ThomasW.
Repshasdonepioneeringworkonthestudyofincrementalcomputation[56,57],andheobservedthatthecomplexityofincrementalalgorithmswasmoreaccuratelycharacterizedintermsofthesizeoftheareaaectedbytheupdates,ratherthanthesizeoftheentireinput[57].
Nextlet'staketheindexingofGooglesearchasanexample.
ItisknownthattheWebdocumentsarecrawledandstoredinalargerepository,andarepre-indexedtospeedupthesearcheciencyandimprovetheuserexperiences.
Theindexingprocessincursaheavyworkload,andGoogle8ShuaiMaetal.
BigGraphSearch:ChallengesandTechniquesinitiallyadoptedsomebatch-processingapproachessuchasMapReduce[58]toimprovetheeciency,whichisnotsatisfactorywhenfacingwithconstantchanges.
GooglelaterondevelopedPercolator[59],asystemincrementallyprocessingupdatesonlargedatasets.
Thatis,Googlehasconverteditsbatch-basedindexingsystemintoanincrementalindexingsystem.
ItwasreportedthatcomparedwithMapReduce,Percolator(a)reducedtheaveragedocumentprocessinglatencybyafactorof100,and(b)reducedtheaverageageofresultingdocumentsofGooglesearchby50%whenprocessingthesameamountofdocumentsperday[59].
4.
2DataTechniquesOnekeyfeatureofbigdatagraphsisthelargevolume,and,hence,thespacecomplexity[60]ofgraphsearchstartsraisingmoretroubles.
Hereweintroducevetechniquestoboostthesearcheciencyfromthedatapointofview:dataapproximation,datasampling,datapartitioning,datacompressionanddataindexing.
(1)Dataapproximation.
ThecoreideaofdataapproximationisthatgivenaclassofqueriesQandadatasetD,ittransformsDintoasmallerdatasetD′suchthatQonD′returnsasatisableapproximateanswerinamoreecientway,asdepictedinFig.
10.
Similartoqueryapproximation,themajorchallengeofdataapproximationcomesfromtheneedofabalancebetweenthequeryeciencyandansweraccuracy.
Figure10DataapproximationWehaveadoptedtheideaintheprocessofdealingwithlargegraphsinthestudyofanomalydetectioningraphstreams,whendealingwiththematrixrepresentationofasocialgraph,andwehaveboththeoreticallyandexperimentallyshownthatsimplifyingthematrixbyreplacingapartofsmallentryvalueswithzerohasfewaectsonthecomputationofeigenvectors[61].
(2)Datasampling.
Samplingisconcernedwiththeselectionofasubsetofdatafromalargedataset.
InsteadofdealingwiththeentiredatasetDforaqueryQ,thedatasamplingtechniquereducesthesizeofthedatasetDbysampling,withapermissionoflossofaccuracytosomeextentinthequeryresult[62].
Inasamplingprocess,itmustbeensuredthatthesampleddataobtainedmustreectthecharacteristicsandinformationoftheoriginaldataD,asdepictedinFig.
11.
Figure11DatasamplingItisworthmentioningthatMichaelI.
Jordanandhiscol-leagueshaveproposedanewsamplingapproach–bootstrap–todealingwithbigdata[63,64].
(3)Datapartitioning.
Datapartitioningisaneectivemethodtoexecutequeriesonlarge-scaledatasetsinadivide-and-conquerway.
ItpartitionsadatasetDintoasetofrelativelysmalldatasetsD1,DnsuchthatD=D1Dn.
Ideally,thenalqueryanswerisassembledusingthenanswersonthesetofsmalldatasets,andtheanalysisspeedcanbeimprovedsignicantly.
TheentireprocessisdepictedinFig.
12.
Figure12DatapartitioningItisworthmentioningthatgraphpartitioninghasbeenextensivelystudiedsincethe1970's[65–67],andhasbeensuccessfullyusedinvariousapplications,e.
g.
,circuitplacement,parallelcomputingandscienticsimulation[67].
Thegraphpartitioningproblemisingeneralahardproblemandisoftennp-complete[66].
(4)Datacompression.
Theprincipleofdatacompressionisthatcompressingbyremovingredundanciesalsoanswersthesamequestion.
Therearemanyknowndatacompressionmethodsthataresuitablefordierenttypesofdata,andproducedierentanswers,buttheyareallbasedontheprinciple,namelycompressingdatabyremovingredundanciesfromtheoriginaldata(see[68]foracompletereference).
Thebenetsofdatacompressionlieinthatitprovidesmorepossibilitiestoworkinmainmemoryandpotentialstoworkeciently.
Figure13DatacompressionDierentfromdatasampling,datacompressiongeneratesasmalldatasetD′fromtheoriginaldatasetDbyremovingredundanciesandpreservingtheinformationonlyrelevanttoqueries,asdepictedinFig.
13.
Inaddition,thereareusuallynorestrictionsontheformatsofthecompresseddata,whileFront.
Comput.
Sci.
9datasamplingnormallykeepstheoriginaldataformats.
Thereisawholebunchofworkon(lossyorlossless)graphcompression[14,69–71].
As[72–74]show,somegraphalgorithmscanbespeededupbyoperatingoncompressedgraphsdirectly,whichcanbetreatedasqueryorientedcompression,andneedstoinvestmoreeortstostudy.
(5)Dataindexing.
Anindexisadatastructurethatimprovesthespeedofqueriesbyreducingsearchspace,atthecostofupdatemaintenanceandextrastorage.
Indexesarecommonlyusedforqueryingrelationaldatabases[34]andinformationretrievalofsearchengines[75].
Whendatagraphsarerelativelylarge,graphindexingtechniquecanquicklyprunedatagraphsthatobviouslymismatchthepatterngraph[76].
Therealreadyexistindexingmethodsfor(variouskindsof)graphpatternmatching[62].
Therearemainlythreemetricsformeasuringwhetheranestablishedindexisappropriate:thespacecost,buildingtimeandquerytime.
Thesmallerthespaceofanindexis,thelessadditionalstorageburdenincurred.
Thebuildingtimerepresentsthetimecostofcreatingtheindex,andthequerytimeindicatesthetimecostforthequeryprocess.
Whendatagraphsarechangedovertime,theindexrefreshspeedreectsitsadaptivenesstodynamicchanges.
4.
3BeyondQueryandDataTechniquesWenallyintroducethedistributedcomputingtechnique,asanexamplethatutilizestheabovequeryanddatatechniquesandbeyond.
Distributedcomputingreferstotheuseofdistributedsystemstosolveproblemssuchthataproblemisdividedintomanytasks,eachofwhichiscomputedononeormoremachines,andwhichcommunicatewitheachotherbymessagepassing[77,78].
DistributedcomputingtypicallyneedstopartitionadatasetDintorelativelysmalldatasetsD1,Dn,anddistributesthemonmultiplecomputingmachines,asdepictedinFig.
14.
Figure14DistributedcomputingItisknownthatreal-lifegraphsaretypicallywaytoolarge,e.
g.
,theWebgraphofYahoo!
hasabout14billionnodes,andthereareover1.
3billionusersonFacebook.
Hence,itisnotpracticaltohandlelargegraphsonsinglemachines.
Moreover,real-lifegraphsarenaturallydistributed,e.
g.
,Google,Yahoo!
andFacebookhavelarge-scaledistributeddatacenters.
Thissaysthatdistributedcomputingisinevitablefacingwithbiggraphs.
Wehavedevelopedacomputationmodelforalargeclassofdistributedalgorithmsforgraphsimulation[79].
Themodelconsistsofaclusterofidenticalmachines,inwhichoneactsasacoordinator.
Eachmachinecandirectlysendanarbitrarynumberofmessagestoanother,andallmachinesco-workwitheachotherbylocalcomputationsandmessage-passing.
Further,wealsoidentifythreecomplexitymeasuresontheperformanceofdistributedalgorithmsrelatedtothecomputationmodelabove:(a)visittimes,whichisthemaximumvisitingtimesofamachine,indicatesthecomplexityofinteractions;(b)makespan,whichistheevaluationofthetotalcomputationtime,isameasureofeciency;(c)datashipment,whichisthesizeofthetotalmessagesshippedamongdistinctmachinesduringthecomputation,indicatesthenetworkbandwidthconsumption.
However,thesethreemeasuresaretypicallycontroversialwitheachother,andhowtoachieveabalancedstrategyisagreatchallengefordesigningdistributedalgorithms.
Recently,manydistributedgraphprocessingsystemshavebeendeveloped,whichbasicallyfallintotwocategories:onemakesuseofMapReduce[58]orSpark[80]tospeed-upbiggraphprocessing[81–83],andtheotherusesdierentdistributedcomputingmodels,suchasPregel[41],GraphLab[84]andPowerGraph[85].
Remarks.
Thereexistnosingletechniquesthatcouldtallforbiggraphsearch.
Thatis,itisoftennecessarytocombinedierenttechniquestoobtainsatisablesolutions.
Wealsoencourageinterestedreaderstoreadaveryrecentarticle[86]fordiscussionsonthetheoryandtechniquesofbigdata,acomplementofthisarticle.
5ConclusionsInthisarticlewehaveinvestigatedbiggraphsearch,anovelpromisingsearchparadigmforsocialcomputinginthebigdataera.
First,wehaveanalyzedtheneedofbiggraphsearchwithvariousapplications,industrialandacademicdevelopments,andtheevolutionhistoryofinformationsearchingparadigms.
Second,wehavepointedoutthechallengesandopportunitiesofbiggraphsearch.
Finally,wehaveintroducedthreetypesoftechniquestowardsbiggraphsearch:querytechniques,datatechniquesanddistributedcomputingtechniques.
10ShuaiMaetal.
BigGraphSearch:ChallengesandTechniquesBeinganewparadigmforsocialcomputing,biggraphsearchhasreceivedextensiveattentions.
However,thereisobviouslyalongwaytogoforabiggraphsearchenginethatmeetsvariousneedsinpractice.
AcknowledgementsThisworkissupportedinpartby973program(No.
2014CB340300),NSFC(No.
61322207)andtheFundamentalResearchFundsfortheCentralUniversities.
References1.
http://www.
itu.
int/en/ITU-D/statistics2.
http://www.
statisticbrain.
com/facebook-statistics/3.
Data,dataeverywhere.
TheEconomist,Feb27th,20104.
http://www.
facebook.
com5.
http://twitter.
com6.
http://weibo.
com/7.
MaS,LiJ,LiuX,HuaiJ.
Graphsearch:Anewsearchingapproachtothesocialcomputingera.
CommunicationsofCCF,2012,8(11):26–318.
MaS,CaoY,WoT,HuaiJ.
Socialnetworksandgraphmatching.
CommunicationsofCCF,2012,8(4):20–249.
MaS,LJ,LiuX,HuaiJ.
Graphsearchinthebigdataera.
InformationandCommunicationsTechnologies,2013,6:44–5110.
TianY,PatelJM.
Tale:Atoolforapproximatelargegraphmatching.
In:ICDE.
200811.
FanW,LiJ,MaS,TangN,WuY,WuY.
Graphpatternmatching:Fromintractabletopolynomialtime.
PVLDB,2010,3(1)12.
BarceloP,HurtadoCA,LibkinL,WoodPT.
Expressivelanguagesforpathqueriesovergraph-structureddata.
In:PODS.
201013.
FengK,CongG,BhowmickSS,MaS.
Insearchofinuentialeventorganizersinonlinesocialnetworks.
In:SIGMOD.
201414.
MaserratH,PeiJ.
Neighborqueryfriendlycompressionofsocialnet-works.
In:KDD.
201015.
SchenkerA,LastM,BunkeH,KandelA.
Classicationofwebdocu-mentsusinggraphmatching.
IJPRAI,2004,18(3):475–49616.
FanW,LiJ,MaS,WangH,WuY.
Graphhomomorphismrevisitedforgraphmatching.
PVLDB,2010,3(1)17.
TerveenLG,McDonaldDW.
Socialmatching:Aframeworkandresearchagenda.
In:ACMTrans.
Comput.
-Hum.
Interact.
2005,401–43418.
MaS,CaoY,FanW,HuaiJ,WoT.
Capturingtopologyingraphpatternmatching.
PVLDB,2011,5(4):310–32119.
MaS,CaoY,FanW,HuaiJ,WoT.
Strongsimulation:Capturingtopologyingraphpatternmatching.
ACMTrans.
DatabaseSyst.
,2014,39(1)20.
EckersonWW.
Dataqualityandthebottomline:Achievingbusi-nesssuccessthroughacommitmenttohighqualitydata.
InTheDataWarehousingInstitute,200221.
OttoB,Weber.
K.
Fromhealthcheckstothesevensisters:Thedataqualityjourneyatbt.
In:BTTR-BEHSG/CCCDQ/8.
Sept.
200922.
FanW,LiJ,MaS,TangN,YuW.
Interactionbetweenrecordmatchinganddatarepairing.
In:SIGMOD.
201123.
UllmannJR.
Analgorithmforsubgraphisomorphism.
J.
ACM,1976,23(1):31–4224.
LiuC,ChenC,HanJ,YuPS.
Gplag:detectionofsoftwareplagiarismbyprogramdependencegraphanalysis.
In:KDD.
200625.
FerranteJ,OttensteinKJ,WarrenJD.
Theprogramdependencegraphanditsuseinoptimization.
ACMTrans.
Program.
Lang.
Syst.
,1987,9(3):319–34926.
RiceMN,TsotrasVJ.
Graphindexingofroadnetworksforshortestpathquerieswithlabelrestrictions.
PVLDB,2010,4(2):69–8027.
CormenTH,LeisersonCE,RivestRL,SteinC.
IntroductiontoAlgorithms.
TheMITPress,200128.
ChenZ,ShenHT,ZhouX,YuJX.
Monitoringpathnearestneighborinroadnetworks.
In:SIGMOD.
200929.
ChowdhuryNMMK,RahmanMR,BoutabaR.
Virtualnetworkembeddingwithcoordinatednodeandlinkmapping.
In:INFOCOM.
200930.
ConteD,FoggiaP,SansoneC,VentoM.
Thirtyyearsofgraphmatch-inginpatternrecognition.
IJPRAI,2004,18(3):265–29831.
KarypisG,AggarwalR,KumarV,ShekharS.
Multilevelhypergraphpartitioning:applicationsinvlsidomain.
IEEETrans.
VLSISyst.
,1999,7(1):69–7932.
FanW,LiJ,MaS,TangN,WuY.
Addingregularexpressionstographreachabilityandpatternqueries.
In:ICDE.
201133.
Hansen,BrinchP.
ClassicOperatingSystems.
Springer,200134.
RamakrishnanR,GehrkeJ.
DatabaseManagementSystems.
McGraw-HillHigherEducation,200035.
https://www.
facebook.
com/about/graphsearch36.
http://www.
google.
com/insidesearch/features/search/knowledge.
html37.
AbiteboulS,HullR,VianuV.
FoundationsofDatabases.
Addison-Wesley,199538.
SakrS,PardedeE,eds.
GraphDataManagement:TechniquesandApplications.
IGIGlobal,201139.
http://research.
microsoft.
com/en-us/projects/trinity40.
http://research.
microsoft.
com/en-us/projects/ldg41.
MalewiczG,AusternMH,BikAJC,DehnertJC,HornI,LeiserN,CzajkowskiG.
Pregel:asystemforlarge-scalegraphprocessing.
In:SIGMOD.
201042.
http://research.
yahoo.
com/project/43.
http://neo4j.
org44.
YangS,WuY,SunH,YanX.
Schemalessandstructurelessgraphquerying.
PVLDB,2014,7(7):565–57645.
BeitzelSM,JensenEC,FriederO,LewisDD,ChowdhuryA,KolczA.
Improvingautomaticqueryclassicationviasemi-supervisedlearn-ing.
In:ICDM.
200546.
ShenD,SunJT,YangQ,ChenZ.
Buildingbridgesforwebqueryclassication.
In:SIGIR.
200647.
XingQ,LiuY,NieJY,MaMZS,ZhangK.
Incorporatinguserpreferencesintoclickmodels.
In:CIKM.
201348.
HuB,ZhangY,ChenW,WangG,,YangQ.
Characterizingsearchintentdiversityintoclickmodels.
In:WWW.
2011Front.
Comput.
Sci.
1149.
http://googleblog.
blogspot.
com/2012/05/introducing-knowledge-graph-things-not.
html50.
http://www.
statisticbrain.
com/twitter-statistics51.
GiatsoglouM,PapadopoulosS,VakaliA.
Massivegraphmanagementforthewebandweb2.
0.
In:NewDirectionsinWebDataManagement1.
Springer,201152.
NewmanM,BarablcsiAL,WattsDJ.
TheStructureandDynamicsofNetworks.
PrincetonUniversityPress,200653.
RahmE,DoHH.
Datacleaning:Problemsandcurrentapproaches.
IEEEDataEngineeringBulletin,2000,23(4):3–1354.
FanW,LiJ,MaS,TangN,YuW.
Towardscertainxeswitheditingrulesandmasterdata.
VLDBJ.
,2012,21(2):213–23855.
HenzingerMR,HenzingerTA,KopkePW.
Computingsimulationsonniteandinnitegraphs.
In:FOCS.
199556.
RamalingamG,RepsTW.
Acategorizedbibliographyonincrementalcomputation.
In:POPL.
199357.
RamalingamG,RepsT.
Onthecomputationalcomplexityofdynamicgraphproblems.
TCS,1996,158(1-2)58.
DeanJ,GhemawatS.
Mapreduce:Simplieddataprocessingonlargeclusters.
In:OSDI.
200459.
PengD,DabekF.
Large-scaleincrementalprocessingusingdistributedtransactionsandnotications.
In:OSDI.
201060.
PapadimitriouCH.
ComputationalComplexity.
Addison-Wesley,199461.
YuW,AggarwalCC,MaS,WangH.
Onanomaloushotspotdiscoveryingraphstreams.
In:ICDM.
201362.
AggarwalCC,WangH.
ManagingandMiningGraphData.
Springer,201063.
JordanMI.
Divide-and-conquerandstatisticalinferenceforbigdata.
In:KDD.
201264.
KleinerA,TalwalkarA,SarkarP,JordanMI.
Thebigdatabootstrap.
In:ICML.
201265.
KernighanBW,LinS.
Anecientheuristicprocedureforpartitioninggraphs.
BellSystemTechnicalJournal,1970,49(1):13–2166.
KarypisG,KumarV.
Afastandhighqualitymultilevelschemeforpartitioningirregulargraphs.
SISC,1998,20(1):359–39267.
YangS,YanX,ZongB,KhanA.
Towardseectivepartitionmanage-mentforlargegraphs.
In:SIGMOD.
201268.
SalomonD.
Datacompression-TheCompleteReference,4thEdition.
Springer,200769.
BuehrerG,ChellapillaK.
AscalablepatternminingapproachtoWebgraphcompressionwithcommunities.
In:WSDM.
200870.
AdlerM,MitzenmacherM.
TowardscompressingWebgraphs.
In:DCC.
200171.
BoldiP,VignaS.
TheWebGraphframeworkI:Compressiontech-niques.
In:WWW.
200472.
FederT,MotwaniR.
Cliquepartitions,graphcompressionandspeeding-upalgorithms.
J.
Comput.
Syst.
Sci.
,1995,51(2):261–27273.
KarandeC,ChellapillaK,AndersenR.
SpeedingupalgorithmsoncompressedWebgraphs.
In:WSDM.
200974.
FanW,LiJ,WangX,WuY.
Querypreservinggraphcompression.
In:SIGMOD.
201275.
Baeza-YatesRA,Ribeiro-NetoBA.
ModernInformationRetrieval-theconceptsandtechnologybehindsearch,Secondedition.
PearsonEducationLtd.
,Harlow,England,201176.
KleinK,KriegeN,MutzelP.
CT-Index:Fingerprint-basedgraphin-dexingcombiningcyclesandtrees.
In:ICDE.
201177.
LynchNA.
DistributedAlgorithms.
MorganKaufmann,199678.
PelegD.
DistributedComputingALocality-SensitiveApproach.
SIAM,200079.
MaS,CaoY,HuaiJ,WoT.
Distributedgraphpatternmatching.
In:WWW.
201280.
ZahariaM,ChowdhuryM,DasT,DaveA,MaJ,McCaulyM,FranklinMJ,ShenkerS,StoicaI.
Resilientdistributeddatasets:Afault-tolerantabstractionforin-memoryclustercomputing.
In:NSDI.
201281.
GaoJ,ZhouJ,ZhouC,YuJX.
Glog:Ahighlevelgraphanalysissystemusingmapreduce.
In:ICDE.
201482.
QinL,YuJX,ChangL,ChengH,ZhangC,LinX.
Scalablebiggraphprocessinginmapreduce.
In:SIGMOD.
201483.
XinRS,GonzalezJE,FranklinMJ,StoicaI.
Graphx:aresilientdistributedgraphsystemonspark.
In:GRADES.
201384.
LowY,GonzalezJ,KyrolaA,BicksonD,GuestrinC,HellersteinJM.
Distributedgraphlab:Aframeworkformachinelearninginthecloud.
PVLDB,2012,5(8):716–72785.
GonzalezJE,LowY,GuH,BicksonD,GuestrinC.
Powergraph:Distributedgraph-parallelcomputationonnaturalgraphs.
In:OSDI.
201286.
FanW,HuaiJ.
Queryingbigdata:Bridgingtheoryandpractice.
J.
Comput.
Sci.
Technol.
,2014,29(5):849–869

展开全文

namesgraph相关文档

resultsgraphsearch

reciprocatedgraphsearch

atmosphereios11 浙江世纪华通集团股份有限公司支持ipad 支持ipad 支持ipad ipadwifiipad插卡版和wifi版有什么区别，价格差的多么？itunes备份itunes就是备份不了怎么办啊联通iphone4iphone4想换联通的卡是普通联通的卡都能开通3G么还是得换联通3G卡联通都有什么套餐我是北京的 csshackcss常见的hack方法有哪些 chromeframe我的Chrome Frame为什么不能使用? 美国vps租用 vps代理免费申请域名和空间购买域名和空间 yardvps cpanel主机 wdcp sub-process java空间中国电信测速112 阿里云浏览器空间出租谁的qq空间最好看徐正曦网络空间租赁 hkt 空间登录首页带宽租赁韩国代理ip wordpress中文主题更多

namesgraph

Digital-VM80美元新加坡和日本独立服务器

georgedatacenter39美元/月$20/年/洛杉矶独立服务器美国VPS/可选洛杉矶/芝加哥/纽约/达拉斯机房/

酷番云78元台湾精品CN2 2核 1G 60G SSD硬盘