ICWSM'2007 Boulder, Colorado, USA

Structural Link Analysis from User Profiles and Friends Networks: A Feature Construction Approach

William H. Hsu, Joseph Lancaster, Martin S. R. Paradesi, Tim Weninger
Department of Computing and Information Sciences, Kansas State University
234 Nichols Hall, Manhattan, KS 66506-2302
+1 785 532 6350
{bhsu | joseph | pmsr | weninger}@ksu.edu

Abstract

We consider the problems of predicting, classifying, and annotating friends relations in friends networks, based upon network structure and user profile data. First, we document a data model for the blog service LiveJournal, and define a set of machine learning problems such as predicting existing links and estimating inter-pair distance. Next, we explain how the problem of classifying a user pair in a social network, as directly connected or not, poses the problem of selecting and constructing relevant features. We document feature analyzers for attributes that depend only on graph attributes, those that depend on individual user demographics and set-valued attributes (e.g., interests, communities, and educational institutions), and those that depend on candidate user pairs. We then extend our data model using whole-network attributes and report machine learning experiments on learning the concept of a connected pair of friends from LiveJournal data. Finally, we develop a theory of dependent types for deriving causal explanations and discuss how this can be used to scale statistical relational learning up to our full corpus, a recent crawl of over a million records from LiveJournal.

General Terms
Algorithms, Experimentation

Keywords
data mining, link analysis, machine learning, social network analysis, user profiling.

1. Introduction

Analysis of friends networks provides a basis for understanding the web of influence [Ko01] in social media. In particular, the problems of determining the existence of links and of classifying and annotating known links are first steps toward identifying potential relationships. This inferred information can in turn be used to introduce new potential friends to one another, make basic recommendations such as community recruits or moderator candidates, or identify whole cliques and communities.

In this paper, we consider the problem of discovering links in an incomplete graph. We present an approach to link prediction that is based on graph feature analysis and intrinsic attributes of entities (users and communities). We report some promising preliminary results on radius-limited neighborhoods of the blogging service LiveJournal and discuss the results of exploratory experiments that point toward a need to differentiate the types of features in a friends network, namely:

1. those that depend on the demographics of the entire network
2. those that are computable for each user or each pair of users
3. those that depend on the existence of a reported, inferred, or suspected link

We derive some such features and discuss the costs of computing, selecting, and recombining them. Of particular interest in the domain of commercial weblogs and social media are demographic features relevant to collaborative recommendation of goods and formation of branding communities. The structural dependence and context-specific dependence of features determine what new features are feasible to construct, both in terms of statistical sufficiency and computational complexity. In conclusion, we examine some new features that were derived by hand, discuss the algorithms used to compute them, and relate these specific algorithms to a broader class of relational database queries that form the basis of a more powerful feature construction system.

2. Background

2.1 Friends Networks from User Profiles

Social network services such as MySpace and Facebook allow users to list interests and link to friends, sometimes annotating these links by designating trust levels or qualitative ratings for selected friends. Some such services, such as Google's Orkut, are community-centric; others, such as the video blogging service YouTube and the photo service Flickr, emphasize social media; while some, such as SixApart's LiveJournal and Vox, are organized around text-and-image weblogs. LiveJournal and its derivative services, such as GreatestJournal, DeadJournal, and JournalFen, are based on the same open-source server code. At the time of this writing, there are over 11.7 million LiveJournal accounts, 1.8 million of them active.

The friends network of LiveJournal, our topic of study, has two varieties of accounts: users and communities (we omit RSS feeds). One advantageous property of its data model, stemming from a common schema for the two account types (which could originally be converted from user to community), is that it provides a simple, flexible representation for entities and relations.

Start      End        Link denotes
User       User       Trust or friendship
User       Community  Readership or subscribership
Community  User       Membership, posting access, maintainer
Community  Community  Obsolete

Table 1. Types of links in the blog service LiveJournal.

Table 1 shows the types of links in LiveJournal and their constituent attributes. Friendship is an asymmetric relation between two accounts, each represented by a vertex in a directed graph. The type of the start and end point defines the relationship set attributes of the link. For example, a user u who adds another user v to his or her friends list can specify v's membership in any of up to 30 groups. These serve the dual purpose of blog aggregation (posts from each group's members are filtered into its aggregator page, which u can read or make public) and groups-based security (each group denotes a read/comment access control list).

Access control lists for communities are associated with memberships (community-to-user links), while content is controlled by posters or subscribers. A user can "watch" a community in order to add all accessible posts to a main aggregator page or to custom groups. The set of accessible posts consists of either public posts only, or public and restricted (members-only) posts. The access control list is defined by the membership relation and individual posters' selections (whether to allow comments and whether to display them by default from no readers, all readers, non-anonymous readers, or community members). Acquisition of privileges is a community property, of which only membership may be acquired solely by user action ("joining" a community), if the moderator has specified open membership.

Figure 1. LiveJournal access control list maintenance (community moderator interface).

Thus, a reciprocal link between a user and a community means that the user both subscribes to the community and is an approved member. Links from user u to v are listed in the "Friends" list of u and in an optionally displayed "Friends Of" list of v. For a user u, the "Friends Of" list can be partitioned into reciprocal and non-reciprocal sublists:

Mutual Friends: {v | (v, u) ∈ E ∧ (u, v) ∈ E}
Also Friend Of: {v | (v, u) ∈ E ∧ (u, v) ∉ E}

The community analogue of the "Friends Of" list is the "Watched By" (subscriber) list, whose members have the community name listed in the "Friends: Communities" sections of their individual user profile pages. The community analogue of the "Friends" list is the "Members" list.
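
The partition of a "Friends Of" list into its reciprocal and non-reciprocal parts can be sketched in a few lines. This is only an illustration of the two set definitions, assuming the edge set E is available as a collection of (start, end) pairs; the function name and the sample account names are hypothetical.

```python
def partition_friends_of(edges, u):
    """Split the "Friends Of" list of u into (mutual, also_friend_of).

    edges: iterable of directed (start, end) pairs, i.e. the edge set E.
    """
    edge_set = set(edges)
    # Everyone v with an edge (v, u): the "Friends Of" list of u.
    friends_of = {a for (a, b) in edge_set if b == u}
    # Reciprocal part: (u, v) is also in E.
    mutual = {v for v in friends_of if (u, v) in edge_set}
    # Non-reciprocal part: (u, v) is not in E.
    also_friend_of = friends_of - mutual
    return mutual, also_friend_of


# Hypothetical toy graph.
edges = [("alice", "bob"), ("bob", "alice"), ("carol", "alice"), ("alice", "dave")]
mutual, also_friend_of = partition_friends_of(edges, "alice")
print(sorted(mutual))          # ['bob']
print(sorted(also_friend_of))  # ['carol']
```

Note that dave appears in neither sublist: alice links to dave, but dave does not link back, so dave is on the "Friends" list of alice only.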

The friends network for LiveJournal consists of a very large central connected component and many small islands, most of which are singleton users. There are a few source vertices, corresponding to accounts that link to others but have no reciprocated friendships; these are usually RSS or blog aggregator accounts owned by individuals. Additionally, there are sink vertices corresponding to accounts watched by others, but which have named no friends. Some of these are channels for announcement or dissemination of creative work.

2.2 Link Identification

In previous work [HKP+06], we introduced a link prediction problem for LiveJournal: given a graph in which the existence of a candidate link is hidden (elided if it exists), classify it as present or absent given all other attributes of the graph and of the endpoints. Our initial approach to link identification consisted of dividing friends network features into graph features and interest-based features. Graph features could be computed simply by scanning the graph or, in the case of pair-distance metrics, by performing all-pairs shortest path (APSP) search:

1. Indegree of u: popularity of the user
2. Indegree of v: popularity of the candidate
3. Outdegree of u: number of other friends besides the candidate; saturation of friends list
4. Outdegree of v: number of existing friends of the candidate besides the user; correlates loosely with likelihood of a reciprocal link
5. Number of mutual friends w such that u → w ∧ w → v
6. "Forward deleted distance": minimum alternative distance from u to v in the graph without the edge (u, v)
7. Backward distance from v to u in the graph

These were supplemented by interest-based features:

8. Number of mutual interests between u and v
9. Number of interests listed by u
10. Number of interests listed by v
11. Ratio of the number of mutual interests to the number listed by u
12. Ratio of the number of mutual interests to the number listed by v

2.3 Efficient feature analysis

The degree attributes can be enumerated in time linear in the number of users, as can the mutual friends count for each pair of users. Forward deleted distance measures the distance from u to v by alternate routes, after the edge (u, v) is elided. The prediction task is thus to reconstruct the incomplete graph resulting from this erasure, to determine whether a particular link (u, v) existed.

Forward deleted distance can be precomputed exhaustively for the entire graph in Θ(|E|(|V|+|E|)) = Θ(|E|^2) time (taking |V| to be O(|E|)) by erasing each edge in E and re-running a breadth-first search from the start vertex. If a candidate edge is not stored in the resulting cache, its deleted distance is that found by BFS on the original graph, in Θ(|V|+|E|) time. In a graph (V, E), backward distance requires Θ(|V|+|E|) time using BFS for a particular candidate edge. Since the expected size of the edge set is E[|E|] = k|V|, with k about 20 on average across LiveJournal, the bottleneck computation is that of forward deleted distance: Θ(|E|^2) = Θ(k^2|V|^2), or Θ(|V|^2) with a large constant.

Using a straightforward string pair enumeration and comparison algorithm, the mutual interest counts are stored in a matrix of |V|^2 elements, each requiring constant time to check (given a maximum of 150 interests).
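
The two distance features for a single candidate edge can be sketched as one breadth-first search each. This is a minimal illustration rather than the implementation used in the experiments; it assumes the graph is held as a dictionary mapping each account to an iterable of its friends, and the function names are hypothetical.

```python
from collections import deque


def bfs_distance(adj, source, target, skip_edge=None):
    """Length of the shortest directed path from source to target.

    adj: dict mapping a vertex to an iterable of its out-neighbors.
    skip_edge: optional (a, b) pair that the search must not traverse.
    Returns None when target is unreachable.
    """
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        a = queue.popleft()
        for b in adj.get(a, ()):
            if (a, b) == skip_edge or b in dist:
                continue
            dist[b] = dist[a] + 1
            if b == target:
                return dist[b]
            queue.append(b)
    return None


def forward_deleted_distance(adj, u, v):
    # Distance from u to v by alternate routes, with the edge (u, v) elided.
    return bfs_distance(adj, u, v, skip_edge=(u, v))


def backward_distance(adj, u, v):
    # Distance from v back to u in the unmodified graph.
    return bfs_distance(adj, v, u)


# Hypothetical toy graph: u -> v, u -> w, w -> v, v -> x, x -> u.
adj = {"u": ["v", "w"], "w": ["v"], "v": ["x"], "x": ["u"]}
print(forward_deleted_distance(adj, "u", "v"))  # 2, via w
print(backward_distance(adj, "u", "v"))         # 2, via x
```

Skipping the edge during traversal, instead of copying the graph without it, keeps each query at a single Θ(|V|+|E|) search and leaves the adjacency structure untouched.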

2.4 Methodologies for link mining

Getoor and Diehl [GD05] recently surveyed techniques for link mining, focusing on statistical relational learning approaches and emphasizing graphical model representations of link structure. Ketkar et al. [KHC05] compare data mining techniques over graph-based representations of links to first-order and relational representations and learning techniques that are based upon inductive logic programming (ILP). Sarkar and Moore [SM05] extend the analysis of social networks into the temporal dimension by modeling change in link structure across discrete time steps, using latent space models and multidimensional scaling. One of the challenges in collecting time series data from LiveJournal is the slow rate of data acquisition, just as spatial annotation data (such as that found in LJ maps and the "plot your friends on a map" meme) is relatively incomplete.

2.5 Other applications using graph mining

Popescul and Ungar [PU03] learn a kind of entity-relational model from data in order to predict links. Hill [Hi03] and Bhattacharya and Getoor [BG04] similarly use statistical relational learning from data in order to resolve identity uncertainty, particularly coreferences and other redundancies (also called deduplication). Resig et al. [RDHT04] use a large (200000-user) crawl of LiveJournal to annotate a social network of instant messaging users, and explore the approach of predicting online times as a function of friends graph degree.

There have been numerous recent applications of social network mining based on the text and headers of e-mail. One notable research project by McCallum et al. [MCW05] uses the Enron e-mail corpus and infers roles and topic categories based on link analysis.

A primary goal of this work is to extend the graph mining approach beyond link prediction and recommendation towards link explanation and annotation. It may be much more useful to explain why a group of friends in a blog service created accounts en masse or added one another as friends than to recommend relationship sets that are already extant or structured according to a preexistent social group. For example, high school classmates often create accounts and encourage their peers to join the same service. In a few cases, this is encouraged or facilitated by a teacher, for a class project. Solving the problem of link prediction is not particularly useful in this case, because the user decisions have already been made or strongly constrained; however, it may be very useful to link other classmates not working on the same project to the same relationship set (perhaps they were encouraged to join the blog service by students who continued to use it after the class project). Large groups such as webcomic subscriberships, community co-members, etc. are also somewhat identifiable, and relating members of a blog service to one another through relationship sets is a typical entity-relational data modeling operation that can be made more robust and efficient through graph feature extraction.

3. Experiment Design

3.1 LJCrawler v2

To acquire the graph structure and attributes described in the previous section, we developed an HTTP-based spider called LJCrawler to harvest user information from LiveJournal. A multithreaded version of this program, which retrieves BML data published by Danga (the owners of LiveJournal), collects up to 15 records per second on average, traversing the social network depth-first and archiving the results in a master index file. Because LiveJournal's functionality for looking up users by user number is only available to administrators, we decided to compile a list of seeds for a disjoint-set representation of the disconnected social network. For purposes of this experiment, however, starting from just one seed (the first author's LiveJournal ID) and restricting the crawl to one connected component was sufficient. Using LJCrawler, we compiled an adjacency list and the following ground features for each user:

- Account type (user, community)
- Interest list
- School list
- Communities watched list
- Community membership list
- Friends of list
- Friends list

3.2 Feature Analyzers

We define a single example to be a candidate edge (u, v) in the underlying directed graph of the social network, along with a set of descriptive features calculated from the annotated graph recorded by LJCrawler; the twelve features currently computed are those listed in Section 2.2.

Other features: Additional planned features for continuing experiments include dates (update frequencies when taken differentially), user options such as maximum friends count, and content descriptors of LiveJournal entries and comments (average post length, word frequency, etc.).
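
As an illustration of how one example might be assembled from the ground features, the sketch below computes the degree, mutual-friend, and interest features of Section 2.2 for a candidate edge (u, v); the two distance features are left out. The data layout (dictionaries of sets) and all names are assumptions made for this sketch. Excluding the candidate edge from the indegree of v is also an assumption, made here so that the hidden link does not leak into the features.

```python
def pair_features(friends, interests, u, v):
    """Degree, mutual-friend, and interest features for a candidate edge (u, v).

    friends: dict mapping an account to the set of accounts it lists as friends.
    interests: dict mapping an account to its set of interest strings.
    """
    friends_u = friends.get(u, set())
    friends_v = friends.get(v, set())

    def indegree(x, exclude=None):
        # Count accounts that list x as a friend, optionally ignoring one source.
        return sum(1 for a, fs in friends.items() if x in fs and a != exclude)

    # Mutual friends w with u -> w and w -> v.
    mutual_friends = sum(
        1 for w in friends_u if w not in (u, v) and v in friends.get(w, set())
    )
    int_u = interests.get(u, set())
    int_v = interests.get(v, set())
    shared = len(int_u & int_v)
    return {
        "indegree_u": indegree(u),
        "indegree_v": indegree(v, exclude=u),  # assumption: hide the candidate edge
        "outdegree_u": len(friends_u - {v}),   # friends of u besides the candidate
        "outdegree_v": len(friends_v - {u}),   # friends of v besides the user
        "mutual_friends": mutual_friends,
        "mutual_interests": shared,
        "interests_u": len(int_u),
        "interests_v": len(int_v),
        "ratio_u": shared / len(int_u) if int_u else 0.0,
        "ratio_v": shared / len(int_v) if int_v else 0.0,
    }


# Hypothetical toy data.
friends = {"u": {"v", "w"}, "w": {"v"}, "v": {"u"}, "x": {"u"}}
interests = {"u": {"chess", "jazz", "go"}, "v": {"jazz", "go", "tea", "film"}}
features = pair_features(friends, interests, "u", "v")
print(features["mutual_friends"], features["mutual_interests"], features["ratio_v"])
# 1 2 0.5
```

The indegree helper scans every friends list and is therefore linear in the size of the graph per call; a real analyzer would presumably precompute the degrees in a single pass.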

3.3 Graph Search Algorithms for Computing Features

Computing the minimum forward and backward distances can be done more efficiently by using breadth-first search. Currently, a Java implementation of this algorithm requires under one minute on a 2 GHz AMD Opteron system to process a 2000-node graph. However, enumerating all possible candidate pairs within a neighborhood of 2 nodes (1.6 million pairs for 4000 nodes) requires several hours on the same system.

We note that the amortized cost of running BFS to precompute all-pairs shortest paths (APSP) with the actual edge deleted (which is necessary to avoid knowing the prediction target in link prediction) is Θ(|E|(|V|+|E|)). This is prohibitively large even for our "mid-sized" subgraphs of 10-50K nodes; when |V| is about 11 million and |E| is a little over 200 million, enumerating APSP is completely infeasible. However, we do not typically consider all of E, so the bottleneck is typically the first step plus a constant number k of calls to BFS, requiring running time in Θ(k(|V|+|E|)).
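
As a rough illustration of the scale involved, the figures quoted above can be substituted into the exhaustive bound. The calculation treats each vertex or edge visit as one unit of work and rounds |E| to 2 × 10^8; both are simplifying assumptions, not measurements.

```latex
\begin{align*}
|E|\,(|V| + |E|) &\approx (2 \times 10^{8})\,(1.1 \times 10^{7} + 2 \times 10^{8}) \\
                 &= (2 \times 10^{8})\,(2.11 \times 10^{8}) \\
                 &\approx 4.2 \times 10^{16} \\[4pt]
k\,(|V| + |E|)   &\approx 10^{3} \times 2.11 \times 10^{8} \approx 2.1 \times 10^{11}
                 \qquad \text{(illustrative value } k = 1000\text{)}
\end{align*}
```

Under these assumptions, restricting the search to a thousand candidate edges is about five orders of magnitude cheaper than erasing and re-searching for every edge in E.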

3.4 Generating Candidates

We considered several alternative ways to generate candidate edges (u, v). The first technique is likely to be unscalable, as the number of candidates is |V|^2. The second requires having a representatively large sample of the full LiveJournal social network, in order to fit the distribution parameters accurately. The third was the most straightforward to implement. Two calls to the all-pairs shortest path algorithm provided the cost matrix, and one pass at each radius up to a maximum of 10 yielded the data shown in Table 2.

To simplify the initial experiments, we defined the classification problem to be classification of d(u, v) as 1 or 2. This task is actually useful for social network recommender systems because discrimination of a direct friend from a "friend of a friend" (FOAF) is functionally similar to recommending FOAFs to link to directly. There are more detailed classification targets, such as placement, promotion, and demotion of linked friends within strata of trust (setting, increasing, and decreasing the security level), but choosing a user's friends to begin with is the more fundamental decision.

Table 2 and Table 3 report the distribution of inter-vertex distances in the friends network for two subnetworks induced by limiting the maximum number of nodes.

Distance d   Frequency (= d)   Cumulative (≤ d)
1                       6204               6204
2                     107307             113511
3                      69896             183407
4                      59926             243333
5                       3400             246733
6                        255             246988
7                         16             247004
8                          1             247005
9                          0                  0
10                         0                  0
∞                       9731             256735

Table 2. Number of candidate edges for the 1000-node LiveJournal graph.

Distance d   Frequency (= d)   Cumulative (≤ d)
1                      19410              19410
2                     370568             389978
3                     403075             793053
4                     520373            1313426
5                     123747            1437173
6                      18453            1455626
7                       2657            1458283
8                        339            1458622
9                         29            1458651
10                         0            1458651
∞                     174534            1633185

Table 3. Number of candidate edges for the 4000-node LiveJournal graph.
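
A distance distribution of the kind reported in Table 2 and Table 3 can be produced by running one radius-limited breadth-first search per vertex and tallying the results. The sketch below is an assumption about how such a table could be computed, not the procedure used for the tables above: it counts every ordered pair of distinct vertices and files pairs not reached within the radius limit under "inf". The function name and the toy graph are hypothetical.

```python
from collections import Counter, deque


def distance_histogram(adj, max_radius=10):
    """Count ordered vertex pairs (s, t), s != t, by shortest-path distance.

    adj: dict mapping a vertex to an iterable of its out-neighbors.
    Pairs not reached within max_radius steps are counted under "inf".
    """
    nodes = set(adj) | {b for bs in adj.values() for b in bs}
    hist = Counter()
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            a = queue.popleft()
            if dist[a] == max_radius:
                continue  # do not expand beyond the radius limit
            for b in adj.get(a, ()):
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        for d in dist.values():
            if d > 0:
                hist[d] += 1
        hist["inf"] += len(nodes) - len(dist)
    return hist


# Hypothetical toy graph: a -> b -> c.
adj = {"a": ["b"], "b": ["c"], "c": []}
hist = distance_histogram(adj)
print(hist[1], hist[2], hist["inf"])  # 2 1 3
```

Candidates for the d(u, v) ∈ {1, 2} classification task would then be the pairs tallied in the first two rows.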

4. Results

4.1 Preliminary experiment: 941-node version

In a preliminary experiment, we constructed a 941-node subgraph, defined the concept IsFriendOf, and trained three types of inducers with:

1. all attributes
2. all graph attributes excluding the forward and backward distances
3. the backward distances alone
4. the backward and forward distances alone
5. interest-related attributes alone.

Table 4 and Table 5 show the results for three inducers: the J48 decision tree inducer, Holte's 1R inducer (a single-rule classifier based on a single attribute) [Ho93], and the Logistic regression inducer. All accuracy measures were collected over 10-fold cross-validated runs. The J48 output with all features achieves a significant boost over the next highest (distance only).

Inducer    All    NoDist   BkDist   Dist   Interest
J48        98.2   94.8     95.8     97.6   88.5
OneR       95.8   92.0     95.8     95.8   88.5
Logistic   91.6   90.9     88.3     88.9   88.4

Table 4. Percent accuracy for predicting all classes using the 941-node graph.

Inducer    All    NoDist   BkDist   Dist   Interest
J48        89.5   65.7     67.7     83.0   5.4
OneR       67.7   41.1     67.7     67.7   4.5
Logistic   38.3   33.3     0        4.5    4.5

Table 5. Precision (true positives to all positives) using the 941-node graph.

4.2 Experiments on restricted graphs

We developed an application, ljclipper, to restrict the overall friends graph to that induced by a subset of nodes of fixed number, found using breadth-first search starting from a given seed. Using a 4000-node subgraph summarized in Table 3, we generated 1633185 candidate edges. Note that all forward distances are greater than 1: when u and v are actually connected, we erase (u, v). In preliminary experiments, we then computed the length of the shortest alternative path. This is, however, a less scalable approach, because the asymptotic running time is dominated by the superlinear time required to compute these paths.

The complete listing of all twelve features is given in Section 2.2. The numerical types of all of the network features, both those describing the graph and those measuring interests and ratios, make the data set amenable to logistic regression.
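
The clipping step can be sketched as a breadth-first traversal that stops once a fixed number of nodes has been collected, followed by restriction of each adjacency list to the collected nodes. This is a guess at the general shape of such a tool, not the ljclipper source; the function name, the data layout, and the toy graph are hypothetical.

```python
from collections import deque


def clip_graph(adj, seed, max_nodes):
    """Induced subgraph on the first max_nodes vertices found by BFS from seed.

    adj: dict mapping a vertex to a list of its out-neighbors.
    Returns a dict with the same layout, restricted to the kept vertices.
    """
    kept = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(kept) < max_nodes:
        a = queue.popleft()
        for b in adj.get(a, ()):
            if b not in seen:
                seen.add(b)
                kept.append(b)
                queue.append(b)
                if len(kept) == max_nodes:
                    break
    keep = set(kept)
    # Keep only edges whose endpoints both survived the clipping.
    return {a: [b for b in adj.get(a, ()) if b in keep] for a in kept}


# Hypothetical toy graph.
adj = {"s": ["a", "b"], "a": ["c", "s"], "b": ["a"], "c": ["s"]}
print(clip_graph(adj, "s", 3))
# {'s': ['a', 'b'], 'a': ['s'], 'b': ['a']}
```

Because the traversal follows outgoing links only, the result depends on the seed and on the order in which friends are listed.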

Inducer   Accuracy   Precision   Recall
J48       99.9       97.5        96.1
OneR      99.6       91.7        91.8

Table 6. Percent accuracy, precision and recall using a 1000-node graph (10-fold CV).

Inducer   Accuracy   Precision   Recall
J48       99.8       95.8        92.0
OneR      99.7       91.1        89.9

Table 7. Percent accuracy, precision and recall using a 2000-node graph (10-fold CV).

Inducer   Accuracy   Precision   Recall
J48       99.8       94.5        88.3
OneR      99.7       88.2        84.3

Table 8. Percent accuracy, precision and recall using a 4000-node graph (10-fold CV).

Table 6 through Table 8 show the accuracy, precision, and recall for the 1000, 2000, and 4000-node friends graphs. Trends of higher precision than recall, and diminishing precision and recall as the network grows larger, are observed. These trends are sustained for subsamples of size 10000 and size 100000, though precision and recall also diminish slightly with sampling.
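
Accuracy stays near 100% in these tables while precision and recall vary because true edges are a small minority of the candidate pairs. The sketch below computes the three measures from predicted and actual labels on made-up data, purely to illustrate that effect; the function name and the numbers are hypothetical and unrelated to the experiments.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for 0/1 label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # true positives / predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # true positives / actual positives
    return accuracy, precision, recall


# Made-up example: 1000 candidate pairs, only 20 of them true edges.
# The classifier finds 15 of the 20 and raises 5 false alarms.
y_true = [1] * 20 + [0] * 980
y_pred = [1] * 15 + [0] * 5 + [1] * 5 + [0] * 975
print(binary_metrics(y_true, y_pred))  # (0.99, 0.75, 0.75)
```

With 98% of the pairs negative, a classifier that predicts "no edge" everywhere would already reach 98% accuracy, which is why precision and recall are the more informative columns.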

4.3 Data acquisition and larger experiments

The crawler has been improved with several service-specific optimizations for fetching user info pages. Presently these do not use LiveJournal's BML feed of user data, which is incomplete for our purposes (that is, not all ground attributes in our initial relations are provided). At press time, this crawler processes about 20000 user records per hour and thus would require over a week to crawl LiveJournal.

The current bottleneck is the Θ(|V|(|V|+|E|)) step described in Section 3.3. This is the dominant term, because the constant k denoting the number of candidate edges is usually much smaller than n = |V|, e.g., 100-1000, so that Θ(k(|V|+|E|)) is not only in Θ(|V|+|E|), but actually just a few hundred times the cost of a single BFS.

4.4 Interpretation

Using mutual interests alone, even with normalization based on the number of interests in u and v, results in very poor prediction accuracy using all inducers with which we experimented. Intermediate results are achieved using mutual friends count and degree (NoDist: 65.7% on predicting edges) and using forward deleted distance and backward distance (Dist: 67.7%). Using all 12 computed graph and annotation features resulted in the highest precision (All: 89.5%) and accuracy (All: 98.2%).

We note that LiveJournal once used a variant of normalized mutual interests to produce a list of potential friends, arranged in decreasing order of match quality. Although this was not the same type of recommender system as LJMiner supports, it shows that the state-of-the-art user matching systems have a lot of room for improvement. The results indicate that features produced by LJMiner, used with a good inducer, can generate collaborative and structural recommendations.

5. Continuing Work

Scaling up: Our current research focuses on scaling up to tens of thousands and eventually millions of users. Crawling over 11-12 million records is at least technically feasible, but scaling up the graph analyzers is a challenge that may best be met with heuristic search.

Learning relational models: A promising area of research is the recovery of relational graphical models, including class-level (membership and reference slot) uncertainty [GFKT02]. LJMiner has yielded a ready source of semistructured data for both structure learning and distribution learning. Another potentially useful approach is to organize users and communities into clusters using this relational model. We have developed schemas for blog posts (entries, threads, comments) and for users and dynamic groups of users. This is related to previous preliminary work on relational data mining for personalization of web portals, especially computational grid portals [HBJ03]. Much of the relational metadata in the bioinformatics domain comes from description languages for workflows and workflow components [Hs04]. The next step in our experimental plan is to use schemas such as our detailed ones for blog service users and bioinformatics information and computational grid users [Hs05] to learn a richer predictive model. Finally, modeling relational data as it persists or changes across time is an important challenge.

Acknowledgements

We thank Todd Easton and Kirsten Hildrum for helpful discussions concerning algorithms and the LiveJournal data model. We also thank Andrew King and Tejaswi Pydimarri for contributions to the original LJMiner system and Vikas Bahirwani for contributions to the second version.

References

[BG04] I. Bhattacharya & L. Getoor. Deduplication and group detection using links. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) Workshop on Link Analysis and Group Detection (LinkKDD 2004), Seattle, WA, USA, August 22-25, 2004.

[CLRS02] T. H. Cormen, C. E. Leiserson, R. L. Rivest, & C. Stein. Introduction to Algorithms, Second Edition. Cambridge, MA: MIT Press, 2002.

[GD05] L. Getoor & C. P. Diehl. Link mining: a survey. SIGKDD Explorations, Special Issue on Link Mining, 7(2):3-12.

[GFKT02] L. Getoor, N. Friedman, D. Koller, & B. Taskar. Learning Probabilistic Models of Link Structure. Journal of Machine Learning Research, 2002.

[HBJ03] W. H. Hsu, P. Boddhireddy, & R. Joehanes. Using probabilistic relational models for collaborative filtering. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Statistical Learning of Relational Models (SRL), Acapulco, MEXICO, August, 2003.

[Hi03] S. Hill. Social network relational vectors for anonymous identity matching. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Statistical Learning of Relational Models (SRL), Acapulco, MEXICO, August, 2003.

[Ho93] R. C. Holte. Very Simple Classification Rules Perform Well on Most Commonly Used Datasets. Machine Learning, 11(1):63-90.

[Hs04] W. H. Hsu. Relational graphical models of computational workflows for data mining. In Proceedings of the International Conference on Semantics of a Networked World: Semantics for Grid Databases (ICSNW-2004), p. 309-310, Paris, FRANCE, June, 2004.

[Hs05] W. H. Hsu. Relational graphical models for collaborative filtering and recommendation of computational workflow components. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Multi-Agent Information Retrieval and Recommender Systems, Edinburgh, UK, July 31, 2005.

[HKP+06] W. H. Hsu, A. King, M. S. R. Paradesi, T. Pydimarri, & T. Weninger. Collaborative and Structural Recommendation of Friends using Weblog-based Social Network Analysis. In Proceedings of the 2006 AAAI Spring Symposium on Computational Approaches to Analyzing Weblogs (CAAW 2006).

[KHC05] N. S. Ketkar, L. B. Holder, & D. J. Cook. Comparison of graph-based and logic-based multi-relational data mining. SIGKDD Explorations, Special Issue on Link Mining, 7(2):64-71.

[Ko01] D. Koller. Representation, Reasoning and Learning. IJCAI Computers and Thought Award Lecture, 2001.

[MCW05] A. McCallum, A. Corrada-Emmanuel, & X. Wang. Topic and role discovery in social networks. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Edinburgh, UK, August, 2005.

[MH04] M. Mukherjee & L. B. Holder. Graph-based data mining on social networks. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) Workshop on Link Analysis and Group Detection (LinkKDD 2004), Seattle, WA, USA, August 22-25, 2004.

[PU03] A. Popescul & L. H. Ungar. Statistical relational learning for link prediction. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Workshop on Statistical Learning of Relational Models (SRL), Acapulco, MEXICO, August, 2003.

[RDHT04] J. Resig, S. Dawara, C. M. Homan, & A. Teredesai. Extracting social networks from instant messaging populations. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) Workshop on Link Analysis and Group Detection (LinkKDD 2004), Seattle, WA, USA, August 22-25, 2004.

[SM05] P. Sarkar & A. Moore. Dynamic social network analysis using latent space models. SIGKDD Explorations, Special Issue on Link Mining, 7(2):31-40.
