The New Algorithm of the Item-based on MapReduce
ZHAO Wei 1,a
1 College Software Technology School, Zhengzhou University, Zhengzhou 450002, China
aiezhaowei@163.com
Keywords: Recommendation system; parallel computing; clustering
Abstract.
This paper studies the traditional item-based collaborative filtering algorithm and the K-means clustering algorithm, and proposes a parallel item-based collaborative filtering algorithm built on the MapReduce programming model. The algorithm has two main steps: the first clusters users with the K-means algorithm, and the second runs the parallel item-based algorithm to produce recommendations for the clustered users. Experimental results show that the algorithm is effective, improves running speed and execution efficiency, and is well suited to processing big data.
Introduction
Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process data within a tolerable elapsed time. Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization. Volume means big data doesn't sample; it just observes and tracks what happens. Velocity means big data is often available in real time. Variety means big data draws from text, images, audio, and video, and it completes missing pieces through data fusion [1]. Therefore, big data must go through computer-based statistics, comparison, and analysis before objective results can be obtained.
Today every transaction, every input, and every search in an electronic commerce system can serve as data. Once the computer system has screened, sorted, and analyzed this data, the results are not only objective conclusions but can also support the decision making of enterprises; the useful data collected can also inform reasonable planning, actively guide the growth of consumer purchasing power, and make marketing and promotion more effective. With the increasing amount of data in electronic commerce systems, the need for in-depth analysis of large volumes of data is increasingly urgent. Therefore, a simple and highly scalable program for product recommendation analysis is particularly important.
At present many e-commerce sites, such as Amazon and Dangdang, use collaborative filtering algorithms. Collaborative filtering algorithms are mainly divided into item-based collaborative filtering and user-based collaborative filtering. Item-based collaborative filtering measures the similarity between items according to users' preferences and does not need to consider the specific content features of the items, so it is mainly used in e-commerce recommendation and movie recommendation, and it has achieved a certain degree of success in e-commerce recommendation. However, on massive data its recommendation performance is poor, the data is hard to share and extend, and the hardware requirements are comparatively high; these inherent shortcomings have kept it from being promoted and supported in enterprise e-commerce [2]. Using MapReduce to achieve distributed parallel computing would therefore greatly improve the efficiency and performance of the algorithm and promote its further development [3-4].
Item-based collaborative filtering generates a list of recommended items for a user according to item similarity and the user's access history, but it has some problems, such as data sparsity. In addition, when the numbers of users and items are massive, the volume of user behavior and record data grows greatly, computing the item similarity matrix becomes very costly, and the efficiency and performance of the algorithm drop sharply. To address these problems, clustering has also been applied to item-based collaborative filtering: the massive user base is clustered first, which avoids running the recommendation operation separately for each user. First, shoppers with similar interests are grouped into one user class, and users in the same cluster receive the same recommended goods. Second, the massive user dimension is reduced to a limited few dozen clusters. The time complexity of clustering then becomes a bottleneck, and a parallel clustering algorithm using MapReduce is an effective way to remove it [5].
MapReduce is a distributed programming model framework on the Hadoop platform that lets programs be implemented without familiarity with the underlying details of the distributed implementation [6]. Using MapReduce as the parallel computing programming model, this paper first clusters users in parallel with MapReduce and then, according to the clustering results, runs MapReduce-based parallel collaborative filtering within each user class, finally giving users a reasonable personalized product recommendation list. The running time of the new algorithm on a fixed amount of data is compared across different numbers of nodes. The results show that the data processing performance of the proposed algorithm is greatly improved.
The principle of MapReduce programming model
MapReduce is the parallel computing programming model used on the Hadoop platform. The technique was proposed by Google as a typical distributed parallel programming model; users develop map and reduce functions within the MapReduce model to achieve parallel processing. Map is responsible for dispersing the data, and Reduce is responsible for aggregating it. Users only need to implement the two interfaces Map and Reduce to complete computations on TB-level data. Because the MapReduce model encapsulates the details of parallelism and fault tolerance, programming is very easy.
A MapReduce parallel computation is divided into two parts: the first step initializes the original input data file and divides the data set into a number of data blocks of a certain size to facilitate parallel computing; the second step starts the map and reduce functions to compute in parallel and finally produces the final result.
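The two-part flow described above can be illustrated with a minimal single-process sketch in Python. It is not Hadoop code: the function names (`split_into_blocks`, `map_fn`, `reduce_fn`, `run_mapreduce`) and the purchase records are hypothetical, and the blocks are processed one after another where a real cluster would process them on different nodes.

```python
from collections import defaultdict

def split_into_blocks(records, block_size):
    """Part 1: divide the input data set into blocks of a fixed size."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]

def map_fn(record):
    """Map: disperse the data by emitting one (item, 1) pair per (user, item) record."""
    _user, item = record
    yield item, 1

def reduce_fn(key, values):
    """Reduce: aggregate all values that share the same key."""
    return key, sum(values)

def run_mapreduce(records, block_size=2):
    blocks = split_into_blocks(records, block_size)
    # Map phase: on a cluster each block would be handled by a different node.
    grouped = defaultdict(list)
    for block in blocks:
        for record in block:
            for key, value in map_fn(record):
                grouped[key].append(value)  # shuffle: group values by key
    # Reduce phase: one call per distinct key.
    return dict(reduce_fn(key, values) for key, values in grouped.items())

# Hypothetical purchase log: (user, item) pairs.
purchases = [("u1", "book"), ("u2", "book"), ("u1", "pen"),
             ("u3", "book"), ("u2", "pen")]
counts = run_mapreduce(purchases)
print(counts)
```

The user-supplied parts are only `map_fn` and `reduce_fn`; splitting, grouping, and scheduling belong to the framework, which is the encapsulation the text refers to.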
Figure 1 Parallel flow chart of MapReduce
Key technology research and implementation
1. The basic idea of the traditional item-based collaborative filtering algorithm
The traditional item-based collaborative filtering algorithm is divided into three parts. The first part computes the similarity between items; common similarity measures include cosine similarity, the Pearson correlation coefficient, and the Tanimoto coefficient.
This paper selects the Euclidean similarity algorithm. Assume there are two vectors $X$ and $Y$, with $X = (x_1, x_2, x_3)$ and $Y = (y_1, y_2, y_3)$. The Euclidean similarity $S(x, y)$ between $X$ and $Y$ is calculated as follows [7]:
\[ S(x, y) = \frac{1}{1 + d(x, y)} \tag{1} \]
where $d(x, y)$ is the distance between the vectors $X$ and $Y$, calculated as follows:
\[ d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2} \tag{2} \]
The second part calculates the users' rating matrix over the items, following the similarity matrix; the third part multiplies the item similarity matrix W by the user-item rating matrix to obtain the recommendation results.
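Formulas (1) and (2) can be checked with a few lines of Python. This is a minimal sketch; the function names and the sample vectors are illustrative and do not come from the paper.

```python
import math

def euclidean_distance(x, y):
    """Formula (2): Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def euclidean_similarity(x, y):
    """Formula (1): similarity is 1 for identical vectors and falls toward 0."""
    return 1.0 / (1.0 + euclidean_distance(x, y))

x = (1.0, 2.0, 3.0)
y = (1.0, 2.0, 3.0)
z = (4.0, 6.0, 3.0)

print(euclidean_similarity(x, y))  # identical vectors, distance 0
print(euclidean_similarity(x, z))  # distance sqrt(9 + 16 + 0) = 5
```

Adding 1 to the distance in the denominator keeps the similarity finite and bounds it to the interval (0, 1].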
In the traditional item-based collaborative filtering recommendation algorithm, the item similarity computation is the stage that limits the performance of the algorithm. If the number of users is n and the number of items is m, the time complexity of comparing all pairs of items is O(m^2), and the total search space is n users, so the time complexity of computing similarity is O(nm^2). When computing the item similarity matrix, the similarity of one pair of items is independent of the similarity of any other pair, so the similarity matrix can be computed in parallel.
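Because each pair of items is independent, the pairs can be handed to separate workers. The following Python sketch shows this with a thread pool standing in for cluster nodes (an assumption for illustration; the paper uses MapReduce, and CPython threads do not give real CPU parallelism). The helper names and the item vectors are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def similarity(x, y):
    """Euclidean similarity, formulas (1) and (2)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return 1.0 / (1.0 + d)

def pair_similarity(pair):
    """Work unit for one pair of items; needs no data from any other pair."""
    (i, vec_i), (j, vec_j) = pair
    return i, j, similarity(vec_i, vec_j)

def similarity_matrix(item_vectors, workers=4):
    """item_vectors maps each item to its rating vector over the n users."""
    items = sorted(item_vectors)
    # m items give m * (m - 1) / 2 pairs, each costing O(n): O(n * m^2) overall.
    pairs = combinations([(i, item_vectors[i]) for i in items], 2)
    matrix = {i: {i: 1.0} for i in items}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i, j, s in pool.map(pair_similarity, pairs):
            matrix[i][j] = s
            matrix[j][i] = s  # the measure is symmetric
    return matrix

item_vectors = {"A": [5, 3, 0], "B": [5, 3, 0], "C": [1, 0, 4]}
W = similarity_matrix(item_vectors)
```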
2. A new item-based algorithm based on MapReduce
The new algorithm is mainly divided into two steps. The first step is a MapReduce implementation of the K-means algorithm to cluster the users. The second step is the parallel item-based recommendation algorithm on MapReduce, which produces product recommendations for the user clusters.
2.1 The new K-means algorithm based on MapReduce
The basic idea of the traditional K-means clustering algorithm is as follows: arbitrarily choose K objects from the M data objects as the initial cluster centers; assign each remaining object to the cluster whose center it is closest to; then recompute the center of each newly formed cluster; and repeat this process until the centers no longer change. In the K-means algorithm, computing the distances between data objects and cluster centers is the most time-consuming operation.
While one data object is being compared with the K cluster centers, other data objects can be compared with the K cluster centers at the same time, so the operation can be parallelized [8].
A parallel implementation of K-means based on MapReduce can improve the speed of the clustering algorithm. It is divided into three steps.
The first step: the Map function computes, for every point, the distance to each center and assigns the point to the nearest cluster center.
The second step: the Combine function runs on the machine that has just completed the Map task and sums the points on that machine that belong to the same cluster, reducing the communication and computation of the Reduce operation. The key to this step is that the Combine function first merges the points of the same cluster locally, which reduces both the data transferred to the Reduce function and its amount of computation.
The third step: the Reduce function gathers the intermediate data for each cluster center and computes the new cluster center.
Each iteration repeats these three steps.
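The three steps can be sketched in plain Python as follows. This is a single-process illustration under stated assumptions, not the paper's Hadoop implementation: each inner list of `blocks` plays the role of the data held by one Map machine, and the function names and sample points are hypothetical.

```python
def nearest_center(point, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))

def map_phase(points, centers):
    """Step 1 (Map): emit (cluster index, point) for each point in one block."""
    return [(nearest_center(p, centers), p) for p in points]

def combine_phase(mapped):
    """Step 2 (Combine): per block, sum the points of each cluster and count them."""
    partial = {}
    for k, p in mapped:
        total, count = partial.get(k, ([0.0] * len(p), 0))
        partial[k] = ([t + v for t, v in zip(total, p)], count + 1)
    return partial

def reduce_phase(partials, centers):
    """Step 3 (Reduce): merge the partial sums and divide to get new centers."""
    merged = {}
    for partial in partials:
        for k, (total, count) in partial.items():
            acc, n = merged.get(k, ([0.0] * len(total), 0))
            merged[k] = ([a + t for a, t in zip(acc, total)], n + count)
    new_centers = [list(c) for c in centers]  # empty clusters keep their center
    for k, (total, count) in merged.items():
        new_centers[k] = [t / count for t in total]
    return new_centers

def kmeans(blocks, centers, max_iter=20):
    """Repeat the three steps until the centers stop changing."""
    for _ in range(max_iter):
        partials = [combine_phase(map_phase(block, centers)) for block in blocks]
        new_centers = reduce_phase(partials, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers

blocks = [[(1.0, 1.0), (1.0, 2.0)], [(8.0, 8.0), (9.0, 8.0)]]
centers = kmeans(blocks, [[1.0, 1.0], [9.0, 8.0]])
print(centers)
```

Only one (sum, count) pair per cluster leaves each block, rather than every point, which is where the Combine step saves communication.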
Figure 2 Parallel Flow Chart of K-means Algorithm based on MapReduce
2.2 The collaborative filtering algorithm based on MapReduce for parallel implementation of Item-based
Based on the similarity calculation formula (1) mentioned above, this paper presents a collaborative filtering recommendation algorithm based on MapReduce.
Algorithm 1 The collaborative filtering recommendation algorithm based on MapReduce
INPUT: User information file, Item information file, Intended user
OUTPUT: Intended user recommended list
The process is as follows:
Step 1: Transform the user vectors into item vectors;
Step 2: Compute the similarity between items in parallel, according to formula (2);
Step 3: Compute the item similarity matrix in parallel;
Step 4: Compute the user rating matrix in parallel; when computing the user's rating matrix, if the user has not rated an item, the default score is 1;
Step 5: Multiply the item similarity matrix by the user's rating matrix in parallel to obtain the recommendation results.
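The five steps of Algorithm 1 can be traced on a toy data set with the sequential Python sketch below. Several points are assumptions made for illustration and are not stated in the paper: the default score of 1 is also used when building item vectors in Step 1, items the user has already rated are excluded from the output, and all names and ratings are hypothetical. Each step would be a separate MapReduce job in the parallel version.

```python
import math

DEFAULT_SCORE = 1.0  # Step 4: score assumed when a user has not rated an item

def to_item_vectors(user_ratings, users, items):
    """Step 1: turn per-user ratings into one rating vector per item."""
    return {i: [user_ratings[u].get(i, DEFAULT_SCORE) for u in users] for i in items}

def similarity(x, y):
    """Step 2: Euclidean similarity between two item vectors."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return 1.0 / (1.0 + d)

def recommend(user_ratings, target, top_n=2):
    users = sorted(user_ratings)
    items = sorted({i for ratings in user_ratings.values() for i in ratings})
    vectors = to_item_vectors(user_ratings, users, items)
    # Step 3: item similarity matrix W.
    W = {i: {j: similarity(vectors[i], vectors[j]) for j in items} for i in items}
    # Step 4: rating vector of the intended user, with the default score filled in.
    scores = {j: user_ratings[target].get(j, DEFAULT_SCORE) for j in items}
    # Step 5: multiply W by the rating vector to score every item.
    predicted = {i: sum(W[i][j] * scores[j] for j in items) for i in items}
    unseen = [i for i in items if i not in user_ratings[target]]
    return sorted(unseen, key=lambda i: predicted[i], reverse=True)[:top_n]

ratings = {
    "u1": {"A": 5, "B": 5},
    "u2": {"A": 5, "B": 5, "C": 5},
    "u3": {"A": 1, "D": 5},
}
print(recommend(ratings, "u1"))
```

For user u1, item C ranks above item D because C's rating vector is closer to those of A and B, which u1 rated highly.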
Experimental result analysis
1. Experimental environment
The simulation experiment uses the VMware_Workstation_10.0.3 virtualization software to virtualize a Hadoop cloud platform. Eight virtual machines are installed on the virtual Hadoop cloud platform, and a Hadoop cluster environment is built on these eight virtual machines. One of the virtual machines serves as the JobTracker and NameNode node, and the other seven virtual machines are deployed as TaskTracker and DataNode nodes. These machines are in the same local area network.
The hardware and software configuration of the eight virtual machines used in the experiment is shown in Table 1:
Table 1 Hadoop Cluster Configuration
OS            CentOS 6.4
JDK Version   1.6.0
Hadoop        1.1.2
Hardware      2 GB RAM, 100 GB hard disk
2. Experiment and analysis
A scalability comparison test is run for the MapReduce-based parallel implementation of the item-based collaborative filtering algorithm in parallel mode: with the data set size chosen, the running efficiency is measured on 1 to 8 nodes respectively.
The experimental results are shown below:
Figure 3 Performance Test Chart
Figure 3 is the performance test chart of the MapReduce-based parallel implementation of the item-based collaborative filtering algorithm; the x-axis is the number of clients, and the y-axis is the response time of the system. The experimental results show that the performance of the MapReduce-based parallel implementation of the item-based collaborative filtering algorithm is significantly improved compared with the traditional recommendation algorithm.
Conclusion
In this paper, a new collaborative filtering algorithm based on MapReduce is proposed. The experimental results show that the new algorithm is highly efficient and can achieve high performance at a low cost. However, the user clustering in this paper is done on users with a small number of attributes; user groups with high-dimensional attributes require further research. Beyond the new algorithm put forward in this paper, we will continue to improve the experimental method and steadily improve the accuracy of the recommendation algorithm.
References
[1] Chen Ruming. Challenges, values and coping strategies in the era of big data [J]. Mobile Communications, 2012(17): 14-15.
[2] Sun Lingfang, Zhang Jing. Electronic recommendation mechanism based on RFM model and collaborative filtering [J]. Journal of Jiangsu University of Science and Technology (Natural Science Edition), 2010, 24(3): 285-289.
[3] Li Gai, Pan Rong, et al. Collaborative filtering algorithm parallelize research based on large data sets [J]. Computer Engineering and Design, 2012, 33(6): 2437-2441.
[4] Li Wenhai, Xu Shuren. Design and implementation of recommendation system for E-commerce on Hadoop [J]. Computer Engineering and Design, 2014(35): 131-136.
[5] Sun Tianhao, Li Anneng, et al. Research on Distributed Collaborative Filtering Recommendation Algorithm Based on Hadoop [J]. Computer Engineering and Applications, 2014, 51(15): 124-128.
[6] Xie Xuelian, Li Lanyou. Research on Parallel K-means Algorithm Based on Cloud Computing Platform [J]. Computer Measurement & Control, 2014, 22(5): 1510-1512.
[7] Yan Cun, Ji Genlin. Design and Implementation of Item-Based Parallel Collaborative Filtering Algorithm [J]. Journal of Nanjing Normal University (Natural Science Edition), 2014, 37(1): 71-75.
[8] Wang Fei, Qin Xiaolin. Algorithm for k-means Based on Data Stream in Cloud Computing [J]. Computer Science, 2015, 42(11): 235-239.
