The New Algorithm of the Item-based on MapReduce

ZHAO Wei 1,a
1 College Software Technology School, Zhengzhou University, Zhengzhou 450002, China
aiezhaowei@163.com

Keywords: Recommendation system; parallel computing; clustering

Abstract. The traditional item-based collaborative filtering algorithm and the K-means clustering algorithm are studied, and a parallel item-based collaborative filtering algorithm is proposed using the MapReduce programming model. The algorithm is divided into two steps: the first clusters users with the K-means algorithm, and the second runs the parallel item-based algorithm to generate recommendations for each user cluster. Experimental results show that the algorithm improves running speed and execution efficiency, and that the improved algorithm is well suited to processing big data.

Introduction

Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization. Volume means big data doesn't sample; it just observes and tracks what happens. Velocity means big data is often available in real time. Variety means big data draws from text, images, audio and video, and it completes missing pieces through data fusion [1]. Therefore, objective results can be obtained from big data only through computer-based statistics, comparison and analysis.

In today's electronic commerce systems, every transaction, every input and every search can be treated as data. The computer system screens, sorts and analyzes these data, so the analysis yields not only an objective conclusion but also support for enterprise decision-making; the useful data collected can also support reasonable planning, actively guide the development of greater consumption power, and enable more effective marketing and promotion. With the increasing amount of data in electronic commerce systems, the need for in-depth analysis of large amounts of data is increasingly urgent. Therefore, a simple and highly scalable program for analyzing product recommendations is particularly important.

At present, many domestic e-commerce sites, such as Amazon and Dangdang, use collaborative filtering algorithms. Collaborative filtering algorithms are mainly divided into item-based collaborative filtering and user-based collaborative filtering. Item-based collaborative filtering measures the similarity between items according to users' preferences and does not need to consider the specific content features of the items, so the algorithm is mainly used in e-commerce recommendation and movie recommendation, and it has achieved a certain degree of success in e-commerce recommendation. However, when recommending over massive data, its recommendation performance is not high, data cannot be shared and extended, and the hardware requirements are comparatively high; these inherent shortcomings have kept it from being promoted and supported in enterprise electronic commerce [2]. Therefore, using MapReduce to achieve distributed parallel computing will greatly improve the efficiency and performance of the algorithm and promote its further development [3-4].

The item-based collaborative filtering algorithm generates a list of recommended items for a user according to item similarity and the user's historical access records. It has some problems, however, such as data sparsity; and when the numbers of users and items are massive, the user behavior and record data grow greatly, computing the item similarity matrix becomes very costly, and the efficiency and performance of the algorithm drop sharply. To address these problems, clustering has also been applied to the item-based collaborative filtering algorithm: the massive user base is clustered, which avoids performing a separate recommendation operation for each user. First, shopping users with similar interests are grouped into one user class, and the users in a cluster are recommended the same goods. Second, the massive user dimension is reduced to a limited few dozen clusters. The time complexity of clustering then becomes a bottleneck, and a parallel clustering algorithm using MapReduce is an effective way to solve it [5].

MapReduce is a distributed programming model framework on the Hadoop platform that allows programs to be implemented without familiarity with the underlying details of distributed execution [6]. Using MapReduce as the parallel computing programming model, this paper first clusters users in parallel with MapReduce and then, according to the clustering results, runs MapReduce-based parallel collaborative filtering within every user class, eventually giving users a reasonable personalized commodity recommendation list. The running time of the new algorithm on a fixed amount of data is compared across different numbers of nodes. The results show that the data processing performance of the proposed algorithm is greatly improved.

The principle of MapReduce programming model

MapReduce is the parallel computing programming model used on the Hadoop platform. It was proposed by Google as a typical distributed parallel programming model: users develop the map and reduce functions in the MapReduce model to realize parallel processing. Map is responsible for dispersing the data, and Reduce is responsible for aggregating it. Users only need to implement the two interfaces, Map and Reduce, to complete computation on TB-level data. Because the MapReduce model encapsulates the details of parallel and fault-tolerant processing, programming is very easy. A MapReduce parallel computation is divided into two parts: the first step initializes the original input data file and divides the data set into a number of data blocks of a certain size to facilitate parallel computing; the second step starts the parallel computation of the map and reduce functions, which finally produces the final result.
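
The two-part flow described above can be illustrated with a minimal single-process sketch in Python. It is not Hadoop code: the function names `map_fn`, `reduce_fn` and `run_mapreduce`, the item-counting task and the sample records are all assumptions chosen for illustration, and the splits are processed sequentially where a cluster would process them on separate nodes.

```python
from collections import defaultdict
from itertools import chain


def map_fn(record):
    """Map: emit one (item, 1) pair for every item in a user's record."""
    _user, items = record
    return [(item, 1) for item in items]


def reduce_fn(key, values):
    """Reduce: aggregate all values that were emitted for one key."""
    return key, sum(values)


def run_mapreduce(records, map_fn, reduce_fn, num_splits=2):
    # Part 1: divide the input data set into blocks (splits).
    splits = [records[i::num_splits] for i in range(num_splits)]
    # Part 2a: apply the map function to every split independently.
    mapped = [[pair for rec in split for pair in map_fn(rec)] for split in splits]
    # Shuffle: group the intermediate values by key.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped):
        groups[key].append(value)
    # Part 2b: apply the reduce function to every key group.
    return dict(reduce_fn(key, values) for key, values in groups.items())


# Hypothetical purchase records: (user, items bought).
records = [("u1", ["a", "b"]), ("u2", ["b", "c"]), ("u3", ["a", "b"])]
item_counts = run_mapreduce(records, map_fn, reduce_fn)
print(item_counts)
```

Only `map_fn` and `reduce_fn` are specific to the task; splitting and grouping are generic, which is the division of labor the programming model relies on.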

Figure 1 Parallel flow chart of MapReduce

Key technology research and Implementation

1. The basic idea of the traditional item-based collaborative filtering algorithm

The basic idea of the traditional item-based collaborative filtering algorithm is divided into three parts. The first part computes the similarity between items; common similarity measures include cosine similarity, the Pearson correlation coefficient and the Tanimoto coefficient. This paper selects the Euclidean similarity algorithm, as follows. Assume there are a vector X and a vector Y, with $X = (x_1, x_2, x_3)$ and $Y = (y_1, y_2, y_3)$. The Euclidean similarity $S(x, y)$ between X and Y is calculated as follows [7]:

$$S(x, y) = \frac{1}{1 + d(x, y)} \qquad (1)$$

where $d(x, y)$ is the distance between the vectors X and Y, calculated as follows:

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2} \qquad (2)$$

The second part, following the similarity matrix, calculates the users' rating matrix on the items; the third part multiplies the item similarity matrix W by the users' item rating matrix to obtain the recommendation results.
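
Formulas (1) and (2) can be checked numerically with a short Python sketch. The function names and the sample vectors are assumptions made for illustration; the code works for vectors of any equal length, not only the three components used in the text.

```python
import math


def euclidean_distance(x, y):
    """Formula (2): square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def euclidean_similarity(x, y):
    """Formula (1): S(x, y) = 1 / (1 + d(x, y)), which lies in (0, 1]."""
    return 1.0 / (1.0 + euclidean_distance(x, y))


# Hypothetical vectors chosen so that the distance is easy to verify by hand.
x = (1.0, 2.0, 3.0)
y = (1.0, 2.0, 3.0)  # identical to x, so d = 0 and S = 1
z = (4.0, 6.0, 3.0)  # d(x, z) = sqrt(9 + 16 + 0) = 5, so S = 1/6

sim_xy = euclidean_similarity(x, y)
sim_xz = euclidean_similarity(x, z)
print(sim_xy, sim_xz)
```

Identical vectors give the maximum similarity of 1, and the similarity decreases toward 0 as the distance grows.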

In the traditional item-based collaborative filtering recommendation algorithm, computing the item similarity is the stage that dominates the performance of the algorithm. If the number of users is n and the number of items is m, comparing all pairs of items takes $O(m^2)$ comparisons, and each comparison ranges over n users, so the time complexity of computing the similarity matrix is $O(nm^2)$. However, the similarity of one pair of items is independent of the similarity of any other pair, so the similarity matrix can be computed in parallel.
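
Because each item pair is independent, the pairs can simply be handed out to workers. The sketch below uses a Python thread pool as a stand-in for cluster nodes; the item vectors and the names `item_vectors` and `pair_similarity` are assumptions for illustration, and threads in CPython only demonstrate the task decomposition rather than a real speed-up for this kind of computation.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Hypothetical item vectors: the ratings each item received from three users.
item_vectors = {
    "a": (5.0, 3.0, 0.0),
    "b": (5.0, 3.0, 0.0),
    "c": (1.0, 0.0, 0.0),
}


def pair_similarity(pair):
    """Compute the Euclidean similarity of one item pair, using no other pair."""
    i, j = pair
    distance = math.sqrt(
        sum((p - q) ** 2 for p, q in zip(item_vectors[i], item_vectors[j]))
    )
    return (i, j), 1.0 / (1.0 + distance)


# m items give m * (m - 1) / 2 unordered pairs; each one is a separate task.
pairs = list(combinations(sorted(item_vectors), 2))
with ThreadPoolExecutor(max_workers=3) as pool:
    similarity = dict(pool.map(pair_similarity, pairs))

print(similarity)
```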

2. A new item-based algorithm based on MapReduce

The new algorithm is divided into two steps. The first step is the MapReduce implementation of the K-means algorithm for clustering users. The second step is the parallel item-based recommendation algorithm on MapReduce, which generates product recommendations for each user cluster.

2.1 The new K-means algorithm based on MapReduce

The basic idea of the traditional K-means clustering algorithm is as follows: arbitrarily choose K objects from the M data objects as the initial cluster centers; assign each of the remaining objects to the most similar cluster according to its distance to the cluster centers; then calculate the new center of each resulting cluster; and keep repeating the process until the cluster centers no longer change. In the K-means algorithm, calculating the distance between data objects and cluster centers is the most time-consuming operation. While one data object is being compared with the K cluster centers, other data objects can be compared with the K cluster centers at the same time, so the operation can be parallelized [8].

The MapReduce-based parallel implementation of the K-means algorithm, which improves the speed of clustering, is divided into three steps. The first step is the map function, which for every point calculates the distance to each center and assigns the point to the nearest cluster center. The second step is the Combine function, which, on the machine that has just completed the Map task, sums the points assigned to the same cluster. This step is the key: using the Combine function to first merge the points of the same cluster locally reduces the amount of data transferred to, and computed by, the Reduce function. The third step is the Reduce function, which aggregates the intermediate data of each cluster to obtain the new cluster center. These three steps are repeated in each iteration.
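
The map, combine and reduce steps can be sketched in plain Python as follows. This is a single-process illustration under stated assumptions, not the paper's Hadoop implementation: the function names, the sample points, the initial centers and the rule of keeping the old center for an empty cluster are all hypothetical choices, and `math.dist` requires Python 3.8 or later.

```python
import math


def kmeans_map(split, centers):
    """Map: emit (index of the nearest center, point) for every point in a split."""
    def nearest(point):
        return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))
    return [(nearest(p), p) for p in split]


def kmeans_combine(mapped):
    """Combine: per split, collapse each cluster to a (coordinate sum, count) pair."""
    partial = {}
    for k, p in mapped:
        s, n = partial.get(k, ((0.0,) * len(p), 0))
        partial[k] = (tuple(a + b for a, b in zip(s, p)), n + 1)
    return partial


def kmeans_reduce(partials, centers):
    """Reduce: merge the partial sums of all splits and compute the new centers."""
    totals = {}
    for partial in partials:
        for k, (s, n) in partial.items():
            ts, tn = totals.get(k, ((0.0,) * len(s), 0))
            totals[k] = (tuple(a + b for a, b in zip(ts, s)), tn + n)
    new_centers = list(centers)  # assumption: an empty cluster keeps its old center
    for k, (s, n) in totals.items():
        new_centers[k] = tuple(v / n for v in s)
    return new_centers


def kmeans(splits, centers, max_iter=100):
    """Repeat map, combine and reduce until the centers stop changing."""
    for _ in range(max_iter):
        partials = [kmeans_combine(kmeans_map(split, centers)) for split in splits]
        new_centers = kmeans_reduce(partials, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Hypothetical data: two splits, as if stored on two different machines.
splits = [[(1.0, 1.0), (9.0, 9.0)], [(1.0, 3.0), (9.0, 7.0)]]
final_centers = kmeans(splits, centers=[(0.0, 0.0), (10.0, 10.0)])
print(final_centers)
```

The combine step sends one (sum, count) pair per cluster from each split to the reducer instead of every individual point, which is the communication saving the text describes.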

Figure 2 Parallel Flow Chart of K-means Algorithm based on MapReduce

2.2 The parallel implementation of the item-based collaborative filtering algorithm based on MapReduce

Based on the similarity calculation formula (1) mentioned above, this paper presents a collaborative filtering recommendation algorithm based on MapReduce.

Algorithm 1 The collaborative filtering recommendation algorithm based on MapReduce
INPUT: User information file, Item information file, Intended user
OUTPUT: Intended user recommended list
The process is as follows:
Step 1: Transform the user vectors into item vectors;
Step 2: Calculate the similarity between items in parallel, according to formulas (1) and (2);
Step 3: Compute the item similarity matrix in parallel;
Step 4: Compute the user rating matrix in parallel; when calculating the user's rating matrix, if the user has no rating for an item, the default score is 1;
Step 5: Multiply the item similarity matrix by the user's rating matrix in parallel, and recommend items according to the results.
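
The five steps of Algorithm 1 can be traced on a tiny example with the sequential Python sketch below. It only mirrors the data flow; the parallel MapReduce jobs are not reproduced. The rating data, the names `ratings`, `score`, `W` and `recommend`, and the choice to apply the default score of 1 when building the item vectors as well are assumptions for illustration (`math.dist` requires Python 3.8 or later).

```python
import math

# Hypothetical input: user -> {item: score}.
ratings = {
    "u1": {"a": 5.0, "b": 4.0},
    "u2": {"a": 4.0, "b": 5.0, "c": 2.0},
    "u3": {"c": 5.0},
}
users = sorted(ratings)
items = sorted({item for scores in ratings.values() for item in scores})
DEFAULT_SCORE = 1.0  # Step 4: score assumed when a user has not rated an item


def score(user, item):
    return ratings[user].get(item, DEFAULT_SCORE)


# Step 1: transform the user vectors into item vectors (one entry per user).
item_vectors = {i: [score(u, i) for u in users] for i in items}


# Steps 2-3: Euclidean similarity for every item pair gives the matrix W.
def similarity(x, y):
    return 1.0 / (1.0 + math.dist(x, y))


W = {i: {j: similarity(item_vectors[i], item_vectors[j]) for j in items} for i in items}


# Steps 4-5: multiply W by the user's score vector and rank the unrated items.
def recommend(user, top_n=1):
    unrated = [i for i in items if i not in ratings[user]]
    predicted = {i: sum(W[i][j] * score(user, j) for j in items) for i in unrated}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]


recommended_for_u1 = recommend("u1")
print(recommended_for_u1)
```

Each row of W and each user's product in Step 5 depends only on its own inputs, which is what makes the per-step parallelization in Algorithm 1 possible.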

Experimental result analysis

1. Experimental environment

The simulation experiment uses the virtualization software VMware_Workstation_10.0.3 to build a virtual Hadoop cloud platform. Eight virtual machines are installed on the virtual Hadoop cloud platform, and a Hadoop cluster environment is built on these eight virtual machines. One of the virtual machines serves as the JobTracker and NameNode node, and the other seven virtual machines are deployed as TaskTracker and DataNode nodes. These machines are in the same local area network.
The hardware and software configuration of the eight virtual machines used in the experiment is shown in Table 1.

Table 1 Hadoop Cluster Configuration
OS            CentOS 6.4
JDK Version   1.6.0
Hadoop        1.1.2
Hardware      2G RAM, 100G Hard Disk

2. Experiment and analysis

The scalability of the MapReduce-based parallel item-based collaborative filtering algorithm was tested by running it on data sets of selected sizes with 1 to 8 nodes and comparing the efficiency. The experimental results are shown below:

Figure 3 Performance Test Chart

Figure 3 is the performance test chart of the MapReduce-based parallel implementation of the item-based collaborative filtering algorithm; the x-axis is the number of clients, and the y-axis is the response time of the system. The experimental results show that the performance of the MapReduce-based parallel item-based collaborative filtering algorithm is significantly improved compared with the traditional recommendation algorithm.

Conclusion

In this paper, a new collaborative filtering algorithm based on MapReduce is proposed. The experimental results show that the new algorithm is highly efficient and can achieve high performance at a low cost. However, the user clustering in this paper is performed on users with a small number of attributes; user groups with high-dimensional attributes require further research. In addition to the new algorithm put forward in this paper, we will continue to improve the experimental method and the accuracy of the recommendation algorithm.

References

[1] Chen Ruming. Challenges, values and coping strategies in the era of big data [J]. Mobile Communications, 2012(17): 14-15.
[2] Sun Lingfang, Zhang Jing. Electronic recommendation mechanism based on RFM model and collaborative filtering [J]. Journal of Jiangsu University of Science and Technology (Natural Science Edition), 2010, 24(3): 285-289.
[3] Li Gai, Pan Rong, et al. Collaborative filtering algorithm parallelize research based on large datasets [J]. Computer Engineering and Design, 2012, 33(6): 2437-2441.
[4] Li Wenhai, Xu Shuren. Design and implementation of recommendation system for E-commerce on Hadoop [J]. Computer Engineering and Design, 2014(35): 131-136.
[5] Sun Tianhao, Li Anneng, et al. Research on Distributed Collaborative Filtering Recommendation Algorithm Based on Hadoop [J]. Computer Engineering and Applications, 2014, 51(15): 124-128.
[6] Xie Xuelian, Li Lanyou. Research on Parallel K-means Algorithm Based on Cloud Computing Platform [J]. Computer Measurement & Control, 2014, 22(5): 1510-1512.
[7] Yan Cun, Ji Genlin. Design and Implementation of Item-Based Parallel Collaborative Filtering Algorithm [J]. Journal of Nanjing Normal University (Natural Science Edition), 2014, 37(1): 71-75.
[8] Wang Fei, Qin Xiaolin. Algorithm for k-means Based on Data Stream in Cloud Computing [J]. Computer Science, 2015, 42(11): 235-239.
