Mellanox Technologies Inc.
2900 Stender Way, Santa Clara, CA 95054
Tel: 408-970-3400
Fax: 408-970-3403
http://www.mellanox.com

Real Application Performance and Beyond

White Paper: Real Application Performance and Beyond
2006 Mellanox Technologies Inc.

Scientists, engineers and analysts in virtually every field are turning to high-performance computing to solve today's vital and complex problems. Simulations are increasingly replacing expensive physical testing, as more complex environments can be modeled and, in some cases, fully simulated. High-performance computing encompasses advanced computation over parallel processing, enabling faster execution of highly compute-intensive tasks such as climate research, molecular modeling, physical simulations, cryptanalysis, geophysical modeling, automotive and aerospace design, financial modeling, data mining and more. HPC clusters have become the most common building blocks for high-performance computing, not only because they are affordable, but because they provide the needed flexibility and deliver superior price/performance compared to proprietary symmetric multiprocessing (SMP) systems, with the simplicity and value of industry-standard computing.

Maui High Performance Computing Center: 1280 servers, Mellanox InfiniBand interconnect, 42.3 TFlops

Real-world application performance depends on the performance of the cluster's key elements: the processor, the memory, and the interconnect.
The interconnect controls the data transfer between servers, and has a strong influence on CPU efficiency and memory utilization. Transport-offload interconnect architectures, unlike "on-loading" ones, eliminate the need for protocol processing within the CPU and therefore increase the number of cycles available for computational tasks. If the CPU is busy moving data and handling network protocol processing, it is unable to perform computational work, and the overall productivity of the system is severely degraded.

The memory copy overhead includes the resources required to copy data buffers from the network device to kernel memory and then from kernel memory to application memory. This approach requires multiple memory accesses before the data is placed in its final destination. While this is not a major problem for small data transfers, it is a big problem for larger ones. This is where the interconnect's zero-copy capabilities eliminate the memory bandwidth bottleneck without involving the CPU in the network data transfer.
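
The copy overhead described above can be put into rough numbers. The following is a minimal sketch under a simplified model that is an assumption of this example, not a measurement: the network device writes the buffer into host memory once, and every additional copy reads the buffer once and writes it once. Caches and DMA details are ignored, and the function names are hypothetical.

```python
def memory_bytes_touched(message_bytes: int, copies: int) -> int:
    """Host memory traffic (bytes read plus bytes written) for one message.

    Simplified model (an assumption, not a measurement):
    - the network device writes the buffer into memory once;
    - each extra copy reads the buffer once and writes it once.
    """
    if message_bytes < 0 or copies < 0:
        raise ValueError("message_bytes and copies must be non-negative")
    device_write = message_bytes
    copy_traffic = copies * 2 * message_bytes
    return device_write + copy_traffic


def extra_traffic_from_copy(message_bytes: int) -> int:
    """Extra memory traffic of one kernel-to-application copy versus zero-copy."""
    with_copy = memory_bytes_touched(message_bytes, copies=1)
    zero_copy = memory_bytes_touched(message_bytes, copies=0)
    return with_copy - zero_copy


if __name__ == "__main__":
    # The message sizes mirror the 64 byte to 4 megabyte range quoted in the text.
    for size in (64, 4 * 1024 * 1024):
        print(size, extra_traffic_from_copy(size))
```

In this model the ratio between the copied path and the zero-copy path is the same for every message size, but the absolute extra traffic grows with the message, which is why the copy matters mostly for large transfers.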

Sandia National Lab: 4500 servers, Mellanox InfiniBand interconnect, 53 TFlops, 84.66% Linpack efficiency

Interconnect bandwidth and latency have traditionally been used as the two metrics for assessing the performance of a system's interconnect fabric.
However, these two metrics are typically not sufficient to determine the performance of real-world applications. Typical real-world applications send messages ranging from 64 bytes to 4 megabytes, using not only point-to-point communication but a diverse mixture of communication patterns, including collective and reduction patterns in the case of MPI. In some cases, interconnect vendors create artificial benchmarks, such as message rate, and apply bombastic marketing slogans to these benchmarks, such as "Hypermessaging". Message rate is yet another single point on the point-to-point bandwidth graph. If the traditional interconnect bandwidth figure indicates the maximum available bandwidth (a single point), message rate indicates the bandwidth for a message size of zero or 2 bytes. These single points of data give some indication of interconnect performance, but are far from describing real-world application performance. The interaction of those points, together with others (CPU overhead, zero copy, etc.), will determine the overall capability of the connectivity solution.
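
The relationship between message rate and bandwidth can be made concrete: the payload bandwidth at a given message size is simply the message rate multiplied by that size. The sketch below assumes a hypothetical message rate of 10 million messages per second, which is not a figure from any vendor, and the function names are illustrative only.

```python
def bandwidth_mb_per_s(messages_per_second: float, message_bytes: int) -> float:
    """Payload bandwidth in MByte/s for a given message rate and message size."""
    return messages_per_second * message_bytes / 1e6


def message_sizes(smallest: int = 64, largest: int = 4 * 1024 * 1024) -> list:
    """Power-of-two message sizes covering the range quoted in the text."""
    sizes = []
    size = smallest
    while size <= largest:
        sizes.append(size)
        size *= 2
    return sizes


if __name__ == "__main__":
    hypothetical_rate = 10_000_000  # messages per second, an assumed value
    # A message-rate benchmark with 2-byte messages corresponds to one
    # low point on the bandwidth graph.
    print(bandwidth_mb_per_s(hypothetical_rate, 2))
    # A realistic evaluation would sweep the whole range instead.
    print(message_sizes())
```

Under this assumed rate, 2-byte messages carry only 20 MByte/s of payload, and zero-byte messages carry none, so a message-rate figure by itself says little about the 64 byte to 4 megabyte range that applications use.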

The difference between theoretical power and what is actually delivered is measured as processor efficiency. The more CPU cycles are used to get the data out the door by "filling the wire" due to protocol and data transfer inefficiencies, the fewer cycles are available for the application.
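
As a rough illustration of the two quantities in this paragraph, the sketch below expresses processor efficiency as delivered performance divided by theoretical peak, and the cycles left for the application as the total minus the share spent on protocol and data movement. All numbers are hypothetical, and the helper names are assumptions made for this example.

```python
def processor_efficiency(delivered: float, theoretical_peak: float) -> float:
    """Fraction of the theoretical peak that is actually delivered."""
    if theoretical_peak <= 0:
        raise ValueError("theoretical_peak must be positive")
    return delivered / theoretical_peak


def application_cycles(total_cycles: float, overhead_fraction: float) -> float:
    """Cycles left for the application after communication overhead.

    overhead_fraction is the assumed share of CPU cycles spent on protocol
    processing and data movement (between 0 and 1).
    """
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead_fraction must be between 0 and 1")
    return total_cycles * (1.0 - overhead_fraction)


if __name__ == "__main__":
    # Hypothetical system: 100 units of peak performance, 80 delivered.
    print(processor_efficiency(80.0, 100.0))
    # Hypothetical 3 GHz core spending a quarter of its cycles on networking.
    print(application_cycles(3.0e9, 0.25))
```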

When comparing latencies of different interconnects, one needs to pay attention to the interconnect architecture. Choosing between a 1 usec latency "on-loading" interconnect and a 2 usec latency "off-load" solution is similar to deciding between two cars that show the same horsepower (i.e., CPU). Both engines are capable of 200 miles per hour, but the first car, due to "on-loading", limits the actual engine power to 75 miles per hour (the engine power must be used for other tasks). The second car has no limitations on the engine, but its wheels can tolerate only 150 miles per hour. Knowledge of the wheels' tolerance (i.e., latency), as a single point of data, is definitely misleading.

There are attempts to present real-world application performance when comparing different interconnects, but in most cases the "comparison" is biased by the use of different systems and/or conditions, which makes a true comparison difficult. There have been recent cases comparing 10-Gigabit Ethernet to InfiniBand. While the InfiniBand adapters were tested with PCIe x4 (which is limited to ~700 MByte/sec bandwidth due to limitations in currently available systems), the 10-Gigabit Ethernet cards were PCI-X, which is capable of higher bandwidth (~850-900 MByte/s). Other cases compare InfiniBand PCIe x4 to other interconnects with a PCIe x8 host interface (the only valid conclusion one can make is that PCIe x8 has more lanes than PCIe x4). Another paper compared QLogic InfiniPath on an Intel 3 GHz CPU based system to Mellanox InfiniBand on a 2.2 GHz Opteron based system. Any attempt to compare different interconnects in this manner is deceptive.

Real application performance

InfiniBand is a proven interconnect for clustered server solutions, and one of the leading connectivity solutions for high-performance computing. InfiniBand was designed as a general-purpose I/O interconnect and in practice provides low latency and the highest link speed.

Computational Fluid Dynamics (CFD) is one of the branches of fluid mechanics that uses numerical methods and algorithms to solve and analyze problems that involve fluid flows. ANSYS/FLUENT is a leading commercial software provider for solving fluid flow problems. The broad physical modeling capabilities of FLUENT have been applied to industrial applications ranging from air flow over an aircraft wing to combustion in a furnace, from bubble columns to glass production, from blood flow to semiconductor manufacturing, from clean room design to wastewater treatment plants. The ability of the software to model in-cylinder engines, aeroacoustics, turbomachinery, and multiphase systems has served to broaden its reach.

At the core of any CFD calculation is a computational grid, used to divide the solution domain into thousands or millions of elements where the problem variables are computed and stored. In FLUENT, unstructured grid technology is used, which means that the grid can consist of elements in a variety of shapes: quadrilaterals and triangles for 2D simulations, and hexahedra, tetrahedra, prisms, and pyramids for 3D simulations. These elements form an interlocking network throughout the volume where the fluid flow analysis takes place.

The performance of a CFD code depends on several factors, including the size and topology of the mesh, physical models, numerics and parallelization, compilers and optimization, in addition to the performance characteristics of the hardware where the simulation is performed. FLUENT provides a set of benchmark problems that represent typical current usage and cover a wide range of mesh sizes and physical models. The problems selected represent a range of simulations typical of those that might be found in industry. The principal objective of this benchmark suite is to provide comprehensive and fair comparative information on the performance of FLUENT on available hardware platforms.

The following charts compare the Mellanox InfiniBand and QLogic InfiniPath interconnects on the same platform: dual-core, dual-socket Intel Xeon 5100 series (code name Woodcrest) 3 GHz servers, using FLUENT benchmarks. When testing real-world applications, the entire architecture makes the difference. The Mellanox architecture is a full transport-offload one, with hardware RDMA capabilities, while QLogic's is a full "on-loading" architecture.

In the FLUENT FL5L3 benchmark, a turbulent flow of air through a duct is computed. The cross-sectional planes of the duct transition from a circle at the inlet to a rectangle at the outflow boundary. The Reynolds-Stress Model is used for computing turbulence (number of cells: 9,792,512; cell type: hexahedral; models: RSM turbulence; solver: segregated implicit).

The FLUENT FL5L2 benchmark represents the computation of the exterior flow field around a simplified model of a passenger sedan. The simulation geometry was used for the Japan External Aerodynamics competition. A viscous-hybrid grid with prismatic cells is used to adequately model the boundary layer regions (number of cells: 3,618,080; cell type: hybrid; models: k-epsilon turbulence; solver: segregated implicit).

[Chart: Fluent 6.3, FL5L3 case. Rating (performance) versus CPU cores, QLogic and Mellanox.]

[Chart: Fluent 6.3, FL5L2 case. Rating (performance) versus CPU cores, QLogic and Mellanox.]

Choosing the right interconnect

In both FLUENT benchmark cases, Mellanox InfiniBand shows higher performance and better super-linear scaling compared to QLogic InfiniPath. FLUENT's CFD application is latency-sensitive, and the results shown here are good examples of how pure latency benchmarks can be misleading when choosing the right interconnect. To determine a system's performance, one should take into consideration the entire interconnect architecture (such as off-loading versus on-loading) and its ability to scale, rather than just single points of data.
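
Scaling claims such as "super-linear" can be checked from benchmark ratings with a few lines of arithmetic. The sketch below assumes that the rating is proportional to throughput (higher is better), as the "Rating (performance)" axis of the charts suggests. The ratings used in the example are hypothetical and are not read from the charts, and the function names are illustrative.

```python
def scaling_efficiency(base_cores: int, base_rating: float,
                       cores: int, rating: float) -> float:
    """Measured speedup divided by ideal (linear) speedup.

    A value of 1.0 means linear scaling; above 1.0 means super-linear.
    Assumes the rating is proportional to throughput.
    """
    if min(base_cores, cores) <= 0 or min(base_rating, rating) <= 0:
        raise ValueError("core counts and ratings must be positive")
    speedup = rating / base_rating
    ideal_speedup = cores / base_cores
    return speedup / ideal_speedup


def is_super_linear(base_cores: int, base_rating: float,
                    cores: int, rating: float) -> bool:
    """True when performance grows faster than the core count."""
    return scaling_efficiency(base_cores, base_rating, cores, rating) > 1.0


if __name__ == "__main__":
    # Hypothetical ratings: 100 on 4 cores, 220 on 8 cores.
    print(scaling_efficiency(4, 100.0, 8, 220.0))
    print(is_super_linear(4, 100.0, 8, 220.0))
```

Comparing this efficiency across core counts for each interconnect, on the same platform, gives a view of the ability to scale that a single latency or bandwidth number cannot.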

To provide better insight into application performance, Mellanox has created the Mellanox Cluster Center. The Mellanox Cluster Center offers an environment for developing, testing, benchmarking and optimizing products based on InfiniBand technology. The center, located in Santa Clara, California, provides on-site technical support and enables secure sessions on site or remotely. More details are available on the Mellanox website.