Intel: Accelerating the Path to Exascale

Kirk Skaugen
Vice President, Intel Architecture Group
General Manager, Data Center Group

An Insatiable Need For Computing
Exascale problems cannot be solved using the computing power available today.
[Chart: performance from 100 MFlops to 1 ZFlops versus year, 1993 to 2029, with forecast; marked applications: Weather Prediction, Medical Imaging, Genomics Research. Source: www.top500.org]

Exascale Answers Mankind's Challenges In…
- Weather/Climate
- Healthcare
- New Forms of Energy

We've Helped Transform Industries
- Supercomputing in 1997: ~1 TFLOP, ~$55K/GFLOP
- Supercomputing in 2010: 500 TFLOPS
[Charts: Performance, $/GFLOP, and Annual Server Processor Shipments, 1995 onward]

Intel Commitment To Exascale
- Programming Parallelism
- Efficient Performance
- Extreme Scalability
Intel Exascale Commitment: >100X Performance Of Today At Only 2X The Power of Today's #1 System
Scaling Today's Software Model

Exascale Requirements
Petascale Machine of 2010: TFLOP of Compute
Visceral Focus on System Power Efficiency

  Component       Improvement
  Compute         40x
  Memory          75x
  Comms           20x
  Disk/Storage    33x
  Other           900x

Estimation based on petascale machine requirements circa 2010.

Scaling Programmability
One Programming Model Democratizes Usage… Avoid Costly Detours

Process Technology Leadership
The foundation for all computing

  Year   Node    Milestone
  2003   90 nm   Invented SiGe Strained Silicon
  2005   65 nm   2nd Gen. SiGe Strained Silicon
  2007   45 nm   Invented Gate-Last High-k Metal Gate
  2009   32 nm   2nd Gen. Gate-Last High-k Metal Gate
  2011   22 nm   First to Implement Tri-Gate

22 nm: A Revolutionary Leap in Process Technology
- 37% Performance Gain at Low Voltage*
- >50% Active Power Reduction at Constant Performance*
Source: Intel. *Compared to Intel 32 nm Technology

Intel Labs & HPC
- Strong Research Partnerships: Universities, Government, Industry
- World Class Research in HPC
*Other names, logos and brands may be claimed as the property of others.

Delivering Breakthrough Technologies to Fuel Innovation

Continuing The Journey: Next Intel Xeon Processor Codenamed Sandy Bridge-EP
Powerful. Intelligent.
- Efficient I/O: Integrated PCIe reduces latency and power
- Growing Performance: Up to 8 cores per socket; 2X FLOPS with Intel Advanced Vector Extensions
The Foundation of the Innovation in Science and Technology

Highly Parallel Performance: Intel Many Integrated Core (Intel MIC) Architecture
Launching on 22 nm with >50 cores to provide outstanding performance for HPC users

A Step Forward In Dealing With Efficient Performance & Programmability
- Programmability: The many benefits of broad Intel CPU programming models, techniques, and familiar x86 developer tools
- Performance Density: The compute density associated with specialty accelerators for parallel workloads
- Delivered Performance

Evaluating the Intel MIC Architecture
Arndt Bode
Leibniz Supercomputing Centre, Germany
with input from Iris Christadler, Alexander Heinecke and Volker Weinberg
June 2011, ISC, Hamburg

Evaluating the Intel MIC Architecture, Prof. A. Bode, LRZ, June 2011

Preface
- Programming models are the key to harnessing the computational power of massively parallel devices.
- Intel has clearly recognized this trend: it substantially supports open standards and invests in innovative programming models.
- LRZ and TUM have been using Intel hardware and software for many years and know the toolchain by heart.
- We expect: a hardware product that delivers good performance (and energy efficiency) without losing programmability.

Advantages of the MIC Architecture
- Is a standard x86 architecture!
- Allows many different parallel programming models like OpenMP, MPI and Intel Cilk!
- Offers standard math libraries like Intel MKL!
- Supports the whole Intel toolchain, e.g. Compiler & Debugger!
Writing MIC-accelerated code with minimal effort and great performance

Workloads under Investigation
- Euroben Kernels (7 dwarfs of HPC)
- Data Mining
- TifaMMy: Matrix Operations (Demo here at ISC'11!)
- Further Linear Algebra and Simulation Codes

Euroben Kernels
Selected micro-benchmarks used in PRACE for the evaluation of accelerator hardware & new languages: http://www.prace-project.eu/documents/public-deliverables/d6-6.pdf
- Example: mod2am: dense matrix-matrix multiplication (MxM)
[Chart: performance evaluation of mod2am on KNF with 30 cores @ 1050 MHz using Intel's Offload Compiler, single precision, data transfer times excluded]

Data Mining with Adaptive Sparse Grids
- Machine learning algorithm
- Learning a function from a training dataset
- Important workload for classification and regression of huge datasets
- MIC execution: straightforward
  - First version within a few hours
  - Optimized version took 2 days
[Chart, GFlops/s: WSM-EP X5670: 150; KNF 32/1200 (incl. offload): 420]
Test workload: Learning 5d checkerboard with 262144 instances and classification accuracy of 92%

TifaMMy: Idea and Application
- TifaMMy: self-adaptive and cache-oblivious framework for matrix operations, optimized on fat x86 cores
- This is done by nested recursions and vectorized kernels
  - On MIC only the kernels were changed; MIC's x86 cores are able to tackle nested recursions!
- The parallelization scheme employing OpenMP can be reused
- Given existing SSE kernels, bringing the code to MIC is nearly free

TifaMMy: Performance Matrix Multiplication
[Chart: GFLOPS (axis 0 to 700, with Max marked) versus matrix size (32 to 7648 in steps of 224)]
Test workload: TifaMMy executed on KNF with 32 cores @ 1200 MHz

Advantages of the MIC Architecture
- Is a standard x86 architecture!
- Allows many different parallel programming models like OpenMP, MPI and Intel Cilk!
- Offers standard math libraries like Intel MKL!
- Supports the whole Intel toolchain, e.g. Compiler & Debugger!
- Pre-release MIC-accelerated code for a typical scientific workload (e.g. Data Mining, TifaMMy) can reach up to 50% of peak performance!
Visit demo here at ISC'11!

"SGI understands the significance of inter-processor communications, power, density and usability when architecting for exascale. Intel has made the leap towards exaflop computing with the introduction of Intel Many Integrated Core (MIC) architecture. Future Intel MIC products will satisfy all four of these priorities, especially with their expected ten times increase in compute density coupled with their familiar X86 programming environment."
Dr. Eng Lim Goh, SGI CTO

Intel MIC Architecture: Needed for Exascale
Exaflop by 2018: 125x compute power
- 25x: Moore's Law
- 5x: remains

Intel MIC Architecture: Familiar x86 Programming

    #include
    #include
    #define N 1000000000LL
    main() {
        double pi = 0.0f;
        long i;
        #pragma offload target(mic)
        #pragma omp parallel for reduction(+:pi)
        for (i = 0; i

[The header names and the remainder of the listing are missing from the source.]

>100X Performance Of Today At Only 2X The Power Of Today's #1
Scaling Today's Software Model

System Configurations

All configurations below use the same KNF SW stack:
- Larrabee kernel driver ver. 1.6.197
- Flash Image/uOS: 1.0.0.1137 / 1.0.0.1137-EXT-HPC
- Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
- Native compiler (w/o data xfer): Version Alpha Build 20110518

System Configuration: 7 TFLOPS SGEMM in a node
- HW specifications: 8x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
- Host: Colfax CXT8000, 2-socket platform with 2 Intel Xeon processor X5690 (3.46 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, dual Intel 5520 IOH, OS: RHEL 6.0
- Colfax Model: CXT8000 Server w/ Intel 5520 chipset and 4 PLX PEX8647 Gen2 PCIe switches
- Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: Hybrid Computing with Intel MKL
- HW specifications: 1x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
- Host: Shady Cove, 2-socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
- Knights Ferry Software Development Platform (Shady Cove)
- Intel Alpha level software (Intel Compilers, Intel MKL, drivers etc.)
- SW specifications: MKL4KNF: MKLKNF.b2 build 20110518; MKL: 10.3.3

System Configuration: Hybrid Computing LU Factorization
- HW specifications: 1x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
- Host: Shady Cove, 2-socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
- Knights Ferry Software Development Platform (Shady Cove)
- Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: KISTI Molecular Dynamics
- HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
- Host: Dell Precision Workstation, 1-socket platform with 1 Intel Xeon processor X5620 (4 cores, 2.4 GHz, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
- Dell Precision Workstation
- Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: CERN openlab: Core Scaling of Intel MIC Architecture
- HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
- Host: SGI H4002, 2-socket platform with 2 Intel Xeon processor X5690 (6 cores, 3.46 GHz, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
- SGI H4002 System
- Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: LRZ: TifaMMy Matrix Multiplication
- HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
- Host: Shady Cove, 2-socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
- Knights Ferry Software Development Platform (Shady Cove)
- Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: FZ Jülich: SMMP Protein Folding
- HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
- Host: Shady Cove, 2-socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
- Knights Ferry Software Development Platform (Shady Cove)
- Intel Alpha level software (Intel Compilers, drivers etc.)