Intel: Accelerating the Path to Exascale
Kirk Skaugen
Vice President, Intel Architecture Group
General Manager, Data Center Group

An Insatiable Need For Computing
Exascale Problems Cannot Be Solved Using the Computing Power Available Today
[Chart: performance from 100 MFlops to 1 ZFlops over the years 1993 to 2029, with a forecast segment; application labels: Weather Prediction, Medical Imaging, Genomics Research. Source: www.top500.org]

Exascale Answers Mankind's Challenges In…
Weather/Climate
Healthcare
New Forms of Energy

We've Helped Transform Industries
[Charts: Performance, $/GFLOP, and Annual Server Processor Shipments, 1995 to 2015. Labels: Supercomputing in 1997; Supercomputing in 2010; ~1 TFLOP; ~$55K/GFLOP; 500 TFLOPS]

Intel Commitment To Exascale
Programming Parallelism
Efficient Performance
Extreme Scalability
Intel Exascale Commitment: >100X Performance of Today at Only 2X the Power of Today's #1 System, Scaling Today's Software Model

Exascale Requirements
Petascale Machine of 2010: TFLOP of Compute
Compute 40x, Memory 75x, Comms 20x, Disk/Storage 33x, Other 900x
Visceral Focus on System Power Efficiency Improvement
Estimation based on Petascale machine requirements circa 2010.

Scaling Programmability
One Programming Model Democratizes Usage… Avoid Costly Detours

Process Technology Leadership
The foundation for all computing

Year  Node   Milestone
2003  90 nm  Invented SiGe Strained Silicon
2005  65 nm  2nd Gen. SiGe Strained Silicon
2007  45 nm  Invented Gate-Last High-k Metal Gate
2009  32 nm  2nd Gen. Gate-Last High-k Metal Gate
2011  22 nm  First to Implement Tri-Gate

STRAINED SILICON, HIGH-k METAL GATE, TRI-GATE
22nm: A Revolutionary Leap in Process Technology
37% Performance Gain at Low Voltage*
>50% Active Power Reduction at Constant Performance*
Source: Intel
*Compared to Intel 32nm Technology

Intel Labs & HPC
Strong Research Partnerships: Universities, Government, Industry
World Class Research in HPC
*Other names, logos and brands may be claimed as the property of others.

Delivering Breakthrough Technologies to Fuel Innovation
Powerful. Intelligent.
Efficient I/O: Integrated PCIe reduces latency and power
Growing Performance: Up to 8 cores per socket, 2X FLOPS with Intel Advanced Vector Extensions
Continuing The Journey: Next Intel Xeon Processor Codenamed Sandy Bridge-EP
The Foundation of the Innovation in Science and Technology

Highly Parallel Performance
Intel Many Integrated Core (Intel MIC) Architecture
Launching on 22nm with >50 cores to provide outstanding performance for HPC users

A Step Forward In Dealing With Efficient Performance & Programmability
Programmability: The many benefits of broad Intel CPU programming models, techniques, and familiar x86 developer tools
Performance Density: The compute density associated with specialty accelerators for parallel workloads
Delivered Performance

Evaluating the Intel MIC Architecture
Arndt Bode
Leibniz Supercomputing Centre, Germany
with input from Iris Christadler, Alexander Heinecke and Volker Weinberg
June 2011, ISC, Hamburg

Evaluating the Intel MIC Architecture, Prof. A. Bode, LRZ, June 2011

Preface
Programming models are the key to harnessing the computational power of massively parallel devices.
Intel has evidently recognized this trend: it substantially supports open standards and invests in innovative programming models.
LRZ and TUM have been using Intel hardware and software for many years and know the toolchain by heart.
We expect: a hardware product that delivers good performance (and energy efficiency) without losing programmability.

Advantages of the MIC Architecture
Is a standard x86 architecture!
Allows many different parallel programming models like OpenMP, MPI and Intel Cilk!
Offers standard math libraries like Intel MKL!
Supports whole Intel toolchain, e.g. Compiler & Debugger!
Writing MIC-accelerated code with minimal effort and great performance

Workloads under Investigation
Euroben Kernels (7 dwarfs of HPC)
Data Mining
TifaMMy: Matrix Operations (Demo here at ISC'11!)
Further Linear Algebra and Simulation Codes

Euroben Kernels
Selected micro-benchmarks used in PRACE for the evaluation of accelerator hardware & new languages: http://www.prace-project.eu/documents/public-deliverables/d6-6.pdf
  - Example: mod2am: dense matrix-matrix multiplication (MxM)
[Figure: performance evaluation of mod2am on KNF with 30 cores @ 1050 MHz using Intel's Offload Compiler, single precision, data transfer times excluded]
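
To make the mod2am workload concrete, here is a minimal sketch of a dense single-precision matrix-matrix multiplication parallelized with OpenMP. It is not the Euroben source: the function name `matmul`, the row-major layout, and the loop order are assumptions chosen for illustration, and the offload step used in the measurement above is left out.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B for row-major n x n single-precision matrices.
   Hypothetical helper for illustration, not the Euroben mod2am code. */
void matmul(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    /* Rows of C are independent, so the outer loop can be shared
       among threads. Without OpenMP the pragma is simply ignored. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        for (size_t j = 0; j < n; j++)
            c[i * n + j] = 0.0f;
        /* i-k-j order keeps the innermost accesses to B and C unit-stride,
           which helps the compiler vectorize the j loop. */
        for (size_t k = 0; k < n; k++) {
            float aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

Calling `matmul(2, a, b, c)` with `a = {1, 2, 3, 4}` and `b = {5, 6, 7, 8}` fills `c` with the 2 x 2 product. The same source builds with or without an OpenMP flag, which is the portability point the slides make about x86 tooling.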

Data Mining with Adaptive Sparse Grids
Machine learning algorithm
Learning function from a training dataset
Important workload for classification and regression of huge datasets
MIC execution: straightforward
  - First version within a few hours
  - Optimized version took 2 days
[Chart: GFlops/s. WSM-EP X5670: 150; KNF 32/1200 (incl. offload): 420]
Test workload: Learning 5d checkerboard with 262144 instances and classification accuracy of 92%

TifaMMy: Idea and Application
TifaMMy: self-adaptive and cache-oblivious framework for matrix operations, optimized on fat x86 cores
This is done by nested recursions and vectorized kernels
  - On MIC only the kernels were changed; MIC's x86 cores are able to tackle nested recursions!
  - The parallelization scheme employing OpenMP can be reused
With SSE kernels already in place, bringing code to MIC is nearly for free
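
The combination of nested recursion and a small kernel can be sketched as follows. This is an illustrative cache-oblivious multiply under stated assumptions, not TifaMMy's implementation: the names `matmul_rec` and `kernel`, the cutoff `BASE`, and the plain scalar kernel are all hypothetical. In a framework of this kind, the kernel is the piece that would be swapped for a vectorized version per target.

```c
#include <assert.h>
#include <stddef.h>

#define BASE 16  /* hypothetical recursion cutoff; a real framework would tune it */

/* Base-case kernel: C += A * B for an (m x p) by (p x q) block.
   ld is the row stride shared by the enclosing row-major matrices. */
static void kernel(size_t m, size_t p, size_t q, size_t ld,
                   const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < m; i++)
        for (size_t k = 0; k < p; k++)
            for (size_t j = 0; j < q; j++)
                c[i * ld + j] += a[i * ld + k] * b[k * ld + j];
}

/* C += A * B by recursively halving the largest dimension until the
   block fits the kernel. No cache size appears anywhere in the code. */
void matmul_rec(size_t m, size_t p, size_t q, size_t ld,
                const float *a, const float *b, float *c)
{
    assert(ld >= p && ld >= q);
    if (m <= BASE && p <= BASE && q <= BASE) {
        kernel(m, p, q, ld, a, b, c);
    } else if (m >= p && m >= q) {          /* split the rows of A and C */
        size_t h = m / 2;
        matmul_rec(h, p, q, ld, a, b, c);
        matmul_rec(m - h, p, q, ld, a + h * ld, b, c + h * ld);
    } else if (q >= p) {                    /* split the columns of B and C */
        size_t h = q / 2;
        matmul_rec(m, p, h, ld, a, b, c);
        matmul_rec(m, p, q - h, ld, a, b + h, c + h);
    } else {                                /* split the inner dimension */
        size_t h = p / 2;
        matmul_rec(m, h, q, ld, a, b, c);
        matmul_rec(m, p - h, q, ld, a + h, b + h * ld, c);
    }
}
```

For square n x n matrices, zero `c` first and call `matmul_rec(n, n, n, n, a, b, c)`. The two row-split and column-split sub-calls write disjoint parts of C, so they are natural candidates for OpenMP tasks; the inner-dimension split is not, because both halves accumulate into the same block.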

TifaMMy: Performance Matrix Multiplication
[Chart: GFLOPS (axis 0 to 700) versus matrix size (32 to 7648), with a Max reference line]
Test workload: TifaMMy executed on KNF with 32 cores @ 1200 MHz

Advantages of the MIC Architecture
Is a standard x86 architecture!
Allows many different parallel programming models like OpenMP, MPI and Intel Cilk!
Offers standard math libraries like Intel MKL!
Supports whole Intel toolchain, e.g. Compiler & Debugger!
Pre-release MIC-accelerated code for a typical scientific workload (e.g. Data Mining, TifaMMy) can reach up to 50% of peak performance!
Visit demo here at ISC'11!

"SGI understands the significance of inter-processor communications, power, density and usability when architecting for exascale. Intel has made the leap towards exaflop computing with the introduction of Intel Many Integrated Core (MIC) architecture. Future Intel MIC products will satisfy all four of these priorities, especially with their expected ten times increase in compute density coupled with their familiar X86 programming environment."
Dr. Eng Lim Goh, SGI CTO

Intel MIC Architecture: Needed for Exascale
Exaflop by 2018
125x compute power
25x: Moore's Law
5x: remains

Intel MIC Architecture: Familiar x86 Programming

    #include [header name lost in extraction]
    #include [header name lost in extraction]
    #define N 1000000000LL
    main()
    {
      double pi = 0.0f; long i;
      #pragma offload target (mic)
      #pragma omp parallel for reduction(+:pi)
      for (i=0; i [rest of the loop and program lost in extraction]

>100X Performance Of Today At Only 2X The Power Of Today's #1
Scaling Today's Software Model

System Configuration: 7 TFLOPS SGEMM in a node
HW specifications: 8x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
Host: Colfax CXT8000: 2 socket platform with 2 Intel Xeon processor X5690 (3.46 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, Dual Intel 5520 IOH, OS RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - Colfax Model: CXT8000 Server w/ Intel 5520 chipset and 4 PLX PEX8647 Gen2 PCIe switches
  - Intel Alpha level software (Intel Compilers, drivers etc.)
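
The loop on the "Familiar x86 Programming" slide is cut off in this copy. As a rough illustration of what an OpenMP reduction of that shape can compute, the sketch below estimates pi with the midpoint rule for the integral of 4/(1+x^2) over [0,1]. The loop body, the function name `estimate_pi`, and the parameter `n` are assumptions, not text recovered from the slide.

```c
#include <assert.h>

/* Midpoint-rule estimate of pi as the integral of 4/(1+x^2) on [0,1].
   Hypothetical reconstruction: only the pragmas and the loop header
   survive on the slide, the loop body here is an assumption. */
double estimate_pi(long n)
{
    assert(n > 0);
    double sum = 0.0;
    long i;
    /* The slide also places "#pragma offload target (mic)" before the
       loop; it is omitted here because it needs Intel's offload compiler. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        double t = (i + 0.5) / n;      /* midpoint of the i-th slice */
        sum += 4.0 / (1.0 + t * t);
    }
    return sum / n;
}
```

A call such as `estimate_pi(1000000)` agrees with pi to well beyond six decimal places. Built without an OpenMP flag the pragma is ignored and the loop runs serially, so the same source serves as both the host and the parallel version.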

System Configuration: Hybrid Computing with Intel MKL
HW specifications: 1x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - Knights Ferry Software Development Platform (Shady Cove)
  - Intel Alpha level software (Intel Compilers, Intel MKL, drivers etc.)
SW specifications:
  MKL4KNF: MKL KNF.b2 build 20110518
  MKL: 10.3.3

System Configuration: Hybrid Computing LU Factorization
HW specifications: 1x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - Knights Ferry Software Development Platform (Shady Cove)
  - Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: KISTI Molecular Dynamics
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: Dell Precision Workstation 1 socket platform with 1 Intel Xeon processor X5620 (4 cores, 2.4 GHz, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - Dell Precision Workstation
  - Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: CERN openlab: Core Scaling of Intel MIC Architecture
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: SGI H4002 2 socket platform with 2 Intel Xeon processor X5690 (6 cores, 3.46 GHz, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - SGI H4002 System
  - Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: LRZ: TifaMMy Matrix Multiplication
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - Knights Ferry Software Development Platform (Shady Cove)
  - Intel Alpha level software (Intel Compilers, drivers etc.)

System Configuration: FZ Jülich: SMMP Protein Folding
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
  Larrabee kernel driver ver. 1.6.197
  Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
  Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
  Native compiler (w/o data xfer): Version Alpha Build 20110518
  - Knights Ferry Software Development Platform (Shady Cove)
  - Intel Alpha level software (Intel Compilers, drivers etc.)