EOI_INDUCEDivybridge

ivybridge  时间:2021-03-28  阅读:()
MessagePassingWorkloadsinKVMDavidMatlack,dmatlack@google.
com1MessagePassingWorkloadsLoopbackTCP_RRIPIandHLTDISCLAIMER:x86andIntelVT-xHaltPollingInterruptsandquestionsarewelcome!
Overview2Usually,anythingthatfrequentlyswitchesbetweenrunningandidle.
Event-drivenworkloadsMemcacheLAMPserversRedisMultithreadedworkloadsusinglowlatencywait/signalprimitivesforcoordination.
WindowsEventObjectspthread_cond_wait/pthread_cond_signalInter-processcommunicationTCP_RR(benchmark)MessagePassingWorkloads3Intuition:Workloadswhichdon'tinvolveIOvirtualizationshouldrunatnearnativeperformance.
Reality:MessagePassingWorkloadsmaynotinvolveanyIObutwillstillperformnXworsethannative.
(loopback)Memcache:2xhigherlatency.
WindowsEventObjects:3-4xhigherlatency.
MessagePassingWorkloads4MessagePassingWorkloads2.
Receive1bytefromclient.
Send1byteback.
1.
Send1bytetoserver.
3.
Receive1bytefromserver.
Microbenchmark:LoopbackTCP_RRClientandServerping-pong1-byteofdataoveranestablishedTCPconnection.
Loopback:Nonetworkingdevices(realorvirtual)involved.
Performance:Latencyofeachtransaction.
Onetransaction:(idle)(idle)(idle)ClientServer5LoopbackTCP_RRPerformance6Host:IvyBridge3.
11KernelGuest:DebianWheezyBackports(3.
16Kernel)3xhigherlatency25usslowerMessagePassingon1CPUContextSwitchMessagePassingon>1CPUInterprocessor-InterruptsWhat'sgoingonunderthehoodVMEXITsareagoodplacetostartlooking.
KVMhasbuilt-inVMEXITcountersandtimers.
perf-kvm(1)VirtualOverheadsofTCP_RR7VirtualOverheadsofTCP_RRTotalNumberofVMEXITsVMEXITs/Transaction1VCPU2VCPU1VCPU2VCPUEXTERNAL_INTERRUPT16705123710.
020.
07MSR_WRITE259917043340.
009.
58IO_INSTRUCTION17867620.
000.
00EOI_INDUCED613250.
000.
00EXCEPTION_NMI289310.
000.
00CPUID2521120.
000.
00CR_ACCESS1712720.
000.
00HLT343543930.
001.
99EPT_VIOLATION200.
000.
00PAUSE_INSTRUCTION020140.
000.
012HLTperTransaction10MSR_WRITEperTransaction8HLTsofTCP_RR2HLTCPUinstruction.
StopexecutinginstructionsonthisCPUuntilaninterruptarrives.
VCPUwishestostopexecutinginstructions.
GuestOShasdecidedthatthereisnothingtodo.
Nothingtodo==idle.
Messagepassingworkloadsswitchbetweenrunningandidle.
.
.
910MSR_WRITE"WritetoModelSpecificRegister"instructionexecutedintheguest.
8APICTimer"InitialCount"Register(MSR838)Writtentostartaper-CPUtimer.
"Startcountingdownandfireaninterruptwhenyougettozero.
"ArtifactofNOHZguestkernel.
2APICInterruptCommandRegister(MSR830)Usedtosendinterprocessor-interrupts(IPI).
Usedtodeliver"messages"betweenclient/serverprocessesrunningonseparateCPUs.
MSR_WRITEsofTCP_RR10VMEXITsofTCP_RRVMEXITSAPICTimerRegisterAPICInterruptCommandRegister(IPI)HLTclientclientidleserveridleidle1.
Send1bytetoserver.
Waitforresponse.
2.
Receive1bytefromclient.
Send1byteback.
3.
Receive1bytefromserver.
VCPU0VCPU111HLTHLTHLTIPIIPIAPICTIMERAPICTIMERAPICTIMERAPICTIMERVMEXITsofTCP_RRVMEXITSAPICTimerRegisterAPICInterruptCommandRegister(IPI)HLTclientclientidleserveridleidle1.
Send1bytetoserver.
Waitforresponse.
2.
Receive1bytefromclient.
Send1byteback.
3.
Receive1bytefromserver.
VCPU0VCPU112HLTHLTHLTIPIIPIAPICTIMERAPICTIMERAPICTIMERAPICTIMERCriticalPath8pertransaction4onthecriticalpathNOHZ(ticklessguestkernel)"Disable"scheduler-tickuponenteringidle.
"Enable"scheduler-tickuponleavingidle.
scheduler-tick==APICTimer(couldalsobeTSCDeadlineTimer)Why2writespertransitioninto/outofidlehrtimer_cancelhrtimer_startAdds3-5ustoround-triplatency.
APICTimer"InitialCount"Register13HLT:x86Instruction.
CPUstopsexecutinginstructionsuntilaninterruptarrives.
ThispartofHLTisnotonthecriticalpath!
HowitworksinKVMPlaceVCPUthreadonawaitqueue.
YieldtheCPUtoanotherthread.
HLTkvm_vcpu_block->schedule()VMEXITHLTcontextswitchtoanotherusertask,kernelthread,oridleVCPU(guest)PCPU(KVM)14kvm_sched_outSendinganIPItowakeupaHLT-edCPU.
Onthecriticalpath!
IPI+HLTWRMSR:APICInterruptCommandRegisterkvm_vcpu_kickreturnfromschedule()inkvm_vcpu_block()vmx_vcpu_runIPIISRVMEXITVMRESUMEVCPU1VCPU0(HLT-ed)guesthostkvm_sched_in*VMEXITandVMRESUMEimplementedinHardware.
time15SendinganIPItowakeupaHLT-edCPU.
Onthecriticalpath!
Sameoperationonbaremetalisentirelyimplementedinhardware.
HowmuchoverheadfromvirtualizationUnlikeAPIC_TMICT,can'tjusttimeVMEXITs.
Wecancomparewiththesameoperationonphysicalhardware.
IPI+HLT16KVMversusHardwareRing0Microbenchmark(kvm-unit-tests)1.
VCPU0:HLT.
2.
~100usdelay3.
VCPU1:A=RDTSC4.
VCPU1:SendIPIto[V]CPU0.
5.
VCPU0:B=RDTSC(firstinstructionofIPIISR).
6.
Latency=B-A7.
Repeat.
RuninKVMguestandonbare-metal.
Compare!
17VMRESUMEWRMSRkvm_vcpu_kickreturnfromschedule()inkvm_vcpu_block()vmx_vcpu_runIPIISRVMEXITVCPU1VCPU0(HLT-ed)guesthostkvm_sched_intimeKVMversusHardwareA=RDTSCB=RDTSC18Median:KVMis12xslowerPathologicalcase(witnessed):KVMis400xslowerBestcase(witnessed):KVMis11xslowerKVM:5.
7us;Hardware:0.
5usKVMversusHardwareCyclesKVMHardwareMin137001200Average15800120050%ile14900120090%ile16000130099%ile249001300Max5210001400Host:SandyBridge@2.
6GHz3.
11KernelKVMperformanceissimilaronIvyBridge(5.
6us)andHaswell(4.
9us).
19Notesaboutthisbenchmark:NoguestFPUtosave/restore.
Hostotherwiseidle(VCPUcontextswitchestoidleonHLT).
Hostpowermanagementnottheculprit.
KVMversusHardware20KVMHLTInternalsSoKVMisslowatdeliveringIPIsand/orcomingoutofHLT.
ButwhyPossibleculprits:WRMSRvmx_vcpu_runIPIISRVMEXITVMRESUMEVCPU1VCPU0(HLT-ed)kvm_sched_intimereturnfromschedule()inkvm_vcpu_block()kvm_vcpu_kick21VMRESUMEvmx_vcpu_runkvm_vcpu_kickKVMHLTInternalsSoKVMisslowatdeliveringIPIsand/orcomingoutofHLT.
ButwhyPossibleculprits:WRMSRIPIISRVMEXITVCPU1VCPU0(HLT-ed)kvm_sched_intimereturnfromschedule()inkvm_vcpu_block()22RDTSCRDTSCRDTSCRDTSCRDTSCKVMHLTInternalsWRMSRkvm_vcpu_kickreturnfromschedule()inkvm_vcpu_block()vmx_vcpu_runIPIISRVMEXITVMRESUMEMin(cycles):400600730032001300VCPU1VCPU0guesthostVT-xKVMSchedulerkvm_sched_in:4924001200850034001400Median(cycles):23Unsurprisingly,theschedulertakessometimetoruntheVCPUSlowevenintheuncontended,cache-hot,case.
ImagineiftheVCPUiscontendingforCPUtimewithotherthreads.
Experiment:Don'tscheduleonHLT.
JustpollfortheIPIinkvm_vcpu_block.
KVMHLTInternals24Whathappenswhenyoudon'tscheduleonHLTKVM(Alwaysschedule)5.
7usKVM(Neverschedule)1.
7usHardware(SandyBridge)0.
5usNeverschedule!
CyclesKVM(Alwaysschedule)KVM(Neverschedule)HardwareMin1380040001200Average158004400120050%ile149004300120090%ile160004500130099%ile2490069001300Max52100050000140025SimilarimprovementsonIvyBridge(5.
6us->1.
6us)Haswell(4.
9us->1.
5us).
Neverschedule!
WRMSRkvm_vcpu_kickreturnfromschedule()inkvm_vcpu_block()vmx_vcpu_runIPIISRVMEXITVMRESUMEAlwaysschedule:4001200850034001400VCPU1VCPU0guesthostVT-xKVMSchedulerNeverschedule:300130011004001200(mediancycles)26Neverschedule!
WeeliminatealmostallofthelatencyoverheadbynotschedulingonHLT.
Schedulingisoftentherightthingtodo.
LetotherthreadsrunorsavehostCPUpower.
Mostofthetimeimprovesguestperformance(lettheIOthreadsrun!
).
Canhurtperformance.
Seemicrobenchmark.
SeeTCP_RR.
27Halt-PollingStep1:PollForuptoXnanoseconds:IfataskiswaitingtorunonourCPU,gotoStep2.
Checkifaguestinterruptarrived.
Ifso,wearedone.
Repeat.
Step2:schedule()Scheduleoutuntilit'stimetocomeoutofHLT.
Pros:WorksonshortHLTs(Cons:IncreasesCPUusage(~1%foridleVCPUsifX=200,000ns)Doesnotappeartonegativelyaffectturboofactivecores.
28Halt-PollingMemcache:1.
5xlatencyimprovementWindowsEventObjects:2xlatencyimprovementReducemessagepassinglatencyby10-15us(includingnetworklatency).
29Halt-PollingMergedintothe4.
0kernel[PATCH]kvm:addhalt_poll_nsmoduleparameterThankstoPaoloBonziniUsetheKVMmoduleparameterhalt_poll_nstocontrolhowlongtopolloneachHLT.
Futureimprovements:Automaticpolltoggling(removeidleCPUoverheadbyturningpollingoff).
Automatichalt_poll_nsKVMwillset(andvary)halt_poll_nsdynamically.
Howtodothisisanopenquestion.
.
.
ideasLazyContextSwitchingEquivalentfeature,butavailableforanykernelcomponenttouse.
30ConclusionMessagePassingEvenloopbackmessagepassingrequiresvirtualization.
Beingidle(asaLinuxguest)requiresvirtualization.
Cross-CPUcommunicationrequiresvirtualization.
Halt-Pollingsaves10-15usonmessagepassinground-triplatency.
Remaininground-triplatency:4MSRwritestotheAPICtimer(3-5us)IPIsend(~2us)HLTwakeup(evenwithhalt-polling,stilladds~3us!
)31

TmhHost香港三网CN2 GIA月付45元起,美国CN2 GIA高防VPS季付99元起

TmhHost是一家国内正规公司,具备ISP\ICP等资质,主营国内外云服务器及独立服务器租用业务,目前,商家新上香港三网CN2 GIA线路VPS及国内镇江BGP高防云主机,其中香港三网CN2 GIA线路最低每月45元起;同时对美国洛杉矶CN2 GIA线路高防及普通VPS进行优惠促销,优惠后美国洛杉矶Cera机房CN2 GIA线路高防VPS季付99元起。香港CN2 GIA安畅机房,三网回程CN2 ...

小白云 (80元/月),四川德阳 4核2G,山东枣庄 4核2G,美国VPS20元/月起三网CN2

小白云是一家国人自营的企业IDC,主营国内外VPS,致力于让每一个用户都能轻松、快速、经济地享受高端的服务,成立于2019年,拥有国内大带宽高防御的特点,专注于DDoS/CC等攻击的防护;海外线路精选纯CN2线路,以确保用户体验的首选线路,商家线上多名客服一对一解决处理用户的问题,提供7*24无人全自动化服务。商家承诺绝不超开,以用户体验为中心为用提供服务,一直坚持主打以产品质量用户体验性以及高效...

ZoeCloud:香港BGP云服务器,1GB内存/20GB SSD空间/2TB流量/500Mbps/KVM,32元/月

zoecloud怎么样?zoecloud是一家国人商家,5月成立,暂时主要提供香港BGP KVM VPS,线路为AS41378,并有首发永久8折优惠:HKBGP20OFF。目前,解锁香港区 Netflix、Youtube Premium ,但不保证一直解锁,谢绝以不是原生 IP 理由退款。不保证中国大陆连接速度,建议移动中转使用,配合广州移动食用效果更佳。点击进入:zoecloud官方网站地址zo...

ivybridge为你推荐
18comic.fun黑色禁药http://www.lovecomic.cn/attachment/Fid_18/18_4_00d3b0cb502ea74.jpg这幅画名字叫什么?关键字什么叫关键词22zizi.com河南福利彩票22选52010175开奖结果丑福晋爱新觉罗.允禄真正的福晋是谁?他真的是一个残酷,噬血但很专情的一个人吗?百花百游百花净斑方效果怎么样?haole018.comhttp://www.haoledy.com/view/32092.html 轩辕剑天之痕11、12集在线观看789se.comwuwu8.com这个站长是谁?4400av.com在www.dadady.com 达达电影看片子很快的啊bbs2.99nets.com天堂1单机版到底怎么做广告法有那些广告法?还有广告那些广告词?
虚拟主机试用 备案未注册域名 域名查询软件 企业域名备案 parseerror css样式大全 gg广告 秒杀预告 php空间推荐 双11秒杀 上海电信测速网站 架设邮件服务器 外贸空间 lick 免费ftp wordpress中文主题 lamp什么意思 重庆服务器 SmartAXMT800 webmin 更多