PACUE: Processor Allocator Considering User Experience

Tetsuro Horikawa1, Michio Honda1, Jin Nakazawa2, Kazunori Takashio2, and Hideyuki Tokuda2,3

1 Graduate School of Media and Governance, Keio University
2 Faculty of Environment and Information Studies, Keio University, 5322, Endo, Fujisawa, Kanagawa 252-8520, Japan
3 JST-CREST, Japan
{techi,jin,kaz,hxt}@ht.sfc.keio.ac.jp, micchie@sfc.wide.ad.jp

Abstract. GPU-accelerated applications, including GPGPU ones, are commonly seen on modern PCs. If many applications compete for the same GPU, performance decreases significantly. Some applications have a large impact on user experience; for such applications, we have to limit GPU utilization by the other applications. It might seem straightforward to modify applications so that they switch compute devices dynamically for intelligent resource allocation. Unfortunately, we cannot do so because of software distribution policies or other reasons. In this paper, we propose PACUE, which allows the end system to allocate compute devices to applications arbitrarily. In addition, PACUE guesses the optimal compute device for each application according to user preference. We implemented the dynamic compute device redirector of PACUE, including OpenCL API hooking and device camouflaging features. We also implemented the skeleton of the resource manager of PACUE. We demonstrate that PACUE achieves dynamic compute device redirection on one out of two real applications and on all of 20 sample codes.

Keywords: Resource management, OpenCL, binary compatibility, GPU, GPGPU, PC, user experience.

1 Introduction

Graphics Processing Unit (GPU) use has been extended to a wider range of computing purposes on the PC platform. GPU uses on PCs can be classified into four categories. The first is 3D graphics computation, such as 3D games and 3D-graphics-based GUI shells (e.g., Windows Aero). The second is 2D graphics acceleration, such as font rendering in modern web browsers. The third is video decoding and encoding acceleration. Video player applications use the video decoding acceleration function of the GPU to reduce CPU load and to increase video quality. Also, some GPUs have video encoding acceleration units on the GPU die. The last is general-purpose computing, called General-Purpose computing on GPU (GPGPU). On PCs, GPGPU is often used by video encoding applications and physics simulation applications, including 3D games.¹

¹ Some 3D games utilize the GPU for general-purpose computing besides 3D graphics rendering.

M. Alexander et al. (Eds.): Euro-Par 2011 Workshops, Part II, LNCS 7156, pp. 335–344, 2012. © Springer-Verlag Berlin Heidelberg 2012

In today's PCs, GPUs are utilized efficiently because only a few applications are accelerated at the same time; these applications do not compete with each other for the same GPU. Applications thus choose compute devices statically, for example through user selection in the application's GUI configuration menu. However, we envisage that more and more applications will utilize GPUs. For example, Open Computing Language (OpenCL) [2] allows applications to explicitly select the compute device that executes some parts of the application. Therefore, efficient load balancing between compute devices consisting of CPUs and GPUs is essential for future consumer PCs.

There are three technical challenges to achieving efficient compute device assignment across heterogeneous processors in PCs. First, GPU acceleration is utilized for various purposes on PCs, while GPUs are utilized mainly for general-purpose computing in supercomputers. In addition, some tasks running on PCs strongly require specific processors. For example, 3D rendering is normally processed by GPUs, and some 3D graphics operations cannot be processed by CPUs, whereas some applications can be processed by both CPUs and GPUs. When the GPU load is high, we could run the latter applications explicitly on CPUs.

Second, we must not modify applications. Typically, most applications installed on major OSes such as Windows and Mac OS cannot be modified by a third party because of their software distribution policies. Application vendors may not be willing to modify their applications either, because doing so would not benefit them directly. For these reasons, existing runtime libraries and task distribution libraries for compute devices [6,10,7] proposed for HPC are not deployable on consumer PCs.

Third, the performance metric for consumer PCs is complicated, because user preference is one of the most important metrics for assigning compute devices to applications. This clearly differs from general HPC metrics, where the task distribution policy is usually static, such as maximizing task processing speed or maximizing performance per watt. On PCs, task distribution policies and their merits easily change depending on the use. For example, when the user would like to play a 3D game smoothly, other GPGPU tasks should not be assigned to the GPU. On the other hand, sometimes the user might prefer to transcode videos quickly rather than play a trifling game smoothly. The compute device selection method must recognize user preferences to decide the proper compute device to assign. However, this is hard, and user preference recognition cannot be fully automated. Therefore, the resource manager has to infer how the PC is being utilized, and users have to be able to tell it how they are using the PC at that time.

In this paper, we propose PACUE, which allocates compute devices to applications efficiently. PACUE has two features: dynamic compute device redirection and system-wide optimal device selection. We strongly focus on solving the real problems that will occur when we distribute our system worldwide via the web. Therefore, we prefer a politically safer method to a technically better one. Thus, the first advantage of PACUE is its deployability. The second advantage is that PACUE is designed to maximize PC users' experience. We thereby introduce a new metric for using accelerators, which will also be beneficial for other computers such as smartphones or game consoles.

Our experimental results show that PACUE can switch compute devices in 1 out of 2 applications and in all of 20 sample codes built with OpenCL.

The remainder of this paper is organized as follows. In Sec. 2, we describe the design of PACUE, consisting of the dynamic compute device redirection and the system resource manager. In Sec. 3, we evaluate our prototype implementation. The paper concludes with Sec. 4.

2 Designing PACUE

PACUE consists of two components: the Dynamic Compute Device Redirector and the Resource Manager. We focus on applications built with OpenCL, a widely used framework that supports many types of compute devices such as CPUs and GPUs.

2.1 Dynamic Compute Device Redirection

We design the Dynamic Compute Device Redirection (DCDR) method to meet the "no application modification" requirement. DCDR implements OpenCL API hooking that conceals the actual compute devices from applications, and avoids errors caused by inconsistent device information.

OpenCL API Hooking. OpenCL abstracts compute devices and the memory hierarchy to utilize heterogeneous processors within its programming model. To utilize a compute device, applications call OpenCL APIs and specify a compute device. The assignment process is as follows. First, obtain the list of available devices of a specified device type. Second, select the possible devices and create an OpenCL context. Third, select one device to use and create a command queue. Last, put tasks into the queue created above. In the second and third steps, the application specifies a concrete device because the OpenCL APIs need a device ID as a parameter, which makes system-wide optimal device selection impossible. For optimal device selection, we remove the restriction that applications need to choose the device by themselves, because this decision is hard for applications and users, and decisions made by applications or users are rarely optimal (see Sec. 2.2). PACUE hooks the part of the OpenCL APIs concerned with device selection, and implements an inquiry function that asks the resource manager which device to utilize.

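The hooking idea can be illustrated with a minimal Python sketch. It is not the PACUE implementation and does not call the real OpenCL API: `ToyRuntime`, `HookedRuntime`, `ask_manager`, and `application` are hypothetical names invented here to mimic the step sequence (device list, context, command queue) and to show where a hook can substitute the device chosen by a manager.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Device:
    device_id: int
    kind: str  # "CPU" or "GPU"


class ToyRuntime:
    """Hypothetical stand-in for an OpenCL runtime; not the real API."""

    def __init__(self, devices: List[Device]):
        self.devices = devices

    def get_device_ids(self, kind: str) -> List[Device]:
        # List the devices of the requested type ("ALL" lists every device).
        return [d for d in self.devices if kind in ("ALL", d.kind)]

    def create_context(self, devices: List[Device]) -> dict:
        # Create a context over the candidate devices.
        return {"devices": list(devices)}

    def create_command_queue(self, context: dict, device: Device) -> dict:
        # Bind a command queue to one concrete device of the context.
        if device not in context["devices"]:
            raise ValueError("device is not part of the context")
        return {"device": device}


class HookedRuntime(ToyRuntime):
    """Intercepts the device-selecting calls and asks a manager which device to use."""

    def __init__(self, devices: List[Device],
                 ask_manager: Callable[[List[Device]], Device]):
        super().__init__(devices)
        self.ask_manager = ask_manager

    def get_device_ids(self, kind: str) -> List[Device]:
        # Show all devices, whatever type the application asked for.
        return super().get_device_ids("ALL")

    def create_command_queue(self, context: dict, device: Device) -> dict:
        # Replace the application's choice with the manager's answer.
        chosen = self.ask_manager(context["devices"])
        return super().create_command_queue(context, chosen)


def application(runtime: ToyRuntime) -> str:
    """An unmodified application that always asks for a GPU."""
    devices = runtime.get_device_ids("GPU")
    context = runtime.create_context(devices)
    queue = runtime.create_command_queue(context, devices[0])
    return queue["device"].kind
```

With the plain runtime the application ends up on the GPU it requested; with the hooked runtime and a manager that prefers the CPU, the same unmodified `application` function ends up on the CPU.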
There are several methods to hook APIs in Windows 7, where PACUE is implemented. The first possibility is creating a thread in the target application by calling the Windows API CreateRemoteThread() [12]. With this method, we implement an application that creates a thread in other applications and maps an external DLL containing the overridden target APIs. However, such applications and DLLs are hard to implement because of the complicated procedures involved. This method also risks being treated as malware by anti-malware software. The second possibility is a global hook, in which a user application hooks specific APIs of all applications by calling the Windows API SetWindowsHookEx() [13]. This method is unsafe, because it risks hooking unknown applications and causing unexpected effects on them. The third possibility is making a wrapper DLL, which is a DLL that has the same file name as the original DLL and exports all APIs of the original DLL. A wrapper DLL is almost a shell of the original DLL, because most of its APIs simply call the original DLL's APIs, except for the APIs that actually need to behave differently from the original. This method has the best chance of hooking APIs, because by default a wrapper DLL located in the application directory is always loaded before other copies, such as DLLs located in system directories. In addition, a wrapper DLL placed in the directory where the target EXE is located affects only applications whose binaries are located in the same directory. Therefore, this is a really safe way to hook APIs. The last possibility is the use of API hook libraries, such as [14]. These libraries are easy to use; however, they are less likely to succeed in hooking APIs than a wrapper DLL. They also risk being treated as malware. From this comparison, we adopt the wrapper DLL method. Fig. 1 illustrates the architecture to hook OpenCL APIs with this method. Other major PC OSes such as Mac OS or Linux do not provide an equivalent of wrapper DLLs; still, we can implement a similar system by using the API hooking functions offered by those OSes.

Fig. 1. Dynamic Compute Device Switching by OpenCL API Hooking

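The forwarding structure of a wrapper DLL can be sketched in Python as an analogy. This is only an illustration of the pattern (same entry points, most calls forwarded, a few overridden), not Windows DLL code; `OriginalLibrary`, `WrapperLibrary`, and their methods are hypothetical names.

```python
class OriginalLibrary:
    """Hypothetical stand-in for the original DLL and its exported functions."""

    def get_platform_name(self) -> str:
        return "original platform"

    def create_command_queue(self, device: str) -> str:
        return f"queue on {device}"


class WrapperLibrary:
    """Exposes the same function names, forwarding all but the overridden ones."""

    def __init__(self, original: OriginalLibrary, redirect_to: str):
        self._original = original
        self._redirect_to = redirect_to

    def __getattr__(self, name):
        # Called only for names not defined on the wrapper itself:
        # forward them unchanged to the original library.
        return getattr(self._original, name)

    def create_command_queue(self, device: str) -> str:
        # Overridden entry point: swap the device, then call the original.
        return self._original.create_command_queue(self._redirect_to)
```

A caller that holds a `WrapperLibrary` in place of the original sees identical behavior for `get_platform_name()`, while `create_command_queue("GPU")` is silently redirected to the configured device.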
Another method to switch devices is making a virtual device [5]. With this method, applications are assigned the virtual device and the resource management system chooses a real device. This method has the significant advantage that it can switch real devices at any time; however, it may conflict with the Installable Client Driver (ICD) system of OpenCL. Installers of OpenCL runtime libraries distributed by hardware vendors sometimes overwrite the "OpenCL.dll" file; thus, installing a virtual device, or showing applications only the virtual device, is difficult on PCs.

Device Information Camouflaging. When part of an application's tasks is assigned to the OpenCL device selected by PACUE, some applications show errors. This is because the device information differs from what the application intended, so some applications recognize it as an unusual event. To avoid these errors, PACUE camouflages OpenCL device details when the desired OpenCL device has been changed dynamically.

However, camouflaging OpenCL device details is risky, because devices have different specifications at the lower level. The first risk is application stability. The memory size at each level of the hierarchy is device dependent; hence, an unexpected memory size can result in an application crash or error. The second risk is execution speed. If an application implements per-device optimization, a mismatch between the intended device and the assigned device can result in unexpected performance degradation. For these reasons, we should camouflage device details only when necessary. To minimize the risks, PACUE camouflages devices at the following levels.

1. Device type level camouflage
When an application tries to acquire an OpenCL device list, PACUE overwrites the cl_device_type value. As far as possible, PACUE changes this value to CL_DEVICE_TYPE_ALL. Showing all devices instead of only devices of the specified type is a reasonable choice, because it avoids forcing the application to use an unknown device. Occasionally, applications cannot execute their OpenCL code on some device types. In this case, PACUE sets the cl_device_type value to the desired type, such as CL_DEVICE_TYPE_CPU or CL_DEVICE_TYPE_GPU.

2. Context level camouflage
When an OpenCL context is created, PACUE overrides the cl_device_id value and forces the OpenCL framework to build OpenCL binaries for each compute device. If PACUE recognizes that the target application supports only a specific type of compute device, PACUE overwrites the cl_device_id value and limits the device types for the context. In addition, PACUE overrides the cl_device_id value when an application requests detailed device information. Therefore, the application sees the information of the device PACUE selected. This contributes to the application's stability, because the acquired device information, such as the memory size, corresponds to that of the device that will actually be used.

3. Command queue level camouflage
When the application calls the clCreateCommandQueue() API, this is the last chance to change the device. Because of the stability issue described above, PACUE tries not to change the device at this point, but if necessary, PACUE changes the cl_device_id in the arguments of this API. In this situation, the device is camouflaged completely; thus, the application recognizes the camouflaged device as the device it specified. This is a very dangerous way to change the device because of device-dependent characteristics such as the memory size; however, it lets us switch the processor in more applications. Hence, this method is our ace in the hole.

As shown in Table 1, there are several ways of overriding the device assignment by combining these steps. Because they trade off application compatibility against application stability, we have to make a rule for applying these methods; some hints are worked out in Sec. 3.

Table 1. Comparison of Device Camouflaging Methods (overridden device type/ID; "-" marks a step with no override)

Method | Type specified when getting device list | ID specified when creating a context | ID specified when creating a command queue | Crash/Error risk | Compatibility
A. Device type level | CPUs or GPUs | All CPUs or all GPUs | - | Low | Most applications
B. Context level | - | CPUs or GPUs | - | Low | Low
C. Command queue level | - | ALL | One CPU or one GPU | High | Most applications
D. A+C | ALL | ALL | One CPU or one GPU | Normal | High

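As a reading aid, the override combinations of Table 1 can be written as a small lookup in Python. This is a sketch of how the table might be encoded, not PACUE code; the names `Override` and `plan_override` and the string encoding of the cells are assumptions made here, and `None` stands for a step at which nothing is overridden.

```python
from typing import NamedTuple, Optional


class Override(NamedTuple):
    device_type: Optional[str]  # value forced when the device list is requested
    context_ids: Optional[str]  # devices forced when the context is created
    queue_id: Optional[str]     # device forced when the command queue is created


def plan_override(method: str, target: str) -> Override:
    """Return the overrides for a Table 1 method; target is "CPU" or "GPU"."""
    if target not in ("CPU", "GPU"):
        raise ValueError("target must be 'CPU' or 'GPU'")
    many = target + "s"  # e.g. "CPUs": every device of the target type
    table = {
        "A": Override(many, "all " + many, None),  # device type level
        "B": Override(None, many, None),           # context level
        "C": Override(None, "ALL", target),        # command queue level
        "D": Override("ALL", "ALL", target),       # A + C combined
    }
    return table[method]
```

For example, `plan_override("D", "GPU")` yields an override of the device type to ALL, the context devices to ALL, and the command queue device to one GPU, which corresponds to row D of the table.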
2.2 System Resource Management

We need a system-wide resource manager for heterogeneous processors, because average PC users cannot choose the proper compute device for each application, and it is inconvenient for them to select a compute device every time an application runs. Some advanced PC users can choose the proper compute device manually; however, doing so is very inconvenient. Besides, many PC users do not know the detailed configuration of the PC they are using. These users cannot accurately choose the compute device that satisfies their preference, even if the application allows the user to select the compute device in its GUI configuration menu. To achieve a high-quality user experience, the resource manager should select a compute device automatically according to the user's preferences.

There are many studies in the HPC area that build a resource manager to select compute devices automatically [7,8]. They present task distribution algorithms for heterogeneous processor environments that are optimized for specific purposes, such as maximizing performance or maximizing performance per watt. However, they cannot be applied to resource management on PCs because the requirements differ between PCs and HPC. Other approaches that differentiate tasks, such as the device-driver-level approach [9], would be a possibility for our goal. However, we still need a system-wide resource manager that considers heterogeneous processors and applications. The resource manager has three requirements that are specific to PCs.

– Considering user preference
A PC user's preferences often change, and they are not simple objectives such as maximizing performance. In addition, it is difficult to recognize which application is really important, because users rarely specify process priority explicitly. Therefore, we have to build a resource manager that infers the user's preference by collecting PC utilization status and chooses a compute device for each application to accurately satisfy that preference.

– Supporting various hardware configurations
There are plenty of PC hardware components and applications, so the combinations of hardware components and applications are innumerable. In addition, the specifications of components depend on technology trends. For instance, some new GPU virtualization technologies for PCs, such as Virtu GPU virtualization [11], seamlessly use a discrete GPU when specific APIs are called. Thus, we have to build a resource manager that supports various hardware configurations.

– Supporting various runtime versions
The runtime libraries for parallel computing installed on PCs may vary. Application execution speed depends not only on hardware but also on runtime libraries such as OpenCL frameworks. Thus, a compute device selection algorithm optimized for a specific runtime version, such as one designed for HPC, may not show good results on newer versions of the runtime libraries. We have to build compute device selection algorithms that do not depend on a specific runtime version.

This resource manager has three features to satisfy the requirements explained above. The first feature is information gathering. PACUE collects information about how the PC is utilized, such as whether an AC adapter is connected, the temperatures and voltages of components, and processor utilization such as processor loads and the list of running applications. The second feature is user preference inference. The user describes their requirements by creating several requirement patterns. PACUE infers which pattern is the best for the present situation by using the information acquired in the first step. The third feature is compute device selection, which decides the OpenCL device to be assigned to each application. We plan to implement a few compute device selection algorithms for several user preference patterns. PACUE will assign compute devices to each application based on the algorithm that matches the inferred pattern of user preference.

The resource manager works in cycles of these steps:
1. Collect PC utilization information.
2. Guess which profile is the best for the present condition.
3. Wait for an inquiry from an application and answer which device should be used.

For evaluation purposes, we built a basic resource manager that has a communication function to instruct applications to utilize a specific compute device. Because it lacks user-preference-based compute device selection algorithms, the current PACUE can select a compute device only by manual selection in the resource manager GUI. Still, it can receive a compute device selection inquiry and answer with a compute device to utilize.

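The collect, infer, and answer cycle can be sketched as follows. The paper leaves the preference-based selection algorithms to future work, so everything here is a hypothetical illustration: the `Status` fields, the `Pattern` structure, the first-match rule, and the application names are assumptions, not the PACUE design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Status:
    ac_connected: bool
    gpu_load: float          # 0.0 to 1.0
    running_apps: List[str]


@dataclass
class Pattern:
    name: str
    matches: Callable[[Status], bool]  # does this pattern fit the situation?
    device_for: Dict[str, str]         # application name -> "CPU" or "GPU"
    default_device: str = "CPU"


class ResourceManager:
    def __init__(self, patterns: List[Pattern], collect: Callable[[], Status]):
        self.patterns = patterns
        self.collect = collect

    def infer_pattern(self, status: Status) -> Pattern:
        # Pick the first user-defined pattern that matches; fall back to the last.
        for pattern in self.patterns:
            if pattern.matches(status):
                return pattern
        return self.patterns[-1]

    def answer(self, app_name: str) -> str:
        # One cycle for one inquiry: collect, infer, then answer the application.
        status = self.collect()
        pattern = self.infer_pattern(status)
        return pattern.device_for.get(app_name, pattern.default_device)
```

A usage example under the same assumptions: with a "gaming" pattern that matches while `game.exe` runs on AC power and maps `encoder.exe` to the CPU, followed by a catch-all pattern that maps `encoder.exe` to the GPU, the manager answers "CPU" for the encoder while the game is running and "GPU" once it has exited.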
3 Evaluation

In this section, we confirm that PACUE provides compute device redirection for widely used applications without modifying them. We first state the evaluation policy, then show and analyze the results.

3.1 Evaluation Policy

We evaluate PACUE on a PC with an Intel Core i7-920 CPU and an AMD RADEON HD 4850 GPU. As the OpenCL framework, we adopt the x86 binary of ATI Stream SDK 2.2 [4]. This framework supports both CPUs and AMD RADEON GPUs as OpenCL devices. As test applications, we chose the following. They are publicly released and widely used for benchmarking, and thus suit our purpose.

– DirectCompute & OpenCL Benchmark [1]
– SiSoftware Sandra 2011 [15]
– Sample code of the "OpenCL Introduction" book [3]

We switch the device utilized by these applications, and compare the device switching methods for each of them.

3.2 Results

DirectCompute & OpenCL Benchmark. Table 2 shows the results. PACUE can redirect the compute device perfectly on DirectCompute & OpenCL Benchmark, but only with method D.

SiSoftware Sandra 2011. Device switching failed. When PACUE tried to switch the device, Sandra 2011 exhibited strange behavior, such as showing the same device twice in the GUI. Because Sandra 2011 is an information and diagnostic utility for PCs, it gathers device information through various APIs. Thus, the failure may be caused by inconsistency between the device information gathered through the OpenCL APIs hooked by PACUE and the information gathered through other APIs. However, PACUE did not make Sandra crash.

Table 2. Result of DirectCompute & OpenCL Benchmark ("-" marks a step with no override)

Override method | A-1 | A-2 | B-1 | B-2 | C-1 | C-2 | D-1 | D-2
Specified device type | CPU | GPU | - | - | - | - | ALL | ALL
Specified device ID for context | - | - | CPUs | GPUs | ALL | ALL | ALL | ALL
Specified device ID for command queue | - | - | - | - | CPU | GPU | CPU | GPU
Application-recognized devices | CPU*2 | GPU*2 | CPU*1 | GPU*1 | CPU*1 | CPU*1 | CPU*1+GPU*1 | CPU*1+GPU*1
Dynamic device switching | Impossible | Impossible | Static | Static | Static | Static | Dynamic | Dynamic

Sample Codes of the "OpenCL Introduction" Book. These codes are a set of 20 sample applications of OpenCL APIs. Device switching succeeded for all of them. However, one sample uses device memory information to optimize its array size, so its result might depend on the device. Completely camouflaged device information might thus be incompatible with the information the sample expects. This can cause the application to crash or produce errors; however, it appeared to work correctly during the experiment.

3.3 Analysis

The results show that PACUE can switch compute devices on real applications. However, it fails for device-dependent applications. These use detailed information about the particular device, such as device memory size. Thus, they may crash or behave strangely because of the information camouflaged by PACUE.

Among the combinations of device information overriding, we found the proper order in which to apply them to applications. As shown in Table 1, these methods trade off application stability against application compatibility. In our evaluation, we found that the complete camouflaging method significantly increases compatibility for real applications, such as DirectCompute & OpenCL Benchmark. However, it does so by giving applications the information of the device the application specified, instead of the information of the device actually in use. The original application's creator is the only one who knows whether the application works correctly under the complete camouflaging method; thus, we should avoid using this risky method if possible. In general, we suggest applying the methods in the following order:

1. Override the device type with ALL and override the device ID when creating the context. (Table 1 B)
2. Override the device type with ALL and override the device ID when creating the command queue. (Table 1 D)
3. Keep the original device type and override the device ID when creating the command queue. (Table 1 C)
4. Override the device type with CPU or GPU when the application requests the list of available devices. (Table 1 A)

The first three methods all realize dynamic device selection. Methods higher in the list are safer; methods lower in the list are more compatible. Applications that cannot switch devices with the first method should use the second or the third. The last method has the highest compatibility, but it provides only static and restrictive device switching. Thus, it should be applied only when all other methods fail.

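The suggested application order amounts to a simple fallback loop, sketched below in Python. The function name `first_working_method` and the `try_method` callback (which would report whether a given method managed to switch the device for one application) are hypothetical; the paper does not describe how success is detected.

```python
from typing import Callable, Optional

# Order suggested in the text: safest first, most compatible last.
METHOD_ORDER = ["B", "D", "C", "A"]


def first_working_method(try_method: Callable[[str], bool]) -> Optional[str]:
    """Return the first method in METHOD_ORDER for which try_method succeeds."""
    for method in METHOD_ORDER:
        if try_method(method):
            return method
    return None  # no method could switch the device
```

For an application on which only method D succeeds, as reported for DirectCompute & OpenCL Benchmark, a probe such as `lambda m: m == "D"` makes the function return "D" after B has been tried and rejected.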
4 Conclusions and Future Work

In this paper we presented PACUE. First, PACUE switches compute devices dynamically for applications on PCs with heterogeneous processors. Second, PACUE chooses the compute devices assigned to applications to meet the user's requirements. We conducted experiments with our implementation, and demonstrated that 1 out of 2 real OpenCL applications and all of 20 sample programs can change the compute device dynamically with the dynamic compute device redirector. In addition, we showed that a few device information camouflaging methods significantly increase application compatibility. From this work, we demonstrated the potential feasibility of dynamic compute device redirection without modifying applications.

However, PACUE has two technical disadvantages. The first is that PACUE can switch devices only when a command queue is created. This is because OpenCL has no support for dynamic device switching, so the opportunities for switching devices are limited. We will investigate other methods to expand these opportunities, and we will also investigate how frequently other APIs offer device switching opportunities. The second disadvantage concerns OpenCL kernel optimization. Because of device information camouflaging, kernels designed for other devices may be executed. This may decrease performance significantly, so we should avoid such situations. One answer is caching every type of kernel source code through API hooking and switching between them according to the device actually in use. Another answer is applying a just-in-time OpenCL code optimization technique to improve performance. However, both may infringe copyright law or the licenses of the applications. Therefore, they may be difficult to apply to PC applications. For this reason, we will continue improving the camouflage methods and avoid showing differing device information as far as we can.

For our research goals, we have the following ongoing work:

Increase Compatibility for Applications. We will address the problem that PACUE cannot switch compute devices in some applications. We will also conduct stability tests on applications.

Evaluate in Many Hardware Environments. We will conduct experiments on more hardware configurations, such as Virtu, and improve the hardware support of PACUE.

Implement the User Preferences Handler in the Resource Manager. We assume that there are several patterns describing user-predefined requirements (e.g., playing an important game with the AC adaptor connected, or hasty file compression with unremarkable video encoding). PACUE infers the matching pattern from the user's activity and resource utilization.

Implement Compute Device Selecting Algorithm. With user requirement recognition, we select compute devices to follow user preference accurately. We will implement some algorithms and parameter sets for each user requirement pattern. Also, we will explore the performance impact of redirecting compute devices in real applications and take measures against heavy performance degradation. Showing applications no OpenCL device by overriding OpenCL APIs can be one of the answers. In this case, applications will use their internal optimized assembly code to execute their work, which is often much faster than executing OpenCL code on CPUs. However, this has the disadvantage that the compute device cannot be changed until the application is restarted, because the application will never call the OpenCL APIs again. Therefore, we will investigate each application's behavior concretely to decide how to make it use CPUs.

Support for Other Parallel Computing Frameworks. We plan to implement modules for other APIs such as Fusion System Architecture Intermediate Layer Language (FSAIL).

References

1. DirectCompute & OpenCL Benchmark, http://www.ngohq.com/graphic-cards/16920-directcompute-and-opencl-benchmark.html (accessed on August 21, 2011)
2. OpenCL 1.1 Specification, http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
3. Fixstars Corporation: OpenCL Introduction - Parallel Programming for Multicore CPUs and GPUs. Impress Japan (January 2010) (in Japanese)
4. AMD. ATI Stream Technology, http://www.amd.com/US/PRODUCTS/TECHNOLOGIES/STREAM-TECHNOLOGY/Pages/stream-technology.aspx (accessed on August 21, 2011)
5. Aoki, R., Oikawa, S., Tsuchiyama, R., Nakamura, T.: Hybrid OpenCL: Connecting different OpenCL implementations over network. In: Proc. IEEE CIT 2010, pp. 2729–2735 (2010)
6. Brodman, J.C., Fraguela, B.B., Garzaran, M.J., Padua, D.: New abstractions for data parallel programming. In: Proc. USENIX HotPar, p. 16 (2009)
7. Diamos, G.F., Yalamanchili, S.: Harmony: an execution model and runtime for heterogeneous many core systems. In: Proc. ACM HPDC, pp. 197–200 (2008)
8. Gupta, V., Schwan, K., Tolia, N., Talwar, V., Ranganathan, P.: Pegasus: Coordinated Scheduling for Virtualized Accelerator-based Systems. In: Proc. USENIX ATC, pp. 31–44 (2011)
9. Kato, S., Lakshmanan, K., Rajkumar, R., Ishikawa, Y.: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments. In: Proc. USENIX ATC, pp. 17–30 (2011)
10. Liu, W., Lewis, B., Zhou, X., Chen, H., Gao, Y., Yan, S., Luo, S., Saha, B.: A balanced programming model for emerging heterogeneous multicore systems. In: Proc. USENIX HotPar, p. 3 (2010)
11. Lucidlogix. Lucidlogix Virtu, http://www.lucidlogix.com/product-virtu.html (accessed on August 21, 2011)
12. Microsoft. CreateRemoteThread Function (Windows), http://msdn.microsoft.com/en-us/library/ms682437.aspx (accessed on August 21, 2011)
13. Microsoft. SetWindowsHookEx Function (Windows), http://msdn.microsoft.com/en-us/library/ms644990.aspx (accessed on August 21, 2011)
14. Microsoft Research. Detours - Microsoft Research, http://research.microsoft.com/en-us/projects/detours/ (accessed on August 21, 2011)
15. SiSoftware. SiSoftware Zone, http://www.sisoftware.net/ (accessed on August 21, 2011)
