Emotion Recognition In The Wild Challenge 2013

Abhinav Dhall, Res. School of Computer Science, Australian National University, abhinav.dhall@anu.edu.au
Roland Goecke, Vision & Sensing Group, University of Canberra / Australian National University, roland.goecke@ieee.org
Jyoti Joshi, Vision & Sensing Group, University of Canberra, jyoti.joshi@canberra.edu.au
Michael Wagner, HCC Lab, University of Canberra / Australian National University, michael.wagner@canberra.edu.au
Tom Gedeon, Res. School of Computer Science, Australian National University, tom.gedeon@anu.edu.au

ABSTRACT
Emotion recognition is a very active field of research.
The Emotion Recognition In The Wild Challenge and Workshop (EmotiW) 2013 Grand Challenge consists of an audio-video based emotion classification challenge, which mimics real-world conditions. Traditionally, emotion recognition has been performed on laboratory-controlled data. While undoubtedly worthwhile at the time, such lab-controlled data poorly represents the environment and conditions faced in real-world situations. The goal of this Grand Challenge is to define a common platform for the evaluation of emotion recognition methods in real-world conditions. The database in the 2013 challenge is the Acted Facial Expression In Wild (AFEW), which has been collected from movies showing close-to-real-world conditions.
Categories and Subject Descriptors
I.6.3 [Pattern Recognition]: Applications; H.2.8 [Database Applications]: Image Databases; I.4.m [Image Processing and Computer Vision]: Miscellaneous

General Terms
Experimentation, Performance, Algorithms

Keywords
Audio-video data corpus, Facial expression
1. INTRODUCTION
Realistic face data plays a vital role in the research advancement of facial expression analysis. Much progress has
been made in the fields of face recognition and human activity recognition in the past years due to the availability of realistic databases as well as robust representation and classification techniques. With the increase in the number of video clips online, it is worthwhile to explore the performance of emotion recognition methods that work 'in the wild'.
Emotion recognition has traditionally been based on databases where the subjects posed a particular emotion [1][2]. With recent advancements in emotion recognition, various spontaneous databases have been introduced [3][4]. To provide a common platform for emotion recognition researchers, challenges such as the Facial Expression Recognition & Analysis (FERA) challenge [3] and the Audio Video Emotion Challenges 2011 [5] and 2012 [6] have been organised. These are based on spontaneous databases [3][4].
Emotion recognition methods can be broadly classified on the basis of the emotion labelling methodology. The early methods and databases [1][2] used the six universal emotions (angry, disgust, fear, happy, sad and surprise) plus contempt/neutral. Recent databases [4] use continuous labelling on the valence and arousal scales. Emotion recognition methods can also be categorised on the basis of the number of subjects in a sample. The majority of the research is based on a single subject [3] per sample. However, with the popularity of social media, users are uploading images and videos from social events, which contain groups of people. The task here then is to infer the emotion/mood of the group of people [7].
Emotion recognition methods can further be categorised by the type of environment: lab-controlled and 'in the wild'. Traditional databases, and the methods proposed on them, assume a lab-controlled environment. This generally means uncluttered (generally static) backgrounds, controlled illumination and minimal subject head movement. This is not a correct representation of real-world scenarios. Databases and methods which represent close-to-real-world environments (such as indoor, outdoor, different colour backgrounds, occlusion and background clutter) have recently been introduced. Acted Facial Expressions In The Wild (AFEW) [8], GENKI [9], Happy People Images (HAPPEI) [8] and Static Facial Expressions In The Wild (SFEW) [10] are recent emotion databases representing real-world scenarios.
For moving emotion recognition systems from the lab to the real world, it is important to define platforms where researchers can verify their methods on data representing close-to-real-world scenarios. The Emotion Recognition In The Wild (EmotiW) challenge aims to provide a platform for researchers to create, extend and verify their methods on real-world data.
The challenge seeks participation from researchers working on emotion recognition who intend to create, extend and validate their methods on data in real-world conditions. There are no separate video-only, audio-only, or audio-video challenges. Participants are free to use either modality or both. Results for all methods will be combined into one set in the end. Participants are allowed to use their own features and classification methods. The labels of the testing set are unknown. Participants will need to adhere to the definition of the training, validation and testing sets. In their papers, they may report results obtained on the training and validation sets, but only the results on the testing set will be taken into account for the overall Grand Challenge results.
Figure 1: The screenshot describes the process of database formation. For example, in the screenshot, when the subtitle contains the keyword 'laughing', the corresponding clip is played by the tool. The human labeller then annotates the subjects in the scene using the GUI tool. The resultant annotation is stored in the XML schema shown in the bottom part of the snapshot. Please note the structure of the information for a sequence containing multiple subjects. The image in the screenshot is from the movie 'Harry Potter and the Goblet of Fire'.
Ideally, one would like to collect spontaneous data. However, as anyone working in the emotion research community will testify, collecting spontaneous databases in real-world conditions is a tedious task. For this reason, current spontaneous expression databases, for example SEMAINE, have been recorded in laboratory conditions. To overcome this limitation and the lack of available data with real-world or close-to-real-world conditions, the AFEW database has been recorded: a temporal database containing video clips collected by searching closed-caption keywords and then validated by human annotators. AFEW forms the basis of the EmotiW challenge. While movies are often shot in somewhat controlled environments, they provide close-to-real-world environments that are much more realistic than current datasets that were recorded in lab environments. We are not claiming that the AFEW database is a spontaneous facial expression database. However, clearly, (good) actors attempt to mimic real-world human behaviour in movies. The dataset in particular addresses the issue of emotion recognition in difficult conditions that approximate real-world conditions, which provides a much more difficult test set than currently available datasets.
It is evident from the experiments in [8] that automated facial expression analysis in the wild is a tough problem due to various limitations, such as robust face detection and alignment, and environmental factors such as illumination, head pose and occlusion. Similarly, recognising vocal expression of affect in real-world conditions is equally challenging. Moreover, as the data has been captured from movies, there are many different scenes with very different environmental conditions in both audio and video, which will provide a challenging test bed for state-of-the-art algorithms, unlike the same scene/backgrounds in lab-controlled data.
Therefore, it is worthwhile to investigate the applicability of multimodal systems for emotion recognition in the wild. There has been much research on audio-only, video-only and, to some extent, audio-video multimodal systems, but for translating emotion recognition systems from laboratory environments to the real world, multimodal benchmarking standards are required.
2. DATABASE CONSTRUCTION PROCESS
Databases such as CK+, MMI and SEMAINE have been collected manually, which makes the process of database construction long and error-prone. The complexity of database collection increases further with the intent to capture different scenarios (which can represent a wide variety of real-world scenes). For constructing AFEW, a semi-automatic approach is followed [8]. The process is divided into two steps. First, subtitles from the movies, using both the Subtitles for the Deaf and Hearing impaired (SDH) and Closed Captions (CC), are analysed. They contain information about the audio and non-audio context, such as emotions, information about the actors and the scene, for example '[SMILES]', '[CRIES]', '[SURPRISED]', etc.
The subtitles are extracted from the movies using a tool called VSRip¹. For the movies where VSRip could not extract subtitles, SDH subtitles are downloaded from the internet². The extracted subtitle images are parsed using Optical Character Recognition (OCR) and converted into the .srt subtitle format³. The .srt format contains the start time, end time and text content with millisecond accuracy.

Table 1: Comparison of databases; AFEW forms the basis of the EmotiW 2013 challenge.
Database         | Challenge | Natural               | Label      | Environment | Subjects per sample | Construction process
AFEW [8]         | EmotiW    | Spontaneous (partial) | Discrete   | Wild        | Single & multiple   | Semi-automatic
Cohn-Kanade+ [1] | -         | Posed                 | Discrete   | Lab         | Single              | Manual
GEMEP-FERA [3]   | FERA      | Spontaneous           | Discrete   | Lab         | Single              | Manual
MMI [2]          | -         | Posed                 | Discrete   | Lab         | Single              | Manual
SEMAINE [4]      | AVEC      | Spontaneous           | Continuous | Lab         | Single              | Manual

Table 2: Attributes of the AFEW database.
Length of sequences: 300-5400 ms
No. of sequences: 1832 (AFEW 3.0); EmotiW: 1088
No. of annotators: 2
Expression classes: Angry, Disgust, Fear, Happy, Neutral, Sad and Surprise
Total no. of expressions: 2153 (AFEW 3.0; some sequences have multiple subjects); EmotiW: 1088
Video format: AVI
Audio format: WAV

Figure 2: The annotation attributes in the database metadata; the XML snippet is an example of the annotations for a video sequence. Please note that the expression tag information was removed in the XML metadata distributed with the EmotiW data.
The system performs a regular expression search with keywords⁴ describing expressions and emotions on the subtitle file (a code sketch of this search is given after the footnotes). This gives a list of subtitles with time stamps which contain information about some expression. The extracted subtitles containing expression-related keywords were then played by the tool. The duration of each clip is equal to the time period for which the subtitle appears on the screen. The human observer then annotated the played video clips with information about the subjects⁵ and expressions. Figure 1 describes the process. In the case of video clips with multiple actors, the sequence of labelling was based on two criteria. For actors appearing in the same frame, the ordering of annotation is left to right. If the actors appear at different time stamps, then it is in the order of appearance. However, the data in the challenge contains videos with a single subject only. The labelling is then stored in the XML metadata schema. Finally, the human observer estimated the age of the character in most of the cases, as the age of all characters in a particular movie is not available on the internet. The database version 3.0 contains information from 75 movies⁶.

¹ VSRip (http://www.videohelp.com/tools/VSRip) extracts .sub/.idx from DVD movies.
² The SDH subtitles were downloaded from www.subscene.com, www.mysubtitles.org and www.opensubtitles.org.
³ Subtitle Edit, available at www.nikse.dk/se, is used.
⁴ Keyword examples: [HAPPY], [SAD], [SURPRISED], [SHOUTS], [CRIES], [GROANS], [CHEERS], etc.
⁵ The information about the actors was extracted from www.imdb.com.
⁶ The seventy-five movies used in the database are: 21, About a Boy, American History X, And Soon Came the Darkness, Black Swan, Bridesmaids, Change Up, Chernobyl Diaries, Crying Game, Curious Case of Benjamin Button, December Boys, Deep Blue Sea, Descendants, Did You Hear About the Morgans, Dumb and Dumberer: When Harry Met Lloyd, Four Weddings and a Funeral, Friends with Benefits, Frost/Nixon, Ghost Ship, Girl with a Pearl Earring, Hall Pass, Halloween, Halloween Resurrection, Harry Potter and the Philosopher's Stone, Harry Potter and the Chamber of Secrets, Harry Potter and the Deathly Hallows Part 1, Harry Potter and the Deathly Hallows Part 2, Harry Potter and the Goblet of Fire, Harry Potter and the Half-Blood Prince, Harry Potter and the Order of the Phoenix, Harry Potter and the Prisoner of Azkaban, I Am Sam, It's Complicated, I Think I Love My Wife, Jennifer's Body, Juno, Little Manhattan, Margot at the Wedding, Messengers, Miss March, Nanny Diaries, Notting Hill, Oceans Eleven, Oceans Twelve, Oceans Thirteen, One Flew Over the Cuckoo's Nest, Orange and Sunshine, Pretty in Pink, Pretty Woman, Pursuit of Happiness, Remember Me, Revolutionary Road, Runaway Bride, Saw 3D, Serendipity, Solitary Man, Something Borrowed, Terms of Endearment, There Is Something About Mary, The American, The Aviator, The Devil Wears Prada, The Hangover, The Haunting of Molly Hartley, The Informant!, The King's Speech, The Pink Panther 2, The Social Network, The Terminal, The Town, Valentine Day, Unstoppable, Wrong Turn 3, You've Got Mail.
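To make the keyword-search step concrete, the following sketch shows one possible implementation of the subtitle scan described above: it parses an .srt file and applies a regular-expression search for expression keywords, returning the time stamps of matching subtitles. The .srt sample, the keyword list and the function names are illustrative assumptions, not the organisers' actual tool.

import re

# A toy .srt snippet in the format described above
# (start time --> end time, then the subtitle text).
SRT_SAMPLE = """1
00:01:02,500 --> 00:01:04,000
[LAUGHING]

2
00:01:10,250 --> 00:01:12,900
Hello there.
"""

# Example expression keywords, as listed in footnote 4.
KEYWORDS = ["HAPPY", "SAD", "SURPRISED", "SHOUTS", "CRIES", "GROANS", "CHEERS", "LAUGHING"]
KEYWORD_RE = re.compile(r"\[(?:" + "|".join(KEYWORDS) + r")\]", re.IGNORECASE)

# Matches one .srt block: index, "start --> end", then the text lines.
BLOCK_RE = re.compile(
    r"(\d+)\s*\n(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n(.*?)(?:\n\n|\Z)",
    re.DOTALL)

def find_expression_clips(srt_text):
    """Return (start, end, text) for subtitles containing expression keywords."""
    clips = []
    for _idx, start, end, text in BLOCK_RE.findall(srt_text):
        if KEYWORD_RE.search(text):
            clips.append((start, end, text.strip()))
    return clips

if __name__ == "__main__":
    # Each match defines a candidate clip whose duration equals the
    # on-screen duration of the subtitle, as in the AFEW pipeline.
    for start, end, text in find_expression_clips(SRT_SAMPLE):
        print(f"{start} -> {end}: {text}")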
2.1 Database Annotations
The human labelers densely annotated the subjects in the clips. Figure 2 displays the annotations in the database. The details of the schema elements are described as follows:
StartTime - Denotes the start time stamp of the clip in the movie DVD, in the hh:mm:ss,zzz format.
Length - The duration of the clip in milliseconds.
Person - Contains various attributes describing the actor in the scene, as follows:
Pose - The pose of the actor, based on the human labeler's observation.
AgeOfCharacter - The age of the character, based on the human labeler's observation. In a few cases, the age of the character available on www.imdb.com was used, but this was frequent in the case of lead actors only.
NameOfActor - The real name of the actor.
AgeOfActor - The real age of the actor. The information was extracted from www.imdb.com by the human labeler. In very few cases the age information was missing for some actors, therefore the observational values were used.
Gender - The gender of the actor, again entered by the human labeler.
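As an illustration of the schema elements above, the snippet below builds one hypothetical clip record with Python's ElementTree. The element and attribute names follow the description above, but the exact layout and all values are assumptions; the expression tags were removed from the distributed metadata in any case.

import xml.etree.ElementTree as ET

# Build a hypothetical clip entry following the schema elements described
# above: StartTime, Length, and one Person with its annotation attributes.
clip = ET.Element("Clip")
ET.SubElement(clip, "StartTime").text = "01:23:45,250"   # hh:mm:ss,zzz
ET.SubElement(clip, "Length").text = "2400"              # milliseconds

ET.SubElement(clip, "Person", {
    "Pose": "frontal",           # labeler's observation
    "AgeOfCharacter": "35",      # observed, or from www.imdb.com for leads
    "NameOfActor": "Jane Doe",   # hypothetical actor name
    "AgeOfActor": "38",          # from www.imdb.com
    "Gender": "female",
})

# Serialise the record as it might appear in the metadata file.
print(ET.tostring(clip, encoding="unicode"))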
3. EMOTIW DATA PARTITIONS
The challenge data is divided into three sets: 'Train', 'Val' and 'Test'. The Train, Val and Test sets contain 380, 396 and 312 clips, respectively. The AFEW 3.0 dataset contains 1832 clips; for the EmotiW challenge, 1088 clips are extracted. The data is subject independent and the sets contain clips from different movies. The motivation behind partitioning the data in this manner is to test methods on unseen-scenario data, which is common on the web. For the participants in the challenge, the labels of the testing set are unknown. The details about the subjects are described in Table 3.
4. VISUAL ANALYSIS
For face and fiducial point detection, the Mixture of Parts (MoPS) framework [11] is applied to the video frames. MoPS represents the parts of an object as a graph with n vertices V = {v_1, ..., v_n} and a set of edges E. Here, each edge (v_i, v_j) ∈ E encodes the spatial relationship between parts i and j. A face is represented as a tree graph here. Formally speaking, for a given image I, the MoPS framework computes a score for the configuration L = {l_i : i ∈ V} of parts based on two models: an appearance model and a spatial prior model. We follow the mixture-of-parts formulation of [12].

The Appearance Model scores the confidence of a part-specific template w_p applied to a location l_i. Here, p is a view-specific mixture corresponding to a particular head pose. \phi(I, l_i) is the histogram of oriented gradients descriptor [13] extracted at location l_i. Thus, the appearance model calculates a score for a configuration L and image I as:

App_p(I, L) = \sum_{i \in V_p} w_i^p \cdot \phi(I, l_i)    (1)

The Shape Model learns the kinematic constraints between each pair of parts. The shape model (as in [12]) is defined as:

Shape_p(L) = \sum_{ij \in E_p} a_{ij}^p \, dx^2 + b_{ij}^p \, dx + c_{ij}^p \, dy^2 + d_{ij}^p \, dy    (2)

Here, dx and dy represent the spatial distance between two parts, and a, b, c and d are the parameters corresponding to the location and rigidity of a spring, respectively. From Eqs. 1 and 2, the scoring function is:

Score(I, L, p) = App_p(I, L) + Shape_p(L)    (3)

During the inference stage, the task is to maximise Eq. 3 over the configuration L and the mixture p (which represents a pose). The fiducial points are used to align the faces. Further, spatio-temporal features are extracted from the aligned faces. The aligned faces are shared with the participants. Along with MoPS, aligned faces computed by the method of Gehrig and Ekenel [14] are also shared.

Table 3: Subject description of the three sets.
Set   | Num of subj. | Max age | Avg age | Min age | Males | Females
Train | 99           | 76y     | 32.8y   | 10y     | 60    | 39
Val   | 126          | 70y     | 34.3y   | 10y     | 71    | 55
Test  | 90           | 70y     | 36.7y   | 8y      | 50    | 40
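A minimal numeric sketch of the scoring in Eqs. 1-3 is given below, assuming HOG descriptors and spring parameters are already available. It is meant only to show how the appearance and shape terms combine into one score; it is not the actual MoPS implementation of [12], whose tree-structured dynamic-programming inference is omitted.

import numpy as np

def appearance_score(templates, feats):
    """Eq. 1: sum of template responses w_i^p . phi(I, l_i) over the parts.

    templates: list of weight vectors w_i^p, one per part
    feats:     list of HOG descriptors phi(I, l_i) at the part locations
    """
    return sum(float(np.dot(w, f)) for w, f in zip(templates, feats))

def shape_score(edges, locations, params):
    """Eq. 2: quadratic spring costs over the tree edges.

    edges:     list of (i, j) part-index pairs
    locations: sequence of (x, y) part locations l_i
    params:    dict mapping (i, j) -> (a, b, c, d) spring parameters
    """
    total = 0.0
    for i, j in edges:
        dx = locations[i][0] - locations[j][0]
        dy = locations[i][1] - locations[j][1]
        a, b, c, d = params[(i, j)]
        total += a * dx**2 + b * dx + c * dy**2 + d * dy
    return total

def mops_score(templates, feats, edges, locations, params):
    """Eq. 3: Score(I, L, p) = App_p(I, L) + Shape_p(L)."""
    return appearance_score(templates, feats) + shape_score(edges, locations, params)

At inference, this score would be maximised over all configurations L and mixtures p; the tree structure of the face graph is what makes that search tractable.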
4.1 Volume Local Binary Patterns
Local Binary Pattern - Three Orthogonal Planes (LBP-TOP) [15] is a popular descriptor in computer vision. It considers patterns in three orthogonal planes, XY, XT and YT, and concatenates the pattern co-occurrences in these three directions. The descriptor assigns binary labels to pixels by thresholding the neighbourhood pixels against the central value. Therefore, for a centre pixel O_p of an orthogonal plane O and its neighbouring pixels N_i, a decimal value d is assigned:

d = \sum_{O \in \{XY, XT, YT\}} \sum_{i=1}^{k} 2^{i-1} \, I(O_p, N_i)    (4)

LBP-TOP is computed block-wise on the aligned faces of a video.
5. AUDIO FEATURES
In this challenge, a set of audio features similar to the features employed in the Audio Video Emotion Recognition Challenge 2011 [16], motivated by the INTERSPEECH 2010 Paralinguistic Challenge (1582 features) [17], is used. The features are extracted using the open-source Emotion and Affect Recognition (openEAR) [18] toolkit's backend openSMILE [19]. The feature set consists of 34 energy & spectral related low-level descriptors (LLD) x 21 functionals, 4 voicing-related LLD x 19 functionals, 34 delta coefficients of the energy & spectral LLD x 21 functionals, 4 delta coefficients of the voicing-related LLD x 19 functionals, and 2 voiced/unvoiced durational features. Tables 4 and 5 describe the details of the LLD features and functionals.

Table 4: Audio feature set - 38 (34 + 4) low-level descriptors.
Energy/Spectral LLD: PCM loudness; MFCC [0-14]; log Mel frequency band [0-7]; line spectral pairs (LSP) frequency [0-7]; F0; F0 envelope
Voicing-related LLD: voicing probability; jitter local; jitter of consecutive frame pairs; shimmer local

Table 5: Set of functionals applied to the LLD.
Arithmetic mean; standard deviation; skewness; kurtosis; quartiles; quartile ranges; percentile 1%, 99%; percentile range; position of max./min.; up-level time 75/90; linear regression coefficients; linear regression error (quadratic/absolute)

Table 6: Classification accuracy (in %) for the Val and Test sets for the audio, video and audio-video modalities.
                 | Angry | Disgust | Fear  | Happy | Neutral | Sad   | Surprise | Overall
Val audio        | 42.37 | 12.00   | 25.93 | 20.97 | 12.73   | 14.06 | 9.62     | 19.95
Test audio       | 44.44 | 20.41   | 27.27 | 16.00 | 27.08   | 9.30  | 5.71     | 22.44
Val video        | 44.00 | 20.00   | 14.81 | 43.55 | 34.55   | 20.31 | 9.62     | 27.27
Test video       | 50.00 | 12.24   | 0.00  | 48.00 | 18.75   | 6.97  | 5.71     | 22.75
Val audio-video  | 44.07 | 0.00    | 5.56  | 25.81 | 63.64   | 7.81  | 5.77     | 22.22
Test audio-video | 66.67 | 0.00    | 6.06  | 16.00 | 81.25   | 0.00  | 2.86     | 27.56
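As a rough illustration of this LLD-plus-functionals design, the sketch below computes a few of the Table 5 functionals over a toy F0 contour with NumPy/SciPy. The real 1582-dimensional set is produced by the openSMILE configuration of [17], not by this code.

import numpy as np
from scipy import stats

def functionals(lld):
    """Apply a subset of the Table 5 functionals to one LLD contour."""
    frames = np.arange(len(lld))
    slope, intercept = np.polyfit(frames, lld, 1)      # linear regression coeff.
    residuals = lld - (slope * frames + intercept)
    return {
        "mean": np.mean(lld),
        "std": np.std(lld),
        "skewness": stats.skew(lld),
        "kurtosis": stats.kurtosis(lld),
        "quartile1": np.percentile(lld, 25),
        "quartile3": np.percentile(lld, 75),
        "percentile_range_1_99": np.percentile(lld, 99) - np.percentile(lld, 1),
        "pos_max": int(np.argmax(lld)),                # position of maximum
        "regression_slope": slope,
        "regression_error_quadratic": float(np.mean(residuals ** 2)),
    }

# Toy example: a synthetic F0 contour (one LLD) over 100 frames.
f0 = 120 + 10 * np.sin(np.linspace(0, 3 * np.pi, 100))
feature_vector = functionals(f0)
print(len(feature_vector), "functionals; mean =", feature_vector["mean"])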
6. BASELINE EXPERIMENTS
For computing the baseline results, openly available libraries are used. The pre-trained face models (Face_p146_small, Face_p99 and MultiPIE_1050) available with the MoPS package⁷ were applied for face and fiducial point detection. The models are applied in a hierarchy.
The fiducial points generated by MoPS are used for aligning the face, and the face size is set to 96x96 pixels. After alignment, LBP-TOP features are extracted from non-overlapping 4x4 spatial blocks. The LBP-TOP features from each block are concatenated to create one feature vector. A non-linear SVM is learnt for emotion classification. The video-only baseline system achieves 27.2% classification accuracy on the Val set.
The audio baseline is computed by extracting features using the openSMILE toolkit. A linear SVM classifier is learnt. The audio-only system gives 19.5% classification accuracy on the Val set. Further, a feature-level fusion is performed, where the audio and video features are concatenated and a non-linear SVM is learnt. The performance drops here: the classification accuracy is 22.2%. On the Test set, which contains 312 video clips, audio only gives 22.4%, video only gives 22.7% and feature fusion gives 27.5%. Table 6 describes the classification accuracy on the Val and Test sets for the audio, video and audio-video systems. For the Test set, the feature fusion increases the performance of the system. However, the same is not true for the Val set. The confusion matrices for Val and Test are given in Table 7 (Val audio), Table 8 (Val video), Table 9 (Val audio-video), Table 10 (Test audio), Table 11 (Test video) and Table 12 (Test audio-video).

⁷ http://www.ics.uci.edu/~xzhu/face/

Table 7: Val audio: Confusion matrix describing the performance of the audio subsystem on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 25 | 10 | 7  | 6  | 1  | 4  | 6
Di | 13 | 6  | 4  | 9  | 7  | 5  | 6
Fe | 12 | 8  | 14 | 6  | 4  | 8  | 2
Ha | 20 | 3  | 8  | 13 | 8  | 4  | 6
Ne | 8  | 10 | 5  | 16 | 7  | 6  | 3
Sa | 12 | 15 | 12 | 6  | 2  | 9  | 8
Su | 14 | 7  | 7  | 7  | 8  | 4  | 5

Table 8: Val video: Confusion matrix describing the performance of the video subsystem on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 26 | 0  | 2  | 6  | 8  | 11 | 6
Di | 15 | 10 | 4  | 6  | 7  | 7  | 1
Fe | 18 | 3  | 8  | 5  | 6  | 5  | 9
Ha | 20 | 1  | 5  | 27 | 3  | 5  | 1
Ne | 8  | 5  | 7  | 7  | 19 | 2  | 7
Sa | 15 | 3  | 4  | 6  | 13 | 13 | 10
Su | 11 | 5  | 4  | 8  | 11 | 8  | 5
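The fusion baseline lends itself to a short sketch: concatenate the per-clip audio and video feature vectors and train an SVM, as below with scikit-learn. The RBF kernel, the random stand-in data and all names are illustrative assumptions; the organisers' exact SVM settings are not specified beyond linear (audio) and non-linear (video, fusion).

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_train, n_val = 380, 396                   # clip counts of the Train and Val sets

# Stand-ins for the real per-clip descriptors: LBP-TOP video features and
# openSMILE audio functionals (the dimensionalities here are placeholders).
video_train, video_val = rng.normal(size=(n_train, 768)), rng.normal(size=(n_val, 768))
audio_train, audio_val = rng.normal(size=(n_train, 1582)), rng.normal(size=(n_val, 1582))
y_train = rng.integers(0, 7, size=n_train)  # 7 emotion classes
y_val = rng.integers(0, 7, size=n_val)

# Feature-level fusion: concatenate the audio and video vectors per clip.
fused_train = np.hstack([audio_train, video_train])
fused_val = np.hstack([audio_val, video_val])

# Non-linear SVM (RBF kernel assumed) on the fused features.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(fused_train, y_train)
print("fusion accuracy: %.3f" % clf.score(fused_val, y_val))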
The automated face localisation on the database is not always accurate, with a significant number of false positives and false negatives. This is attributed to the varied lighting conditions, occlusions, extreme head poses and complex backgrounds.
Table 9: Val audio-video: Confusion matrix describing the performance of the audio-video fusion system on the Val set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 26 | 1  | 2  | 7  | 17 | 3  | 3
Di | 4  | 0  | 0  | 14 | 30 | 1  | 1
Fe | 11 | 2  | 3  | 14 | 17 | 4  | 3
Ha | 11 | 0  | 2  | 16 | 30 | 2  | 1
Ne | 7  | 1  | 0  | 12 | 35 | 0  | 0
Sa | 7  | 0  | 2  | 17 | 28 | 5  | 5
Su | 2  | 0  | 3  | 7  | 33 | 4  | 3

7. CONCLUSION
The Emotion Recognition In The Wild (EmotiW) challenge is a platform for researchers to compete with their emotion recognition methods on 'in the wild' data. The audio-visual challenge data is based on the AFEW database. The labelled 'Train' and 'Val' sets were shared along with an unlabelled 'Test' set. Metadata containing information about the actor in each clip is shared with the participants. The performance of the different methods will be analysed for insight into the performance of state-of-the-art emotion recognition methods on 'in the wild' data.
Table 10: Test audio: Confusion matrix describing the performance of the audio subsystem on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 24 | 4  | 6  | 9  | 2  | 3  | 6
Di | 14 | 10 | 2  | 9  | 7  | 4  | 3
Fe | 8  | 4  | 9  | 2  | 4  | 2  | 4
Ha | 17 | 4  | 4  | 8  | 5  | 7  | 5
Ne | 6  | 8  | 6  | 7  | 13 | 6  | 2
Sa | 12 | 6  | 6  | 7  | 3  | 4  | 5
Su | 6  | 5  | 6  | 9  | 2  | 5  | 2

Table 11: Test video: Confusion matrix describing the performance of the video subsystem on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 27 | 3  | 3  | 4  | 6  | 4  | 7
Di | 14 | 6  | 4  | 7  | 6  | 4  | 8
Fe | 9  | 4  | 0  | 4  | 9  | 2  | 5
Ha | 9  | 5  | 1  | 24 | 1  | 4  | 6
Ne | 11 | 1  | 3  | 15 | 9  | 6  | 3
Sa | 8  | 3  | 3  | 11 | 10 | 3  | 5
Su | 7  | 5  | 6  | 5  | 7  | 3  | 2

Table 12: Test audio-video: Confusion matrix describing the performance of the audio-video fusion system on the Test set.
   | An | Di | Fe | Ha | Ne | Sa | Su
An | 36 | 0  | 1  | 2  | 14 | 0  | 1
Di | 13 | 0  | 1  | 15 | 18 | 1  | 1
Fe | 8  | 1  | 2  | 4  | 16 | 0  | 2
Ha | 12 | 1  | 2  | 8  | 22 | 1  | 4
Ne | 5  | 0  | 0  | 3  | 39 | 1  | 0
Sa | 16 | 1  | 1  | 8  | 13 | 0  | 4
Su | 10 | 1  | 2  | 10 | 9  | 2  | 1

8. REFERENCES
[1] Patrick Lucey, Jeffrey F. Cohn, Takeo Kanade, Jason Saragih, Zara Ambadar, and Iain Matthews. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In CVPR4HB'10, 2010.
[2] Maja Pantic, Michel Francois Valstar, Ron Rademaker, and Ludo Maat. Web-based database for facial expression analysis. In Proceedings of the IEEE International Conference on Multimedia and Expo, ICME'05, 2005.
[3] Michel Valstar, Bihan Jiang, Marc Mehu, Maja Pantic, and Klaus Scherer. The first facial expression recognition and analysis challenge. In Proceedings of the Ninth IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, FG'11, pages 314-321, 2011.
[4] Gary McKeown, Michel Francois Valstar, Roderick Cowie, and Maja Pantic. The SEMAINE corpus of emotionally coloured character interactions. In IEEE ICME, 2010.
[5] Björn Schuller, Michel Francois Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 - the first international audio/visual emotion challenge. In ACII (2), pages 415-424, 2011.
[6] Björn Schuller, Michel Valstar, Florian Eyben, Roddy Cowie, and Maja Pantic. AVEC 2012: the continuous audio/visual emotion challenge. In ICMI, pages 449-456, 2012.
[7] Abhinav Dhall, Jyoti Joshi, Ibrahim Radwan, and Roland Goecke. Finding happiest moments in a social context. In ACCV, 2012.
[8] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. A semi-automatic method for collecting richly labelled large facial expression databases from movies. IEEE Multimedia, 2012.
[9] Jacob Whitehill, Gwen Littlewort, Ian R. Fasel, Marian Stewart Bartlett, and Javier R. Movellan. Toward practical smile detection. IEEE TPAMI, 2009.
[10] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon. Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark. In ICCVW, BEFIT'11, 2011.
[11] P. F. Felzenszwalb and D. P. Huttenlocher. Pictorial structures for object recognition. IJCV, 2005.
[12] Xiangxin Zhu and Deva Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, pages 2879-2886, 2012.
[13] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886-893, 2005.
[14] Tobias Gehrig and Hazım Kemal Ekenel. A common framework for real-time emotion recognition and facial action unit detection. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on, pages 1-6. IEEE, 2011.
[15] Guoying Zhao and Matti Pietikäinen. Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
[16] Björn Schuller, Michel Valstar, Florian Eyben, Gary McKeown, Roddy Cowie, and Maja Pantic. AVEC 2011 - the first international audio/visual emotion challenge. In Affective Computing and Intelligent Interaction, pages 415-424. Springer Berlin Heidelberg, 2011.
[17] Björn Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, and Shrikanth S. Narayanan. The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH, pages 2794-2797, 2010.
[18] Florian Eyben, Martin Wöllmer, and Björn Schuller. openEAR - introducing the Munich open-source emotion and affect recognition toolkit. In Affective Computing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International Conference on, pages 1-6. IEEE, 2009.
[19] Florian Eyben, Martin Wöllmer, and Björn Schuller. openSMILE: the Munich versatile and fast open-source audio feature extractor. In ACM Multimedia, pages 1459-1462, 2010.
