Resonator googlevoice

googlevoice  Date: 2021-01-11
Chapter 1: Introduction to Speech Signal Processing

Outline
– The Speech Signal
– Speech Signal Processing
– Speech Production/Perception Model and the Speech Chain
– The Speech Stack
– Applications of Speech Signal Processing
– History of Speech Signal Processing

The Speech Signal
Speech is the vocalized form of human communication.
The fundamental purpose of speech is human communication, i.e., the transmission of messages between a speaker and a listener.
The fundamental analog form of the message is an acoustic waveform that we call the speech signal.
Speech signals can be
– converted to an electrical waveform by a microphone
– manipulated by analog/digital signal processing
– converted back to acoustic form by a loudspeaker/headphone

Software
– Praat: http://www.fon.hum.uva.nl/praat/
– CoolEdit Pro (Adobe Audition)

Speech Signal Processing
Speech signal processing
– converting one type of speech signal representation to another so as to uncover various mathematical or practical properties of the speech signal, and doing appropriate processing to aid in solving both fundamental and deep problems of interest
Purpose of speech signal processing
– to understand speech as a means of communication
– to represent speech for transmission and reproduction
– to analyze speech for automatic recognition and extraction of information
– to discover some physiological characteristics of the talker
Digital processing of speech signals (DPSS)
– obtaining discrete representations of the speech signal that preserve its information content and are convenient for transmission or storage
– theory, design and implementation of numerical procedures (algorithms) for processing the discrete representation in order to achieve a goal (recognizing the signal, modifying the time scale of the signal, removing background noise from the signal, etc.)
Advantages of DPSS
– reliability
– flexibility
– accuracy
– real-time implementations on inexpensive DSP chips
– ability to integrate with multimedia and data
– encryptability/security of the data and the data representations via suitable techniques

Speech Production Model
Message Formulation
– desire to communicate an idea, a wish, a request, …: express the message as a sequence of words
Language Code
– need to convert the chosen text string to a sequence of sounds in the language that can be understood by others
– need to give some form of emphasis, prosody (tune, melody) to the spoken sounds so as to impart non-speech information such as sense of urgency, importance, psychological state of the talker, environmental factors (noise, echo)
Neuro-Muscular Controls
– need to direct the neuro-muscular system to move the articulators (tongue, lips, teeth, jaws, velum (soft palate)) so as to produce the desired spoken message in the desired manner
Vocal Tract System
– need to shape the human vocal tract system and provide the appropriate sound sources to create an acoustic waveform (speech) that is understandable in the environment in which it is spoken

Speech Perception Model
– The acoustic waveform impinges on the ear (the basilar membrane) and is spectrally analyzed by an equivalent filter bank of the ear.
– The signal from the basilar membrane is neurally transduced and coded into features that can be decoded by the brain.
– The brain decodes the feature stream into sounds, words and sentences.
– The brain determines the meaning of the words via a message understanding mechanism.

The Speech Chain
Goal: find out if your office mate has had lunch
Text: "Did you eat yet"
Phonemes: "didyuityt"
Articulator dynamics: dIjitjt

Information Rate of Speech
– Text (discrete): 2^5 symbols (5 bits per symbol) at 10 symbols/s -> 50 bps
– Phonemes & prosody (discrete): 200 bps
– Articulatory motions (continuous): relatively slow movement of articulators, ~2000 bps
– Acoustic waveform (continuous): 64,000 bps to 705,600 bps (for example, 8 kHz x 8 bits up to 44.1 kHz x 16 bits)

The Speech Stack

Speech Science
Linguistics: science of language, including syntax, semantics, phonetics, phonology, etc.
Syntax: analysis and description of the grammatical structure of a body of textual material
Semantics: analysis and description of the meaning of a body of textual material and its relationship to a task description of the language
Phonetics: study of speech sounds and their production, transmission, and perception, and their analysis, classification, and transcription
– Articulatory/Acoustic/Auditory Phonetics
Phonology: systematic organization of sounds in languages, systems of phonemes in particular languages
Phonemes: smallest set of units considered to be the basic set of distinctive sounds of a language (20-60 units for most languages)

Applications of Speech Signal Processing
– Speech coding
– Speech synthesis
– Speech recognition and understanding
– Other speech applications

Speech Coding
The process of transforming a speech signal into a representation for efficient transmission and storage of speech
– narrowband and broadband wired telephony
– cellular communications
– Voice over IP (VoIP) to utilize the Internet as a real-time communications medium
– secure voice for privacy and encryption for national security applications
– extremely narrowband communications channels, e.g., battlefield applications using HF radio
– storage of speech for telephone answering machines, IVR systems, prerecorded messages

Speech Synthesis
The process of generating a speech signal using computational means for effective human-machine interactions
– machine reading of text or email messages
– telematics feedback in automobiles
– talking agents for automatic transactions
– automatic agent in customer care call center
– handheld devices such as foreign language phrasebooks, dictionaries, crossword puzzle helpers
– announcement machines that provide information such as stock quotes, airline schedules, weather reports, etc.

Speech Recognition and Understanding
The process of extracting usable linguistic information from a speech signal in support of human-machine communication by voice
– command and control (C&C) applications, e.g., simple commands for spreadsheets, presentation graphics, appliances
– voice dictation to create letters, memos, and other documents
– natural language voice dialogues with machines to enable help desks, call centers
– voice dialing for cellphones and from PDAs and other small devices
– agent services such as calendar entry and update, address list modification and entry, etc.

Pattern Matching Problems

Other Speech Applications
Speaker Verification
– for secure access to premises, information, virtual spaces
Speaker Recognition
– for legal and forensic purposes (national security); also for personalized services
Speech Enhancement
– for use in noisy environments, to eliminate echo, to align voices with video segments, to change voice qualities, to speed up or slow down prerecorded speech (e.g., talking books, rapid review of material, careful scrutinizing of spoken material, etc.)
– potentially to improve intelligibility and naturalness of speech
Language Translation
– to convert spoken words in one language to another to facilitate natural language dialogues between people speaking different languages, e.g., tourists, business people

History of Speech Signal Processing
Invention of the telephone, Bell, 1876
– "Watson, if I can get a mechanism which will make a current of electricity vary its intensity as the air varies in density when sound is passing through it, I can telegraph any sound, even the sound of speech"
VOCODER and VODER, Dudley
– VOCODER (VOice enCODER): a method of reproducing speech through electronic means, based on the source-filter model; parallel band-pass filters split speech into ten specific audio spectrum bands, rendering it more easily transmitted over telephone lines
– VODER (Voice Operation DEmonstratoR): a console from which an operator could create phrases of speech, controlling a VOCODER with a keyboard and foot pedals; shown at the 1939 World's Fair in NYC
Sound Spectrograph, Bell Labs, 1947
Pattern Playback, Haskins Lab, 1950
Digit Recognizer, Bell Labs, 1952
– Digit pattern: the idea was to track the first two formants.
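
The VOCODER entry above describes splitting speech into ten spectrum bands. As a rough illustration of that band-splitting idea only, the sketch below measures the energy in ten bands of a single frame. It is not Dudley's analog design: the band-pass filters are replaced by a naive DFT, the ten bands are assumed to be equal in width (the slides do not give band edges), and the function name `band_energies` and the 1 kHz test tone are hypothetical.

```python
import cmath
import math


def band_energies(frame, num_bands=10):
    """Return the spectral energy of `frame` in `num_bands` equal-width bands.

    Assumption: equal-width bands from 0 Hz up to the Nyquist frequency.
    """
    n = len(frame)
    half = n // 2  # keep only the bins below the Nyquist frequency

    # Naive DFT, O(n^2): acceptable for one short illustrative frame.
    power = []
    for k in range(half):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power.append(abs(coeff) ** 2)

    # Accumulate the power of each DFT bin into its band.
    bins_per_band = half / num_bands
    energies = [0.0] * num_bands
    for k, p in enumerate(power):
        band = min(int(k / bins_per_band), num_bands - 1)
        energies[band] += p
    return energies


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz, 200 samples long.
fs = 8000
frame = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(200)]

energies = band_energies(frame)
loudest = max(range(len(energies)), key=energies.__getitem__)
print("band energies:", [round(e, 3) for e in energies])
print("loudest band index:", loudest)
```

With these assumed settings each band is 400 Hz wide, so the 1 kHz tone falls in the band covering 800 Hz to 1200 Hz (index 2). A channel vocoder would transmit such slowly varying band levels instead of the waveform itself.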

1960-70's
Fant, "Acoustic Theory of Speech Production", 1970
Breakthroughs in DSP since the mid-1960s
– 1965: FFT
– 1968: Homomorphic Processing
– mid 1970's: Linear Prediction Analysis
– late 1970's: Vector Quantization
Pattern matching techniques
– 1970's: Dynamic Time Warping
Wide application of computers
DARPA started the Speech Understanding Research (SUR) program in the 1970's

Since 1980's
Speech coding
– 1980: LPC-10, 2.4 kbps
– 1988: FS-1016, 4.8 kbps
– 1990's: MBE, 2.4 kbps
– ITU-T G-series standards, model-based VOCODER
Speech synthesis
– 1980: Klatt cascade/parallel formant synthesizer
– Waveform concatenation: rule-based (TD-PSOLA); corpus-based (unit selection)
– HMM-based parametric speech synthesis

Klatt Synthesizer (block diagram; recoverable labels, translated): impulse train, KLATT source, L.F. source, spectral tilt correction, aspiration source, frication noise source, laryngeal source, cascade vocal tract, parallel vocal tract (normally not used), first to sixth resonators, nasal resonator, tracheal resonator, first-order difference filter, all-pass, speech output; control parameters such as F0, AV, OQ, FNP, FNZ, F1, B1.

Waveform Concatenation Synthesis, iFLYTEK
Year         1995   1998   1999   2001   2003
Naturalness  <3.0   3.0    3.5    3.8    4.3

Speech recognition
– HMM-based statistical pattern recognition framework
– Development of VLSI and computer technology
– Speech recognition systems
  1985: IBM "Tangora", isolated-word speech recognizer
  1990: Dragon Systems "DragonDictate", first large-vocabulary speech-to-text system for general-purpose dictation
  1990's: CMU "Sphinx", continuous-speech, speaker-independent recognition system
  1997: IBM "ViaVoice"
– September 1997: Chinese version of the ViaVoice speech recognition software released; speech technology research under way since the 1970s
– 2007-2010: telephone voice search, mobile Internet voice search and Google Voice Action released in succession
– April 2010: acquisition of the voice service provider Siri, with an announcement that intelligent voice services would be offered on the iPhone
– March 2007: acquisition of the voice search company TellMe for USD 800 million, increasing investment in speech technology
– October 2009: Microsoft released the Windows 7 operating system with integrated speech recognition

Google Duplex

What We Will Be Learning
– review of some basic DSP concepts
– speech production model: acoustics, articulatory concepts, speech production models
– speech perception model: ear models, auditory signal processing
– time domain processing concepts: speech properties, pitch, voiced-unvoiced, energy, autocorrelation, zero-crossing rates
– short time Fourier analysis methods: digital filter banks, spectrograms, analysis-synthesis systems, vocoders
– homomorphic speech processing: cepstrum, pitch detection, formant estimation, homomorphic vocoder
– linear predictive coding methods: autocorrelation method, covariance method, lattice methods, relation to vocal tract models
– speech waveform coding and source models: delta modulation, PCM, mu-law, ADPCM, vector quantization, multipulse coding, CELP coding
– methods for speech synthesis and text-to-speech systems: physical models, formant models, articulatory models, concatenative models
– methods for speech recognition: the Hidden Markov Model (HMM)
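
Two of the time-domain concepts named in the course outline, energy and zero-crossing rate, are simple enough to preview here. The following is a minimal sketch, assuming rectangular (unweighted) frames and synthetic sine tones in place of real speech; the function names, frame size, and hop size are illustrative choices, not taken from the slides.

```python
import math


def frames(signal, size, hop):
    """Yield successive frames of `size` samples, advancing by `hop` samples."""
    for start in range(0, len(signal) - size + 1, hop):
        yield signal[start:start + size]


def short_time_energy(frame):
    """Sum of squared samples in one frame (rectangular window assumed)."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Synthetic stand-ins for speech, sampled at an assumed 8 kHz:
# a strong 100 Hz tone (voiced-like) and a weak 3 kHz tone (unvoiced-like).
# The half-sample offset keeps samples away from exact zeros.
fs = 8000
low = [math.sin(2 * math.pi * 100 * (t + 0.5) / fs) for t in range(400)]
high = [0.1 * math.sin(2 * math.pi * 3000 * (t + 0.5) / fs) for t in range(400)]

for name, signal in (("low", low), ("high", high)):
    for i, frame in enumerate(frames(signal, size=200, hop=100)):
        print(name, i,
              round(short_time_energy(frame), 3),
              round(zero_crossing_rate(frame), 3))
```

In this toy setup the low-frequency tone shows high energy and a low zero-crossing rate, while the weak high-frequency tone shows the opposite. That contrast is the intuition behind using the two measures together for voiced-unvoiced decisions, although real speech needs more careful thresholds.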
