Chapter 1  Introduction to Speech Signal Processing (语音信号处理概述)

Outline
  The Speech Signal
  Speech Signal Processing
  Speech Production/Perception Model and the Speech Chain
  The Speech Stack
  Applications of Speech Signal Processing
  History of Speech Signal Processing

The Speech Signal
  Speech (语音) is the vocalized (有声的) form of human communication.
  The fundamental purpose of speech is human communication, i.e., the transmission of messages (信息) between a speaker and a listener.
  The fundamental analog form of the message is an acoustic waveform (声学波形) that we call the speech signal (语音信号).
  Speech signals can be
    – converted to an electrical waveform by a microphone
    – manipulated by analog/digital signal processing
    – converted back to acoustic form by a loudspeaker/headphone

Software
  Praat – http://www.fon.hum.uva.nl/praat/
  CoolEdit Pro (Adobe Audition)

Speech Signal Processing
  Speech signal processing (语音信号处理)
    – converting one type of speech signal representation to another so as to uncover various mathematical or practical properties of the speech signal (发掘语音特征), and doing appropriate processing to aid in solving both fundamental and deep problems of interest (解决实际问题)
  Purpose of speech signal processing
    – to understand speech as a means of communication
    – to represent speech for transmission and reproduction
    – to analyze speech for automatic recognition and extraction of information
    – to discover some physiological characteristics of the talker
  Digital processing of speech signals (数字语音信号处理, DPSS)
    – obtaining discrete representations of the speech signal that preserve its information content and are convenient for transmission or storage
    – theory, design, and implementation of numerical procedures (algorithms) for processing the discrete representation in order to achieve a goal (recognizing the signal, modifying the time scale of the signal, removing background noise from the signal, etc.)
  Advantages of DPSS
    – reliability
    – flexibility
    – accuracy
    – real-time implementations on inexpensive DSP chips
    – ability to integrate with multimedia and data
    – encryptability/security of the data and the data representations via suitable techniques

Speech Production Model
  Message Formulation (信息形成)
    – desire to communicate an idea, a wish, a request, …; express the message as a sequence of words
  Language Code (语言编码)
    – need to convert the chosen text string to a sequence of sounds in the language that can be understood by others
    – need to give some form of emphasis and prosody (tune, melody) to the spoken sounds so as to impart non-speech information such as sense of urgency, importance, psychological state of the talker, and environmental factors (noise, echo)
  Neuro-Muscular Controls (神经-肌肉控制)
    – need to direct the neuro-muscular system to move the articulators (发音器官) (tongue, lips, teeth, jaws, velum (软腭)) so as to produce the desired spoken message in the desired manner
  Vocal Tract (声道) System
    – need to shape the human vocal tract system and provide the appropriate sound sources to create an acoustic waveform (speech) that is understandable in the environment in which it is spoken

Speech Perception Model
  The acoustic waveform impinges (冲击) on the ear (the basilar membrane (基底膜)) and is spectrally analyzed by an equivalent filter bank (滤波器组) of the ear.
  The signal from the basilar membrane is neurally transduced and coded into features that can be decoded by the brain.
  The brain decodes the feature stream into sounds, words, and sentences.
  The brain determines the meaning of the words via a message understanding mechanism.

The Speech Chain
  Goal: find out if your officemate has had lunch
  Text: "Did you eat yet"
  Phonemes: "didyuityt"
  Articulator dynamics: dIjitjt

Information Rate of Speech
  Text (discrete)
    – 2^5 symbols, 10 symbols/s -> 50 bps
  Phonemes & prosody (discrete)
    – 200 bps
  Articulatory motions (continuous)
    – relatively slow movement of articulators, ~2000 bps
  Acoustic waveform (continuous)
    – 64,000 bps to 705,600 bps

The Speech Stack (figure)

Speech Science (语音科学)
  Linguistics (语言学): science of language, including syntax, semantics, phonetics, phonology, etc.
  Syntax (句法, 语法): analysis and description of the grammatical structure of a body of textual material
  Semantics (语义学): analysis and description of the meaning of a body of textual material and its relationship to a task description of the language
  Phonetics (语音学): study of speech sounds and their production, transmission, and perception, and their analysis, classification, and transcription
    – articulatory/acoustic/auditory phonetics
  Phonology (音系学): systematic organization of sounds in languages; systems of phonemes in particular languages
  Phonemes (音位, 音素): smallest set of units considered to be the basic set of distinctive sounds of a language (20-60 units for most languages)

Applications of Speech Signal Processing
  Speech coding (语音编码)
  Speech synthesis (语音合成)
  Speech recognition and understanding (语音识别与理解)
  Other speech applications

Speech Coding
  The process of transforming a speech signal into a representation for efficient transmission and storage of speech
    – narrowband and broadband wired telephony
    – cellular communications
    – Voice over IP (VoIP) to utilize the Internet as a real-time communications medium
    – secure voice for privacy and encryption for national security applications
    – extremely narrowband communications channels, e.g., battlefield applications using HF radio
    – storage of speech for telephone answering machines, IVR systems, prerecorded messages

Speech Synthesis
  The process of generating a speech signal using computational means for effective human-machine interactions
    – machine reading of text or email messages
    – telematics feedback in automobiles
    – talking agents for automatic transactions
    – automatic agent in customer care call center
    – handheld devices such as foreign language phrasebooks, dictionaries, crossword puzzle helpers
    – announcement machines that provide information such as stock quotes, airline schedules, weather reports, etc.

Speech Recognition and Understanding
  The process of extracting usable linguistic information from a speech signal in support of human-machine communication by voice
    – command and control (C&C) applications, e.g., simple commands for spreadsheets, presentation graphics, appliances
    – voice dictation to create letters, memos, and other documents
    – natural language voice dialogues with machines to enable help desks, call centers
    – voice dialing for cell phones and from PDAs and other small devices
    – agent services such as calendar entry and update, address list modification and entry, etc.

Pattern Matching Problems (figure)

Other Speech Applications
  Speaker Verification (话者确认)
    – for secure access to premises, information, virtual spaces
  Speaker Recognition (话者识别)
    – for legal and forensic purposes (national security); also for personalized services
  Speech Enhancement (语音增强)
    – for use in noisy environments, to eliminate echo, to align voices with video segments, to change voice qualities, to speed up or slow down prerecorded speech (e.g., talking books, rapid review of material, careful scrutinizing of spoken material, etc.)
    – potentially to improve intelligibility and naturalness of speech
  Language Translation (语言翻译)
    – to convert spoken words in one language to another to facilitate natural language dialogues between people speaking different languages, e.g., tourists, business people

History of Speech Signal Processing
  Invention of the telephone, Bell, 1876
    – "Watson, if I can get a mechanism which will make a current of electricity vary its intensity as the air varies in density when sound is passing through it, I can telegraph any sound, even the sound of speech"
  VOCODER and VODER, Dudley
    – VOCODER (VOice enCODER, 声码器): a method of reproducing speech through electronic means
        source-filter model
        uses parallel band-pass filters to split speech into ten specific audio spectrum bands, rendering it more easily transmitted over telephone lines
    – VODER (Voice Operation DEmonstratoR): a console from which an operator could create phrases of speech, controlling a VOCODER with a keyboard and foot pedals (踏板); 1939 World's Fair in NYC
  Sound Spectrograph (语谱仪), Bell Labs, 1947
  Pattern Playback, Haskins Lab, 1950
  Digit Recognizer, Bell Labs, 1952
    – Digit pattern: the idea was to track the first two formants.
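
The ten-band analysis attributed to the VOCODER above can be illustrated with a rough digital stand-in. The sketch below is not Dudley's design, which used analog band-pass filters: it simply sums DFT power in ten equal-width bands. The function name `band_energies`, the 8 kHz sampling rate, the 256-sample frame, and the 3 kHz upper band edge are all assumptions chosen for illustration; the slides give only the number of bands.

```python
import cmath
import math


def band_energies(frame, sample_rate_hz, num_bands=10, max_freq_hz=3000.0):
    """Sum DFT power of one frame in equal-width bands from 0 to max_freq_hz.

    Hypothetical helper: the band layout is an assumption, not the
    historical VOCODER filter design.
    """
    n = len(frame)
    band_width = max_freq_hz / num_bands
    energies = [0.0] * num_bands
    for k in range(n // 2 + 1):
        freq = k * sample_rate_hz / n          # centre frequency of DFT bin k
        if freq >= max_freq_hz:
            break                              # ignore bins above the top band
        # Direct (slow) DFT of bin k; adequate for a short illustrative frame.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        energies[int(freq // band_width)] += abs(coeff) ** 2
    return energies


# Test signal (assumed): a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(256)]

energies = band_energies(frame, sample_rate)
loudest_band = max(range(len(energies)), key=energies.__getitem__)
print(loudest_band)
```

With these assumed settings the 1 kHz test tone falls in band index 3 (900 to 1200 Hz). A channel vocoder transmits slowly varying band levels of this kind rather than the waveform itself, which is why the representation needs less bandwidth.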

1960s-70s
  Fant, "Acoustic Theory of Speech Production", 1970
  Breakthroughs in DSP since the mid-1960s
    – 1965 FFT
    – 1968 Homomorphic Processing (同态处理)
    – mid-1970s Linear Prediction Analysis (线性预测分析)
    – late 1970s Vector Quantization (矢量量化)
  Pattern matching techniques
    – 1970s Dynamic Time Warping (动态时间规整)
  Widespread application of computers
  DARPA started the Speech Understanding Research (SUR) program in the 1970s

Since the 1980s
  Speech coding
    – 1980 LPC-10, 2.4 kbps
    – 1988 FS-1016, 4.8 kbps
    – 1990s MBE, 2.4 kbps
    – ITU-T G-series standards, model-based VOCODER
  Speech synthesis
    – 1980 Klatt cascade/parallel formant synthesizer
    – Waveform concatenation
        rule-based, TD-PSOLA
        corpus-based, unit selection
    – HMM-based parametric speech synthesis

Klatt Synthesizer (figure: block diagram with an impulse train, KLATT and L.F. voicing sources, aspiration and frication noise sources, cascade and parallel vocal-tract branches of formant resonators one to six, nasal and tracheal resonators, a first-order difference filter, and the speech output; the parallel branch for the voicing source is marked as normally unused)

Waveform Concatenation Synthesis - iFLYTEK
  Year          1995    1998    1999    2001    2003
  Naturalness   <3.0    3.0     3.5     3.8     4.3

Since the 1980s
  Speech recognition
    – HMM-based statistical pattern recognition framework
    – Development of VLSI and computer technology
    – Speech recognition systems
        1985 IBM "Tangora", isolated-word speech recognizer
        1990 Dragon Systems "DragonDictate", first large-vocabulary speech-to-text system for general-purpose dictation
        1990s CMU "Sphinx", continuous-speech, speaker-independent recognition system
        1997 IBM "ViaVoice"
    – September 1997: Chinese edition of the ViaVoice speech recognition software released; speech technology research under way since the 1970s
    – 2007-2010: telephone voice search, mobile Internet voice search, and Google Voice Action released in succession
    – April 2010: acquisition of the voice service provider Siri, with intelligent voice services announced for the iPhone
    – March 2007: acquisition of the voice search company TellMe for USD 800 million, increasing investment in speech technology
    – October 2009: Microsoft releases the Windows 7 operating system with integrated speech recognition

Google Duplex

What We Will Be Learning
  – review of some basic DSP concepts
  – speech production model: acoustics, articulatory concepts, speech production models
  – speech perception model: ear models, auditory signal processing
  – time domain processing concepts: speech properties, pitch, voiced-unvoiced, energy, autocorrelation, zero-crossing rates
  – short-time Fourier analysis methods: digital filter banks, spectrograms, analysis-synthesis systems, vocoders
  – homomorphic speech processing: cepstrum, pitch detection, formant estimation, homomorphic vocoder
  – linear predictive coding methods: autocorrelation method, covariance method, lattice methods, relation to vocal tract models
  – speech waveform coding and source models: delta modulation, PCM, mu-law, ADPCM, vector quantization, multipulse coding, CELP coding
  – methods for speech synthesis and text-to-speech systems: physical models, formant models, articulatory models, concatenative models
  – methods for speech recognition: the Hidden Markov Model (HMM)
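
The bit rates quoted on the Information Rate of Speech slide can be checked with simple arithmetic, which also connects them to the PCM and mu-law entries in the list above. The sketch below assumes equiprobable text symbols. It further assumes that 64,000 bps corresponds to 8 kHz sampling at 8 bits per sample and that 705,600 bps corresponds to 44.1 kHz sampling at 16 bits per sample, both single channel. The slides state only the totals, so these parameter choices and the helper names are illustrative.

```python
import math


def discrete_rate_bps(alphabet_size, symbols_per_second):
    """Bit rate of a stream of equiprobable symbols (assumed uniform source)."""
    return math.log2(alphabet_size) * symbols_per_second


def pcm_rate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


# Text: 2^5 symbols at 10 symbols/s, as on the slide.
text_rate = discrete_rate_bps(2 ** 5, 10)

# Assumed sampling parameters that reproduce the slide's waveform totals.
telephone_rate = pcm_rate_bps(8_000, 8)     # narrowband, 8-bit samples
wideband_rate = pcm_rate_bps(44_100, 16)    # 44.1 kHz, 16-bit, one channel

print(text_rate, telephone_rate, wideband_rate)
# Ratio between the waveform rate and the underlying text rate.
print(telephone_rate / text_rate)
```

Under these assumptions the three rates come out to 50, 64,000, and 705,600 bps, so even the lower waveform rate is more than a thousand times the text rate. That gap is the motivation for the speech coding methods covered later in the course.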
