Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction : Riconoscimento audio-visuale della lingua tibetana basata su Deep Dynamic Bayesian Network per una interazione naturale con umani, in: International Journal of Advanced Robotic Systems

General information

Author: Zhao, Yue; Wang, Hui; Ji, Qiang
Published: InTech Open Access Publisher, 2012
Abstract: Audio-visual speech recognition is a natural and robust approach to improving human-robot interaction in noisy environments. Although multi-stream Dynamic Bayesian Networks and coupled HMMs are widely used for audio-visual speech recognition, they fail to learn the shared features between modalities and ignore the dependency of features among the frames within each discrete state. In this paper, we propose a Deep Dynamic Bayesian Network (DDBN) to perform unsupervised extraction of spatial-temporal multimodal features from Tibetan audio-visual speech data and build an accurate audio-visual speech recognition model under a no frame-independency assumption. The experimental results on Tibetan speech data from some real-world environments show that the proposed DDBN outperforms the state-of-the-art methods in word recognition accuracy.
Collections
Journal articles
2000 and later
Superordinate work
 
International Journal of Advanced Robotic Systems
Author: Ottaviano, Erika; Ceccarelli, Marco; Husty, Manfred; Yu, Sung-Hoon; Kim, Yong-Tae; Park, Chang-Woo; Hyun, Chang-Ho; Chen, Xiulong; Feng, Weiming; Sun, Xianyang; Gao, Qing; Grigorescu, Sorin M.; Pozna, Claudiu; Liu, Wanli; Zhankui, Wang; Guo, Meng; Fu, Guoyu; Zhang, Jin; Chen, Wenyuan; Peng, Fengchao; Yang, Pei; Chen, Chunlin; Ding, Rui; Yu, Junzhi; Yang, Qinghai; Tan, Min; Polden, Joseph; Pan, [...]
Published: 2004
Linked items
Documents: International Journal of Advanced Robotic Systems
Permanent links
DMG-Lib: https://www.dmg-lib.org/dmglib/handler?docum=32746009
Europeana: http://www.europeana.eu/portal/record/2020801/dmglib_handler_docum_32746009.html
PDF: Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction
Data provider
UCA (Univ. Cassino): http://webuser.unicas.it/weblarm/larmindex.htm
Administrative information
Time of publication 2012
License information Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
