Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction, in: International Journal of Advanced Robotic Systems

General information

Author  Zhao, Yue; Wang, Hui; Ji, Qiang
Published  InTech Open Access Publisher, 2012
Abstract  Audio‐visual speech recognition is a natural and robust approach to improving human‐robot interaction in noisy environments. Although multi‐stream Dynamic Bayesian Network and coupled HMM are widely used for audio‐visual speech recognition, they fail to learn the shared features between modalities and ignore the dependency of features among the frames within each discrete state. In this paper, we propose a Deep Dynamic Bayesian Network (DDBN) to perform unsupervised extraction of spatial‐temporal multimodal features from Tibetan audio‐visual speech data and build an accurate audio‐visual speech recognition model under a no frame‐independency assumption. The experiment results on Tibetan speech data from some real‐world environments showed the proposed DDBN outperforms the state‐of‐art methods in word recognition accuracy.
Collections
Journal articles
2000 and later
Superordinate work
 
International Journal of Advanced Robotic Systems
Author: Ottaviano, Erika; Ceccarelli, Marco; Husty, Manfred; Yu, Sung-Hoon; Kim, Yong-Tae; Park, Chang-Woo; Hyun, Chang-Ho; Chen, Xiulong; Feng, Weiming; Sun, Xianyang; Gao, Qing; Grigorescu, Sorin M.; Pozna, Claudiu; Liu, Wanli; Zhankui, Wang; Guo, Meng; Fu, Guoyu; Zhang, Jin; Chen, Wenyuan; Peng, Fengchao; Yang, Pei; Chen, Chunlin; Ding, Rui; Yu, Junzhi; Yang, Qinghai; Tan, Min; Polden, Joseph; Pan, [...]
Published: 2004
Linked items
Documents: International Journal of Advanced Robotic Systems
Permanent links
DMG-Lib  https://www.dmg-lib.org/dmglib/handler?docum=32746009
Europeana  http://www.europeana.eu/portal/record/2020801/dmglib_handler_docum_32746009.html
PDF  Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction
Data provider
UCA, Univ. Cassino  http://webuser.unicas.it/weblarm/larmindex.htm
Administrative information
Time of publication 2012
License information Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
