
Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction : Riconoscimento audio-visuale della lingua tibetana basata su Deep Dynamic Bayesian Network per una interazione naturale con umani, in: International Journal of Advanced Robotic Systems


General information

Author Zhao, Yue; Wang, Hui; Ji, Qiang
Published  InTech Open Access Publisher, 2012
Abstract Audio‐visual speech recognition is a natural and
robust approach to improving human‐robot interaction in
noisy environments. Although multi‐stream Dynamic
Bayesian Network and coupled HMM are widely used
for audio‐visual speech recognition, they fail to learn the
shared features between modalities and ignore the
dependency of features among the frames within each
discrete state. In this paper, we propose a Deep Dynamic
Bayesian Network (DDBN) to perform unsupervised
extraction of spatial‐temporal multimodal features from
Tibetan audio‐visual speech data and build an accurate
audio‐visual speech recognition model under a no frame‐independency
assumption. The experimental results on
Tibetan speech data from some real‐world environments
showed that the proposed DDBN outperforms the state‐of‐the‐art
methods in word recognition accuracy.
Collections
Journal articles
2000 and later
Superordinate work
 
International Journal of Advanced Robotic Systems
Author: Ottaviano, Erika; Ceccarelli, Marco; Husty, Manfred; Yu, Sung-Hoon; Kim, Yong-Tae; Park, Chang-Woo; Hyun, Chang-Ho; Chen, Xiulong; Feng, Weiming; Sun, Xianyang; Gao, Qing; Grigorescu, Sorin M.; Pozna, Claudiu; Liu, Wanli; Zhankui, Wang; Guo, Meng; Fu, Guoyu; Zhang, Jin; Chen, Wenyuan; Peng, Fengchao; Yang, Pei; Chen, Chunlin; Ding, Rui; Yu, Junzhi; Yang, Qinghai; Tan, Min; Polden, Joseph; Pan, [...]
Published: 2004
Linked items
Documents: International Journal of Advanced Robotic Systems
Permanent links
DMG-Lib: https://www.dmg-lib.org/dmglib/handler?docum=32746009
Europeana: http://www.europeana.eu/portal/record/2020801/dmglib_handler_docum_32746009.html
PDF: Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction
Data provider
UCA Univ. Cassino  http://webuser.unicas.it/weblarm/larmindex.htm
Administrative information
Time of publication 2012
License information This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
