Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction : Riconoscimento audio-visuale della lingua tibetana basata su Deep Dynamic Bayesian Network per una interazione naturale con umani, in: International Journal of Advanced Robotic Systems

General information

Author  Zhao, Yue; Wang, Hui; Ji, Qiang
Published  InTech Open Access Publisher, 2012
Edition
Extent
ISBN
Abstract  Audio-visual speech recognition is a natural and
robust approach to improving human-robot interaction in
noisy environments. Although multi-stream Dynamic
Bayesian Networks and coupled HMMs are widely used
for audio-visual speech recognition, they fail to learn the
features shared between modalities and ignore the
dependency of features among the frames within each
discrete state. In this paper, we propose a Deep Dynamic
Bayesian Network (DDBN) to perform unsupervised
extraction of spatial-temporal multimodal features from
Tibetan audio-visual speech data and to build an accurate
audio-visual speech recognition model that does not assume
frame independence. Experimental results on
Tibetan speech data from several real-world environments
show that the proposed DDBN outperforms state-of-the-art
methods in word recognition accuracy.
Collections
Journal articles
from 2000 onward
Parent works
International Journal of Advanced Robotic Systems
Author: Ottaviano, Erika; Ceccarelli, Marco; Husty, Manfred; Yu, Sung-Hoon; Kim, Yong-Tae; Park, Chang-Woo; Hyun, Chang-Ho; Chen, Xiulong; Feng, Weiming; Sun, Xianyang; Gao, Qing; Grigorescu, Sorin M.; Pozna, Claudiu; Liu, Wanli; Zhankui, Wang; Guo, Meng; Fu, Guoyu; Zhang, Jin; Chen, Wenyuan; Peng, Fengchao; Yang, Pei; Chen, Chunlin; Ding, Rui; Yu, Junzhi; Yang, Qinghai; Tan, Min; Polden, Joseph; Pan, [...]
Published: 2004
Linked records
Documents: International Journal of Advanced Robotic Systems
Permanent links
DMG-Lib  https://www.dmg-lib.org/dmglib/handler?docum=32746009
Europeana  http://www.europeana.eu/portal/record/2020801/dmglib_handler_docum_32746009.html
PDF  Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction
Data provider
UCA  Univ. Cassino  http://webuser.unicas.it/weblarm/larmindex.htm
Administrative information
Publication date  2012
License information  Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
