Training data selection for improving discriminative training of acoustic models

Shih Hung Liu*, Fang Hui Chu, Shih Hsiang Lin, Hung Shin Lee, Berlin Chen

*Corresponding author for this work

Research output: Conference contribution, conference paper, peer-reviewed

13 citations (Scopus)

Abstract

This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches are proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance is used for utterance-level data selection. Second, phone-level data selection is investigated, based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice. Finally, frame-level data selection is explored, based on the normalized frame-level entropy of the Gaussian posterior probabilities obtained from the word lattice. The underlying characteristics of the presented approaches were extensively investigated, and their performance was verified by comparison with standard discriminative training approaches. Experiments conducted on Mandarin broadcast news collected in Taiwan showed that both phone-level and frame-level data selection achieved slight but consistent improvements over the baseline systems at lower numbers of training iterations.
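The frame-level criterion in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption: the entropy is normalized by dividing by log(N) for N Gaussians, the frames kept are those at or above a threshold, and the names `normalized_entropy`, `select_frames`, and `threshold` are hypothetical. The paper's actual definition and selection rule may differ.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of a posterior distribution divided by log(N), so it lies in [0, 1].

    Assumption: normalization by log(N); the paper may normalize differently.
    """
    n = len(posteriors)
    total = sum(posteriors)
    if n < 2 or total <= 0.0:
        return 0.0
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize in case the values do not sum to one
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is at least `threshold`.

    Assumption: high-entropy (more confusable) frames are the ones kept;
    the threshold value is purely illustrative.
    """
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]


if __name__ == "__main__":
    # Toy posteriors over two Gaussians for three frames.
    frames = [[0.5, 0.5], [1.0, 0.0], [0.9, 0.1]]
    print(select_frames(frames, threshold=0.5))
```

In this toy input only the first frame, whose posterior mass is spread evenly, passes the threshold; the two confidently assigned frames are dropped.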

Original language: English
Pages: 284-289
Number of pages: 6
Publication status: Published - 2007
Event: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007 - Kyoto, Japan
Duration: 9 Dec 2007 to 13 Dec 2007

Other

Other: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007
Country/Territory: Japan
City: Kyoto
Period: 2007/12/09 to 2007/12/13

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Software
  • Artificial Intelligence
