Training data selection for improving discriminative training of acoustic models

Berlin Chen*, Shih Hung Liu, Fang Hui Chu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

This paper considers training data selection for discriminative training of acoustic models for large vocabulary continuous speech recognition (LVCSR). Three novel data selection approaches are proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance is utilized for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice is investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice is explored. The underlying characteristics of the presented approaches are extensively investigated, and their performance is verified by comparison with standard discriminative training approaches. Experiments conducted on a broadcast news speech transcription task show that, with the aid of phone- and frame-level data selection, the turnaround time for acoustic model training can be cut by more than half while still obtaining a comparably good set of discriminative acoustic models.
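
The frame-level criterion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, the threshold, or the direction of selection, so everything below is an assumption made for illustration: the entropy is normalized by log(N) for N competing Gaussians, frames whose normalized entropy is at least a hypothetical `threshold` are kept, and the names `normalized_entropy` and `select_frames` are invented here rather than taken from the paper.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of a posterior distribution divided by log(N), so the result lies in [0, 1].

    `posteriors` holds the posterior probabilities of the N Gaussians competing
    at one frame (assumed here to come from the word lattice).
    """
    n = len(posteriors)
    if n < 2:
        # A single candidate carries no uncertainty.
        return 0.0
    total = sum(posteriors)
    if total <= 0.0:
        raise ValueError("posteriors must have a positive sum")
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize in case the values do not sum to 1
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.05):
    """Return indices of frames whose normalized entropy is at least `threshold`.

    Assumption for this sketch: frames dominated by one Gaussian (entropy near 0)
    are treated as uninformative and skipped. The threshold value is hypothetical.
    """
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]


if __name__ == "__main__":
    frames = [
        [1.0, 0.0, 0.0],    # one Gaussian dominates: entropy 0
        [0.5, 0.5],         # maximally confusable: normalized entropy 1
        [0.9, 0.05, 0.05],  # mostly confident, some confusion
    ]
    print(select_frames(frames))
```

Under these assumptions the first frame would be dropped and the other two retained; a different threshold or selection direction would change which frames take part in training.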

Original language: English
Pages (from-to): 1228-1235
Number of pages: 8
Journal: Pattern Recognition Letters
Volume: 30
Issue number: 13
DOIs
Publication status: Published - 1 Oct 2009

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
