Lightly supervised and data-driven approaches to Mandarin broadcast news transcription

Berlin Chen*, Jen Wei Kuo, Wen Hung Tsai

*Corresponding author for this work

Research output: Contribution to journal, conference article, peer-reviewed

37 citations (Scopus)

Abstract

This paper investigates the use of several lightly supervised and data-driven approaches to Mandarin broadcast news transcription. First, with a consideration of the special structural properties of the Chinese language, a fast acoustic look-ahead technique for estimating the unexplored part of speech utterance was integrated into the lexical tree search to improve the search efficiency, in conjunction with the conventional language model look-ahead technique. Then, a verification-based method for automatic acoustic training data acquisition was developed to make use of the large amount of untranscribed speech data. Finally, two alternative strategies for language model adaptation were further studied for accurate language model estimation. With the above approaches, the system yielded an 11.94% character error rate on the Mandarin broadcast news collected in Taiwan.

Original language: English
Pages (from-to): I777-I780
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
Publication status: Published - 2004
Event: Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que, Canada
Duration: 17 May 2004 to 21 May 2004

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
