Semi-supervised training of acoustic models leveraging knowledge transferred from out-of-domain data

Tien Hong Lo, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Lattice-free maximum mutual information (LF-MMI), a recently proposed objective function for discriminative acoustic model training, has achieved state-of-the-art results in automatic speech recognition (ASR). Although LF-MMI performs well across a wide range of ASR tasks under supervised training, little work has investigated its effectiveness in unsupervised or semi-supervised training. Semi-supervised training (or self-training) of acoustic models, in turn, suffers from the difficulty of estimating a good model when only a limited amount of correctly transcribed data is available. It is also generally acknowledged that discriminative training is sensitive to the correctness of the speech transcripts used for training. In view of the above, this paper explores two novel extensions to LF-MMI. The first distills knowledge (acoustic training statistics) from a large amount of out-of-domain data to better estimate the seed models used in semi-supervised training. The second selects the untranscribed target-domain data used for semi-supervised training more effectively. A series of experiments on the AMI benchmark corpus demonstrates that the gains from these two extensions are pronounced and additive, confirming their effectiveness and viability.
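As a rough illustration of the data-selection idea mentioned in the abstract, the sketch below keeps only those automatically transcribed utterances whose confidence score clears a threshold before they are added to the training set. This is a generic self-training filter, not the selection criterion used in the paper, which the abstract does not specify. The `Hypothesis` type, the `select_for_self_training` function, the 0.8 threshold, and the sample utterances are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One automatically transcribed utterance (hypothetical structure)."""
    utt_id: str
    text: str          # 1-best transcript produced by the seed model
    confidence: float  # utterance-level confidence, assumed to lie in [0, 1]


def select_for_self_training(hyps: List[Hypothesis],
                             threshold: float = 0.8) -> List[Hypothesis]:
    """Keep utterances whose confidence is at least `threshold`."""
    return [h for h in hyps if h.confidence >= threshold]


# Made-up decoding output for three untranscribed target-domain utterances.
decoded = [
    Hypothesis("utt1", "okay let's start the meeting", 0.93),
    Hypothesis("utt2", "um the the remote", 0.41),
    Hypothesis("utt3", "we need a new design", 0.80),
]

selected = select_for_self_training(decoded, threshold=0.8)
selected_ids = [h.utt_id for h in selected]
print(selected_ids)
```

A stricter threshold trades training-set size for transcript quality, which matters here because discriminative training is sensitive to transcript errors.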

Original language: English
Title of host publication: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1400-1404
Number of pages: 5
ISBN (Electronic): 9781728132488
DOIs
Publication status: Published - Nov 2019
Event: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 - Lanzhou, China
Duration: 18 Nov 2019 to 21 Nov 2019

Publication series

Name: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019

Conference

Conference: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Country/Territory: China
City: Lanzhou
Period: 2019/11/18 to 2019/11/21

ASJC Scopus subject areas

  • Information Systems
