Semi-supervised training of acoustic models leveraging knowledge transferred from out-of-domain data

Tien Hong Lo, Berlin Chen

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

More recently, a novel objective function of discriminative acoustic model training, namely lattice-free MMI (LF-MMI), has been proposed and achieved the new state-of-the-art in automatic speech recognition (ASR). Although LF-MMI shows excellent performance in a wide array of ASR tasks with supervised training settings, there is a dearth of work on investigating its effectiveness in the scenario of unsupervised or semi-supervised training. On the other hand, semi-supervised (or self-training) of acoustic model suffers from the problem that it is hard to estimate a good model when only a limited amount of correctly transcribed data is made available. It is also generally acknowledged that the performance of discriminative training is vulnerable to correctness of speech transcripts employed for training. In view of the above, this paper explores two novel extensions to LF-MMI. The first one is to distill knowledge (acoustic training statistics) from a large amount of out-of-domain data to better estimate the seed models for use in semi-supervised training. The second one is to make effective selection of the untranscribed target domain data for semi-supervised training. A series of experiments conducted on the AMI benchmark corpus demonstrate the gains from these two extensions are pronounced and additive, which also reveals their effectiveness and viability.

Original languageEnglish
Title of host publication2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1400-1404
Number of pages5
ISBN (Electronic)9781728132488
DOIs
Publication statusPublished - 2019 Nov
Externally publishedYes
Event2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 - Lanzhou, China
Duration: 2019 Nov 182019 Nov 21

Publication series

Name2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019

Conference

Conference2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
CountryChina
CityLanzhou
Period19/11/1819/11/21

ASJC Scopus subject areas

  • Information Systems

Fingerprint Dive into the research topics of 'Semi-supervised training of acoustic models leveraging knowledge transferred from out-of-domain data'. Together they form a unique fingerprint.

  • Cite this

    Lo, T. H., & Chen, B. (2019). Semi-supervised training of acoustic models leveraging knowledge transferred from out-of-domain data. In 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 (pp. 1400-1404). [9023040] (2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/APSIPAASC47483.2019.9023040