Innovative Pretrained-based Reranking Language Models for N-best Speech Recognition Lists

Shih Hsuan Chiu, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes two BERT-based models for accurately rescoring (reranking) N-best speech recognition hypothesis lists. Reranking the N-best hypothesis lists decoded from the acoustic model has been proven to improve performance in two-stage automatic speech recognition (ASR) systems. Although pre-trained contextualized language models have achieved state-of-the-art performance in many NLP applications, there is a dearth of work investigating their effectiveness in ASR. In this paper, we develop simple yet effective methods for improving ASR by reranking the N-best hypothesis lists leveraging BERT (bidirectional encoder representations from Transformers). Specifically, we treat reranking N-best hypotheses as a downstream task by simply fine-tuning the pre-trained BERT. We propose two BERT-based reranking language models: (1) uniBERT, which elicits an ideal unigram from a given N-best list, taking advantage of BERT to assist an LSTM-based language model (LSTMLM); and (2) classBERT, which treats N-best list reranking as a multi-class classification problem. These models attempt to harness the power of BERT to rerank the N-best hypothesis lists generated in the initial ASR pass. Experiments on the benchmark AMI dataset show that the proposed reranking methods outperform the baseline LSTMLM, a strong and widely used competitor, with a 3.14% improvement in word error rate (WER).
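The two-stage setup described above can be illustrated with a minimal sketch: a first pass produces an N-best list with decoder scores, and a second pass reranks it by interpolating those scores with language-model scores. This is a generic illustration only, not the paper's uniBERT or classBERT models; the hypotheses, scores, and the 0.3 interpolation weight below are invented for the example, and in the paper's setting a fine-tuned BERT would supply the LM scores.

```python
# Second-pass N-best reranking sketch: combine each hypothesis's first-pass
# (acoustic/decoder) score with a language-model score via linear
# interpolation and keep the best-scoring hypothesis. Higher scores are
# better (e.g. log-probabilities). All numbers here are toy values.

def rerank_nbest(hypotheses, lm_weight=0.3):
    """hypotheses: list of (text, first_pass_score, lm_score) tuples."""
    def combined(h):
        _, first_pass, lm = h
        return (1.0 - lm_weight) * first_pass + lm_weight * lm
    # Return the text of the hypothesis with the highest combined score.
    return max(hypotheses, key=combined)[0]

nbest = [
    ("i saw the sea",   -4.1, -6.0),
    ("eye saw the sea", -4.0, -9.5),  # best acoustically, worst under the LM
    ("i saw the see",   -4.3, -8.0),
]
best = rerank_nbest(nbest)  # the LM score flips the ranking here
```

With `lm_weight=0.0` the reranker reduces to the first-pass decision; a nonzero weight lets the language model overturn acoustically favored but implausible hypotheses.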

Translated title of the contribution: Innovative Pretrained-based Reranking Language Models for N-best Speech Recognition Lists
Original language: Traditional Chinese
Title of host publication: ROCLING 2020 - 32nd Conference on Computational Linguistics and Speech Processing
Editors: Jenq-Haur Wang, Ying-Hui Lai, Lung-Hao Lee, Kuan-Yu Chen, Hung-Yi Lee, Chi-Chun Lee, Syu-Siang Wang, Hen-Hsen Huang, Chuan-Ming Liu
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 148-162
Number of pages: 15
ISBN (electronic): 9789869576932
Publication status: Published - 2020
Event: 32nd Conference on Computational Linguistics and Speech Processing, ROCLING 2020 - Taipei, Taiwan
Duration: 24 Sep 2020 → 26 Sep 2020

Publication series

Name: ROCLING 2020 - 32nd Conference on Computational Linguistics and Speech Processing

Conference

Conference: 32nd Conference on Computational Linguistics and Speech Processing, ROCLING 2020
Country/Territory: Taiwan
City: Taipei
Period: 2020/09/24 → 2020/09/26

Keywords

  • Automatic Speech Recognition
  • BERT
  • Language Models
  • N-best Lists Reranking

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
