DANCER: Entity Description Augmented Named Entity Corrector for Automatic Speech Recognition

Yi Cheng Wang*, Hsin Wei Wang, Bi Cheng Yan, Chi Han Lin, Berlin Chen*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

End-to-end automatic speech recognition (E2E ASR) systems often suffer from mistranscription of domain-specific phrases, such as named entities, sometimes leading to catastrophic failures in downstream tasks. A family of fast and lightweight named entity correction (NEC) models for ASR has recently been proposed; these models normally build on phonetic-level edit distance algorithms and have shown impressive NEC performance. However, as the named entity (NE) list grows, the problems of phonetic confusion in the NE list are exacerbated; for example, homophone ambiguities increase substantially. In view of this, we propose a novel Description Augmented Named entity CorrEctoR (dubbed DANCER), which leverages entity descriptions to provide additional information that helps mitigate phonetic confusion for NEC on ASR transcriptions. To this end, an efficient entity description augmented masked language model (EDA-MLM) comprising a dense retrieval model is introduced, enabling the MLM to adapt swiftly to domain-specific entities for the NEC task. A series of experiments conducted on the AISHELL-1 and Homophone datasets confirms the effectiveness of our modeling approach. DANCER outperforms a strong baseline, the phonetic edit-distance-based NEC model (PED-NEC), by a relative character error rate (CER) reduction of about 7% on AISHELL-1 for named entities. More notably, when tested on Homophone, which contains named entities of high phonetic confusion, DANCER offers a more pronounced relative CER reduction of 46% over PED-NEC for named entities. The code is available at https://github.com/Amiannn/Dancer.
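The homophone problem described in the abstract can be illustrated with a minimal sketch of phonetic edit-distance matching. This is not the authors' PED-NEC or DANCER implementation: the entity names, the hand-written toneless pinyin sequences, and the helper functions `edit_distance` and `nearest_entities` are all hypothetical and chosen only for illustration. The sketch shows that two entities with identical pronunciations tie under a purely phonetic score, which is the ambiguity that extra information such as entity descriptions is meant to resolve.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme (here: syllable) sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_entities(
    span: Sequence[str], entity_phonemes: Dict[str, List[str]]
) -> Tuple[int, List[str]]:
    """Return the minimum distance and every entity that attains it."""
    scored = {name: edit_distance(span, ph) for name, ph in entity_phonemes.items()}
    best = min(scored.values())
    return best, sorted(name for name, d in scored.items() if d == best)


# Hypothetical toy entity list (assumption: toneless pinyin written by hand).
ENTITIES = {
    "entity_A": ["zhang", "wei"],
    "entity_B": ["zhang", "wei"],  # homophone of entity_A
    "entity_C": ["li", "na"],
}

# A misrecognized span from a hypothetical ASR transcript.
best, candidates = nearest_entities(["zhang", "wai"], ENTITIES)
print(best, candidates)
```

Here the phonetic score alone cannot choose between `entity_A` and `entity_B`, so a corrector of this kind would need an additional signal, for example context matched against entity descriptions, to break the tie.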

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 4333-4342
Number of pages: 10
ISBN (electronic): 9782493814104
Publication status: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 2024/05/20 to 2024/05/25

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Computer Science Applications
