Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection

Tzu Ting Yang*, Hsin Wei Wang, Yi Cheng Wang, Berlin Chen

*Corresponding author for this work

Research output: Chapter in Book/Report, Conference contribution

Abstract

Code-switching, where multilingual speakers alternate between languages during a conversation, still poses significant challenges to end-to-end (E2E) automatic speech recognition (ASR) systems because of both acoustic and semantic confusion. ASR systems struggle to handle the rapid alternation of languages, which often leads to significant performance degradation. Our main contributions are at least threefold. First, we incorporate language identification (LID) information into several intermediate layers of the encoder, aiming to enrich the output embeddings with more detailed language information. Second, through a novel application of language boundary alignment loss, the subsequent ASR modules can make more effective use of the internal language posteriors. Third, we explore the feasibility of using language posteriors to facilitate deep interaction between the shared encoder and the language-specific encoders. Comprehensive experiments on the SEAME corpus verify that the proposed method outperforms the prior-art method, disentangle-based mixture-of-experts (D-MoE), further enhancing the encoder's acuity to languages.
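The first contribution in the abstract, feeding language information into intermediate encoder layers, can be illustrated with a small sketch. The abstract gives no architectural details, so everything below is an assumption made for illustration: the function name `inject_language_posteriors`, the weight matrices `w_lid` and `w_back`, the three language classes, and the additive feedback (borrowed from the general idea of self-conditioning an encoder on intermediate predictions). It is not the authors' implementation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def inject_language_posteriors(hidden, w_lid, w_back):
    """Hypothetical intermediate-layer injection step.

    hidden: T frames, each a list of D floats (intermediate encoder output)
    w_lid:  L x D projection producing per-frame language logits
    w_back: D x L projection mapping posteriors back to the hidden size
    Returns the conditioned frames and the per-frame language posteriors.
    """
    conditioned, posteriors = [], []
    for frame in hidden:
        post = softmax(matvec(w_lid, frame))   # frame-level language posterior
        feedback = matvec(w_back, post)        # posterior projected to D dims
        conditioned.append([h + f for h, f in zip(frame, feedback)])
        posteriors.append(post)
    return conditioned, posteriors


def random_matrix(rows, cols, rng, scale=0.1):
    """Toy stand-in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, D, L = 4, 8, 3  # frames, hidden size, assumed number of language classes
hidden = random_matrix(T, D, rng, scale=1.0)
w_lid = random_matrix(L, D, rng)
w_back = random_matrix(D, L, rng)

conditioned, posteriors = inject_language_posteriors(hidden, w_lid, w_back)
print(len(conditioned), len(conditioned[0]), len(posteriors[0]))
```

In a trained system the posteriors would presumably be supervised by a language identification objective at each injection point; here the weights are random, so the sketch only shows the data flow and the shapes involved.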

Original language: English
Title of host publication: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 476-481
Number of pages: 6
ISBN (electronic): 9798350392258
Publication status: Published - 2024
Event: 2024 IEEE Spoken Language Technology Workshop, SLT 2024 - Macao, China
Duration: 2 Dec 2024 to 5 Dec 2024

Publication series

Name: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024

Conference

Conference: 2024 IEEE Spoken Language Technology Workshop, SLT 2024
Country/Territory: China
City: Macao
Period: 2024/12/02 to 2024/12/05

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Media Technology
  • Instrumentation
  • Linguistics and Language
