探究文本提示於端對端發音訓練系統之應用

Yu Sen Cheng, Tien Hong Lo, Berlin Chen

Research output: Chapter in book/report, conference paper

Abstract

Recently, there has been growing demand for computer-assisted pronunciation training (CAPT) systems, which can be used to automatically assess the pronunciation quality of L2 learners. However, current CAPT systems built on end-to-end (E2E) neural network architectures still fall short of expectations in detecting mispronunciations. This is partly because most of their model components are designed and optimized for automatic speech recognition (ASR) rather than tailored specifically to CAPT. Unlike ASR, which aims to recognize the utterance of a given speaker as correctly as possible (even when it is poorly pronounced), CAPT aims to detect even subtle pronunciation errors. In view of this, we develop an E2E neural CAPT method that uses two separate encoders to generate embeddings of an L2 speaker's test utterance and of the corresponding canonical pronunciations in the given text prompt, respectively. The outputs of the two encoders are fed into a decoder through a hierarchical attention mechanism (HAM), which is intended to let the decoder focus more on detecting mispronunciations. A series of experiments on an L2 Mandarin Chinese speech corpus demonstrates the effectiveness of our method on several evaluation metrics, compared with some state-of-the-art E2E neural CAPT methods.
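
The two-encoder arrangement described in the abstract can be illustrated with a minimal sketch. It assumes plain dot-product attention and toy two-dimensional vectors. The abstract does not state the scoring function, the layer sizes, or any learned projections, so the function names (`attend`, `hierarchical_attention`) and all values below are hypothetical and are not the authors' implementation. The sketch only shows the two-level idea: the decoder state first attends within each encoder's outputs, then attends over the two resulting context vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, states):
    """Dot-product attention: return (context vector, weights) over states."""
    weights = softmax([dot(query, s) for s in states])
    dim = len(states[0])
    context = [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]
    return context, weights


def hierarchical_attention(query, speech_states, prompt_states):
    """Two-level attention over two encoders (assumed form, for illustration)."""
    # Level 1: one context vector per encoder.
    speech_ctx, _ = attend(query, speech_states)
    prompt_ctx, _ = attend(query, prompt_states)
    # Level 2: attend over the two encoder-level context vectors.
    fused, source_weights = attend(query, [speech_ctx, prompt_ctx])
    return fused, source_weights


# Toy inputs: a decoder state, three speech-encoder outputs, and two
# text-prompt-encoder outputs (all values are made up).
query = [1.0, 0.0]
speech_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
prompt_states = [[0.0, 1.0], [0.2, 0.8]]

fused, source_weights = hierarchical_attention(query, speech_states, prompt_states)
print("fused context:", fused)
print("weights over (speech, prompt):", source_weights)
```

In this sketch the second-level weights indicate how much the decoder step relies on the acoustic evidence versus the canonical prompt. A trained system would learn these scores rather than compute them from raw dot products.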

Translated title of the contribution: Exploiting Text Prompts for the Development of an End-to-End Computer-Assisted Pronunciation Training System
Original language: Traditional Chinese
Title of host publication: ROCLING 2020 - 32nd Conference on Computational Linguistics and Speech Processing
Editors: Jenq-Haur Wang, Ying-Hui Lai, Lung-Hao Lee, Kuan-Yu Chen, Hung-Yi Lee, Chi-Chun Lee, Syu-Siang Wang, Hen-Hsen Huang, Chuan-Ming Liu
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 290-303
Number of pages: 14
ISBN (electronic): 9789869576932
Publication status: Published - 2020
Event: 32nd Conference on Computational Linguistics and Speech Processing, ROCLING 2020 - Taipei, Taiwan
Duration: 24 Sep 2020 to 26 Sep 2020

Publication series

Name: ROCLING 2020 - 32nd Conference on Computational Linguistics and Speech Processing

Conference

Conference: 32nd Conference on Computational Linguistics and Speech Processing, ROCLING 2020
Country/Region: Taiwan
City: Taipei
Period: 2020/09/24 to 2020/09/26

Keywords

  • Computer assisted pronunciation training
  • end-to-end speech recognition
  • hierarchical attention mechanism
  • mispronunciation detection
  • mispronunciation diagnosis

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
