探究文本提示於端對端發音訓練系統之應用

Translated title of the contribution: Exploiting Text Prompts for the Development of an End-to-End Computer-Assisted Pronunciation Training System

Yu Sen Cheng, Tien Hong Lo, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Recently, there has been a growing demand for the development of computer-assisted pronunciation training (CAPT) systems, which can be leveraged to automatically assess the pronunciation quality of L2 learners. However, current CAPT systems that build on end-to-end (E2E) neural network architectures still fall short of expectations for the detection of mispronunciations. This is partly because most of their model components are designed and optimized for automatic speech recognition (ASR), rather than tailored specifically for CAPT. Unlike ASR, which aims to recognize the utterance of a given speaker (even when poorly pronounced) as correctly as possible, CAPT seeks to detect pronunciation errors, even subtle ones, as reliably as possible. In view of this, we develop an E2E neural CAPT method that uses two disparate encoders to generate embeddings of an L2 speaker’s test utterance and of the corresponding canonical pronunciations in the given text prompt, respectively. The outputs of the two encoders are fed into a decoder through a hierarchical attention mechanism (HAM), with the purpose of enabling the decoder to focus more on detecting mispronunciations. A series of experiments conducted on an L2 Mandarin Chinese speech corpus demonstrates the effectiveness of our method in terms of different evaluation metrics, when compared with several state-of-the-art E2E neural CAPT methods.
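To make the described architecture concrete, below is a minimal, hypothetical PyTorch sketch of a two-encoder CAPT model with a simplified hierarchical attention mechanism: one encoder embeds the acoustic features of the L2 utterance, another embeds the canonical phone sequence from the text prompt, and a decoder fuses the two attention contexts. All module names, layer sizes, and the exact fusion scheme are assumptions for illustration only, not the authors' implementation.

```python
# Illustrative sketch only: a two-encoder E2E CAPT model with a simplified
# hierarchical attention mechanism (HAM). Hyperparameters are arbitrary.
import torch
import torch.nn as nn


class HierarchicalAttention(nn.Module):
    """Attend within each encoder's outputs, then attend over the two
    resulting context vectors (a simplified stand-in for the paper's HAM)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.acoustic_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.text_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.level_attn = nn.Linear(d_model, 1)  # scores the two modality contexts

    def forward(self, query, acoustic_mem, text_mem):
        # query: (B, T_dec, D); acoustic_mem: (B, T_a, D); text_mem: (B, T_p, D)
        a_ctx, _ = self.acoustic_attn(query, acoustic_mem, acoustic_mem)
        t_ctx, _ = self.text_attn(query, text_mem, text_mem)
        stacked = torch.stack([a_ctx, t_ctx], dim=2)            # (B, T_dec, 2, D)
        weights = torch.softmax(self.level_attn(stacked), dim=2)  # (B, T_dec, 2, 1)
        return (weights * stacked).sum(dim=2)                   # (B, T_dec, D)


class CAPTModel(nn.Module):
    def __init__(self, n_phones: int, n_mels: int = 80, d_model: int = 256):
        super().__init__()
        # Acoustic encoder over log-Mel features of the L2 test utterance.
        self.acoustic_enc = nn.LSTM(n_mels, d_model // 2, num_layers=2,
                                    batch_first=True, bidirectional=True)
        # Text encoder over the canonical phone sequence of the text prompt.
        self.phone_emb = nn.Embedding(n_phones, d_model)
        self.text_enc = nn.LSTM(d_model, d_model // 2, num_layers=1,
                                batch_first=True, bidirectional=True)
        self.ham = HierarchicalAttention(d_model)
        self.dec_emb = nn.Embedding(n_phones, d_model)
        self.decoder = nn.LSTM(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, n_phones)  # predicts the phones actually produced

    def forward(self, feats, prompt_phones, prev_phones):
        a_mem, _ = self.acoustic_enc(feats)                       # (B, T_a, D)
        t_mem, _ = self.text_enc(self.phone_emb(prompt_phones))   # (B, T_p, D)
        dec_h, _ = self.decoder(self.dec_emb(prev_phones))        # (B, T_dec, D)
        fused = self.ham(dec_h, a_mem, t_mem)
        return self.out(fused)


if __name__ == "__main__":
    model = CAPTModel(n_phones=60)
    feats = torch.randn(2, 120, 80)         # batch of log-Mel frames
    prompt = torch.randint(0, 60, (2, 15))  # canonical phones from the text prompt
    prev = torch.randint(0, 60, (2, 15))    # teacher-forced decoder inputs
    print(model(feats, prompt, prev).shape)  # torch.Size([2, 15, 60])
```

In a setup of this kind, mispronunciations would typically be flagged by comparing the decoded phone sequence against the canonical phones of the prompt; the paper's actual training objective and HAM formulation may differ from this sketch.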

Original language: Chinese (Traditional)
Title of host publication: ROCLING 2020 - 32nd Conference on Computational Linguistics and Speech Processing
Editors: Jenq-Haur Wang, Ying-Hui Lai, Lung-Hao Lee, Kuan-Yu Chen, Hung-Yi Lee, Chi-Chun Lee, Syu-Siang Wang, Hen-Hsen Huang, Chuan-Ming Liu
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 290-303
Number of pages: 14
ISBN (Electronic): 9789869576932
Publication status: Published - 2020
Event: 32nd Conference on Computational Linguistics and Speech Processing, ROCLING 2020 - Taipei, Taiwan
Duration: 2020 Sept 24 – 2020 Sept 26

Publication series

Name: ROCLING 2020 - 32nd Conference on Computational Linguistics and Speech Processing

Conference

Conference: 32nd Conference on Computational Linguistics and Speech Processing, ROCLING 2020
Country/Territory: Taiwan
City: Taipei
Period: 2020/09/24 – 2020/09/26

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
