Towards Robust Mispronunciation Detection and Diagnosis for L2 English Learners with Accent-Modulating Methods

Shao Wei Fan Jiang, Bi Cheng Yan, Tien Hong Lo, Fu An Chao, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

With the acceleration of globalization, more and more people are willing or required to learn second languages (L2). One of the major remaining challenges facing current mispronunciation detection and diagnosis (MDD) models for use in computer-assisted pronunciation training (CAPT) is handling speech from L2 learners with a diverse set of accents. In this paper, we set out to mitigate the adverse effects of accent variety in building an L2 English MDD system with end-to-end (E2E) neural models. To this end, we first propose an effective modeling framework that infuses accent features into an E2E MDD model, thereby making the model more accent-aware. Going a step further, we design and present disparate accent-aware modules to perform accent-aware modulation of acoustic features in a finer-grained manner, so as to enhance the discriminating capability of the resulting MDD model. Extensive sets of experiments conducted on the L2-ARCTIC benchmark dataset show the merits of our MDD model in comparison to several strong existing E2E baselines and the celebrated pronunciation-scoring-based method.

Original language: English
Title of host publication: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1065-1070
Number of pages: 6
ISBN (Electronic): 9781665437394
Publication status: Published - 2021
Event: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Cartagena, Colombia
Duration: 2021 Dec 13 to 2021 Dec 17

Publication series

Name: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings

Conference

Conference: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021
Country/Territory: Colombia
City: Cartagena
Period: 2021/12/13 to 2021/12/17

Keywords

  • accent modeling
  • accented speech
  • computer-assisted pronunciation training
  • mispronunciation detection and diagnosis
  • multi-task learning

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Linguistics and Language
