Mandarin Chinese Mispronunciation Detection and Diagnosis Leveraging Deep Neural Network Based Acoustic Modeling and Training Techniques

Berlin Chen*, Yao Chi Hsu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter

5 Citations (Scopus)

Abstract

Automatic mispronunciation detection and diagnosis are two critical, integral components of a computer-assisted pronunciation training (CAPT) system. Together they help second-language (L2) learners pinpoint erroneous pronunciations in a given utterance and thereby improve their spoken proficiency. In this chapter, we first briefly introduce the latest trends and developments in mispronunciation detection and diagnosis with state-of-the-art automatic speech recognition (ASR) methodologies, especially those using deep neural network based acoustic models. We then present an effective training approach that estimates the deep neural network based acoustic models involved in the mispronunciation detection process by optimizing an objective directly linked to the ultimate performance evaluation metric. We also investigate the extent to which the subsequent mispronunciation diagnosis process can benefit from these specifically trained acoustic models. For this purpose, we recast mispronunciation diagnosis as a classification problem and derive a set of indicative features. A series of experiments on a Mandarin Chinese mispronunciation detection and diagnosis task is conducted to evaluate the merits of this approach.

Original language: English
Title of host publication: Chinese Language Learning Sciences
Publisher: Springer Nature
Pages: 217-234
Number of pages: 18
Publication status: Published - 2019

Publication series

Name: Chinese Language Learning Sciences
ISSN (Print): 2520-1719
ISSN (Electronic): 2520-1727

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Linguistics and Language
  • Computer Science Applications
