改善多細粒度的發音評測上資料不平衡的問題

Translated title of the contribution: Addressing the issue of Data Imbalance in Multi-granularity Pronunciation Assessment

Meng Shin Lin, Hsin Wei Wang, Tien Hong Lo, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Automatic Pronunciation Assessment (APA) aims to quantify non-native (L2) learners' pronunciation proficiency in a specific language. With technological advancements, APA now evaluates various aspects of pronunciation, from phoneme level to sentence level, including accuracy, fluency, stress, and more. However, current APA methods rely on the Mean Squared Error (MSE) loss function, which struggles with imbalanced labels across different levels of granularity. This imbalance affects model generalizability and fairness, as MSE tends to underestimate rare labels. Despite these issues, existing research has not adequately addressed data imbalance. To address this gap, we draw inspiration from class-balanced loss functions in visual classification. Our approach involves resampling and introducing a trainable variable to narrow the gap between training and testing sets in imbalanced regression tasks, aiming to alleviate label imbalance effects in APA. Evaluating our method on the Speechocean762 dataset, known for significant word-level label imbalance, we observe remarkable enhancements in performance. Our proposed approach shows promise in tackling challenges stemming from imbalanced data in automatic pronunciation assessment.
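The abstract's point that MSE disadvantages rare labels can be illustrated with a toy example. The sketch below is not the authors' method (the abstract describes resampling plus a trainable variable, and gives no implementation details); it swaps in a generic inverse-frequency re-weighting of MSE, one common class-balancing idea. The function names and the score values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per sample, proportional to 1 / (count of its label)."""
    counts = Counter(labels)
    raw = [1.0 / counts[y] for y in labels]
    scale = len(labels) / sum(raw)  # normalise so the weights average to 1
    return [w * scale for w in raw]


def weighted_mse(predictions, targets, weights):
    """Mean squared error in which each sample contributes according to its weight."""
    total = sum(w * (p - t) ** 2 for p, t, w in zip(predictions, targets, weights))
    return total / sum(weights)


def best_constant(targets, weights):
    """The constant prediction that minimises weighted MSE is the weighted mean."""
    return sum(w * t for t, w in zip(targets, weights)) / sum(weights)


# Hypothetical, imbalanced scores: eight samples scored 10, two scored 2.
scores = [10.0] * 8 + [2.0] * 2
uniform = [1.0] * len(scores)
balanced = inverse_frequency_weights(scores)

# Plain MSE is minimised by the ordinary mean, which sits close to the
# majority label and far from the rare one.
plain_fit = best_constant(scores, uniform)

# With inverse-frequency weights, each distinct label value carries equal
# total weight, so the minimiser moves to the midpoint of the two labels.
balanced_fit = best_constant(scores, balanced)

print(f"plain MSE minimiser:    {plain_fit:.2f}")
print(f"balanced MSE minimiser: {balanced_fit:.2f}")
```

In this toy setting the unweighted fit lands much nearer the frequent score than the rare one, while the re-weighted fit treats both label values evenly. A real APA model predicts per-input scores rather than a constant, so this only shows the direction of the bias, not its size.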

Original language: Chinese (Traditional)
Title of host publication: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
Editors: Jheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yu Tsao, Hou-Chiang Tseng, Chia-Hui Chang, Lung-Hao Lee, Yuan-Fu Liao, Wei-Yun Ma
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 134-140
Number of pages: 7
ISBN (Electronic): 9789869576963
Publication status: Published - 2023
Event: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023 - Taipei City, Taiwan
Duration: 2023 Oct 20 to 2023 Oct 21

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
