TY - GEN
T1 - Mitigating Data Imbalance in Multi-Granularity Pronunciation Assessment
AU - Lin, Meng Shin
AU - Wang, Hsin Wei
AU - Lo, Tien Hong
AU - Chen, Berlin
N1 - Publisher Copyright:
© 2023 ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Automatic Pronunciation Assessment (APA) aims to quantify non-native (L2) learners' pronunciation proficiency in a specific language. With technological advancements, APA now evaluates various aspects of pronunciation, from phoneme level to sentence level, including accuracy, fluency, stress, and more. However, current APA methods rely on the Mean Squared Error (MSE) loss function, which struggles with imbalanced labels across different levels of granularity. This imbalance affects model generalizability and fairness, as MSE tends to underestimate rare labels. Despite these issues, existing research has not adequately addressed data imbalance. To address this gap, we draw inspiration from class-balanced loss functions in visual classification. Our approach involves resampling and introducing a trainable variable to narrow the gap between training and testing sets in imbalanced regression tasks, aiming to alleviate label imbalance effects in APA. Evaluating our method on the Speechocean762 dataset, known for significant word-level label imbalance, we observe remarkable enhancements in performance. Our proposed approach shows promise in tackling challenges stemming from imbalanced data in automatic pronunciation assessment.
KW - Automatic Pronunciation Assessment
KW - data imbalance
KW - regression loss function
UR - http://www.scopus.com/inward/record.url?scp=85184839537&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184839537&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85184839537
T3 - ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
SP - 134
EP - 140
BT - ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
A2 - Wu, Jheng-Long
A2 - Su, Ming-Hsiang
A2 - Huang, Hen-Hsen
A2 - Tsao, Yu
A2 - Tseng, Hou-Chiang
A2 - Chang, Chia-Hui
A2 - Lee, Lung-Hao
A2 - Liao, Yuan-Fu
A2 - Ma, Wei-Yun
PB - The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
T2 - 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Y2 - 20 October 2023 through 21 October 2023
ER -