An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

Tien Hong Lo*, Fu An Chao, Tzu I. Wu, Yao Ting Sung, Berlin Chen*

*Corresponding author for this work

Research output: Chapter in book/report › Conference contribution

Abstract

Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner's speech. Recently, self-supervised learning (SSL) has shown stellar performance compared to traditional methods. However, SSL-based ASA systems are faced with at least three data-related challenges: limited annotated data, uneven distribution of learner proficiency levels and non-uniform score intervals between different CEFR proficiency levels. To address these challenges, we explore the use of two novel modeling strategies: metric-based classification and loss re-weighting, leveraging distinct SSL-based embedding features. Extensive experimental results on the ICNALE benchmark dataset suggest that our approach can outperform existing strong baselines by a sizable margin, achieving a significant improvement of more than 10% in CEFR prediction accuracy.

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: NAACL 2024 - Findings
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Publisher: Association for Computational Linguistics (ACL)
Pages: 1352-1362
Number of pages: 11
ISBN (Electronic): 9798891761193
Publication status: Published - 2024
Event: 2024 Findings of the Association for Computational Linguistics: NAACL 2024 - Mexico City, Mexico
Duration: 16 Jun 2024 to 21 Jun 2024

Publication series

Name: Findings of the Association for Computational Linguistics: NAACL 2024 - Findings

Conference

Conference: 2024 Findings of the Association for Computational Linguistics: NAACL 2024
Country/Territory: Mexico
City: Mexico City
Period: 2024/06/16 to 2024/06/21

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
