Optimizing Automatic Speech Assessment: W-RankSim Regularization and Hybrid Feature Fusion Strategies

Research output: Contribution to journal › Conference article › peer-review

1 Citation (Scopus)

Abstract

Automatic Speech Assessment (ASA) has seen notable advancements with the utilization of self-supervised learning (SSL) features in recent research. However, a key challenge in ASA lies in the imbalanced distribution of data, particularly evident in English test datasets. To address this challenge, we approach ASA as an ordinal classification task, introducing Weighted Vectors Ranking Similarity (W-RankSim) as a novel regularization technique. W-RankSim encourages closer proximity of weighted vectors in the output layer for similar classes, implying that feature vectors with similar labels would be gradually nudged closer to each other as they converge towards corresponding weighted vectors. Extensive experimental evaluations confirm the effectiveness of our approach in improving ordinal classification performance for ASA. Furthermore, we propose a hybrid model that combines SSL and handcrafted features, showcasing how the inclusion of handcrafted features enhances performance in an ASA system.
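The abstract does not state the W-RankSim formula, so the following is only a rough sketch of the general idea, under assumptions of our own: for each class, the other classes are ranked once by label distance and once by the cosine similarity of their output-layer weight vectors, and any mismatch between the two rankings is penalized. The function names (`cosine`, `ranks`, `rank_mismatch_penalty`), the squared rank difference, and the toy 2-D weight vectors are all hypothetical and are not taken from the paper. Hard ranks are not differentiable, so a trainable version would need a differentiable ranking surrogate.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def ranks(values):
    """Rank of each value (0 = largest); tied values share the same rank.

    Values are rounded first so floating-point noise does not break ties.
    """
    rounded = [round(v, 9) for v in values]
    return [sum(1 for other in rounded if other > v) for v in rounded]


def rank_mismatch_penalty(weights):
    """Hypothetical penalty (an assumption, not the paper's loss).

    weights[i] is the output-layer weight vector of ordinal class i.
    For each class, compare the ranking of all classes by label closeness
    with their ranking by weight-vector cosine similarity.
    """
    n = len(weights)
    total = 0.0
    for i in range(n):
        label_sim = [-abs(i - j) for j in range(n)]
        weight_sim = [cosine(weights[i], weights[j]) for j in range(n)]
        r_label = ranks(label_sim)
        r_weight = ranks(weight_sim)
        total += sum((a - b) ** 2 for a, b in zip(r_label, r_weight)) / n
    return total / n


def unit(angle_deg):
    """Toy 2-D weight vector pointing at the given angle."""
    rad = math.radians(angle_deg)
    return [math.cos(rad), math.sin(rad)]


# Four ordinal classes whose weight vectors fan out in label order.
ordered = rank_mismatch_penalty([unit(0), unit(30), unit(60), unit(90)])
# Same vectors, but classes 1 and 2 are swapped, breaking the ordinal layout.
shuffled = rank_mismatch_penalty([unit(0), unit(60), unit(30), unit(90)])
```

In this toy setup the penalty is lower when neighbouring classes have neighbouring weight vectors, which is the behaviour the abstract attributes to the regularizer.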

Original language: English
Pages (from-to): 4004-4008
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2024
Event: 25th Interspeech Conference 2024 - Kos Island, Greece
Duration: 2024 Sept 1 → 2024 Sept 5

Keywords

  • automatic speech assessment
  • imbalanced data
  • ordinal classification

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
