Gated Adapters with Balanced Activation for Effective Contextual Speech Recognition

  • Yu Chun Liu
  • Yi Cheng Wang
  • Li Ting Pai
  • Jia Liang Lu
  • Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In end-to-end (E2E) automatic speech recognition (ASR), accurately recognizing rare words, such as named entities, remains a significant challenge. Although existing contextual biasing techniques have improved recognition rates for named entities, they often incur substantial computational costs and the risk of false biasing. Recent research has shown that integrating gating mechanisms with contextual biasing adapters can dynamically regulate activation, effectively reducing unnecessary computational overhead. However, we observed that gating mechanisms tend not to activate when encountering particularly rare instances within named entities. To address this challenge, we combined the gating mechanism with a novel activation-balanced objective, resulting in the gate-balanced adapter. This approach not only sustains high recognition rates for named entities but also significantly reduces character error rates (CER) and overall computational load. A series of experiments were conducted on the AISHELL-1 dataset, and the results showed approximately a 1.2% reduction in CER compared to the baseline, highlighting its potential for practical applications.
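
For readers unfamiliar with the metric reported above, CER is conventionally the character-level edit distance between the reference and the recognized text, divided by the number of reference characters. The following is a minimal sketch of that standard definition, not the authors' evaluation script; the function names `edit_distance` and `cer` and the example strings are illustrative assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference character missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra character in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Corpus-level CER: total character edits over total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars


if __name__ == "__main__":
    # Made-up example: one substituted character in a six-character reference.
    print(cer(["今天天气很好"], ["今天天汽很好"]))
```

Summing edits and reference lengths over the whole test set before dividing, as done here, weights long utterances more heavily than averaging per-utterance rates would; the abstract does not state which convention the reported figure uses.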

Original language: English
Title of host publication: 2024 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024 - Proceedings
Editors: Ming-Hsiang Su, Jui-Feng Yeh, Yuan-Fu Liao, Chi-Chun Lee, Yu Taso
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331506032
Publication status: Published - 2024
Event: 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024 - Hsinchu, Taiwan
Duration: 2024 Oct 17 - 2024 Oct 19

Publication series

Name: 2024 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024 - Proceedings

Conference

Conference: 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024
Country/Territory: Taiwan
City: Hsinchu
Period: 2024/10/17 - 2024/10/19

Keywords

  • Automatic speech recognition
  • contextual biasing
  • contextual adapter
  • long-tail learning

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Library and Information Sciences
  • Language and Linguistics
