用於語音增強之偽影感知加權損失函數

Translated title of the contribution: AaWLoss: An Artifact-aware Weighted Loss Function for Speech Enhancement

En Lun Yu, Kuan Hsun Ho, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Speech enhancement (SE) systems not only improve the perceptual quality of speech but also make automatic speech recognition (ASR) more robust in noisy environments when integrated with ASR systems. However, single-channel SE may generate artifacts that are detrimental to ASR, leading to recognition errors. Recent research indicates that introducing the SE loss function NAaLoss and fine-tuning the model with it can effectively reduce the generation of artifacts. Nonetheless, the underlying assumptions of this approach still need to be revised. Therefore, in this study we analyze the method extensively and conduct numerous experiments and case studies to identify its inconsistencies. To address them, we propose an improved loss function, AaWLoss. Through modifications and optimizations, AaWLoss resolves the potential loss of artifact suppression under noisy conditions that is inherent in NAaLoss under the same settings. Furthermore, AaWLoss achieves peak performance in suppressing artifacts under clean conditions, even adding information beneficial for ASR to the enhanced clean speech.

Original language: Chinese (Traditional)
Title of host publication: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
Editors: Jheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yu Tsao, Hou-Chiang Tseng, Chia-Hui Chang, Lung-Hao Lee, Yuan-Fu Liao, Wei-Yun Ma
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 71-78
Number of pages: 8
ISBN (Electronic): 9789869576963
Publication status: Published - 2023
Event: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023 - Taipei City, Taiwan
Duration: 2023 Oct 20 → 2023 Oct 21

Publication series

Name: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing

Conference

Conference: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Country/Territory: Taiwan
City: Taipei City
Period: 2023/10/20 → 2023/10/21

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
