NAaLOSS: Rethinking the Objective of Speech Enhancement

Kuan Hsun Ho*, En Lun Yu, Jeih Weih Hung, Berlin Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Reducing noise interference is crucial for automatic speech recognition (ASR) in real-world scenarios. However, most single-channel speech enhancement (SE) methods generate 'processing artifacts' that negatively affect ASR performance. Hence, in this study, we propose a Noise- and Artifacts-aware loss function, NAaLoss, to mitigate the influence of artifacts from a novel perspective. NAaLoss considers estimation, de-artifact, and noise-ignorance losses, enabling the learned SE model to model speech, artifacts, and noise individually. We examine two SE models (simple/advanced) trained with NAaLoss under various input scenarios (clean/noisy) using two configurations of the ASR system (with/without noise robustness). Experiments reveal that NAaLoss significantly improves ASR performance in most setups while preserving SE quality in terms of perception and intelligibility. Furthermore, we visualize artifacts through waveforms and spectrograms and explain their impact on ASR.

Original language: English
Title of host publication: Proceedings of the 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing, MLSP 2023
Editors: Danilo Comminiello, Michele Scarpiniti
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350324112
Publication status: Published - 2023
Event: 33rd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2023 - Rome, Italy
Duration: 2023 Sept 17 to 2023 Sept 20

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2023-September
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 33rd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2023
Country/Territory: Italy
City: Rome
Period: 2023/09/17 to 2023/09/20

Keywords

  • noise-robust speech enhancement
  • processing artifacts
  • single-channel speech enhancement

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
