Effective Noise-Aware Data Simulation For Domain-Adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation

Chien Chun Wang*, Li Wei Chen, Hung Shin Lee, Berlin Chen, Hsin Min Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain, leading to a mismatch between training and test conditions. This study puts forward a novel data simulation method to address this issue, leveraging noise-extractive techniques and generative adversarial networks (GANs) with only limited target noisy speech data. Notably, our method employs a noise encoder to extract noise embeddings from target-domain data. These embeddings aptly guide the generator to synthesize utterances acoustically fitted to the target domain while authentically preserving the phonetic content of the input clean speech. Furthermore, we introduce the notion of dynamic stochastic perturbation, which can inject controlled perturbations into the noise embeddings during inference, thereby enabling the model to generalize well to unseen noise conditions. Experiments on the VoiceBank-DEMAND benchmark dataset demonstrate that our domain-adaptive SE method outperforms an existing strong baseline based on data simulation.
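
The abstract does not specify how the perturbation is drawn or scaled, so the following is only an illustrative sketch of injecting a controlled random perturbation into a noise embedding. It assumes zero-mean Gaussian noise with a fixed standard deviation; the function `perturb_embedding`, the `scale` parameter, and the 4-dimensional example embedding are all hypothetical and are not taken from the paper.

```python
import random


def perturb_embedding(embedding, scale=0.1, rng=None):
    """Return a copy of `embedding` with zero-mean Gaussian noise added.

    `scale` is the standard deviation of the added noise. Both the Gaussian
    form and this parameter are assumptions made for illustration only.
    """
    if rng is None:
        rng = random.Random()
    return [x + rng.gauss(0.0, scale) for x in embedding]


# Hypothetical 4-dimensional noise embedding, standing in for the output
# of a noise encoder applied to a target-domain utterance.
noise_embedding = [0.2, -0.5, 1.0, 0.0]

# Draw a fresh perturbation for each simulated utterance, so that one
# target-domain embedding yields several slightly different conditions.
rng = random.Random(0)
variants = [perturb_embedding(noise_embedding, scale=0.1, rng=rng) for _ in range(3)]
```

In this sketch, setting `scale` to 0 returns the embedding unchanged, while larger values spread the simulated conditions further from the observed target noise.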

Original language: English
Title of host publication: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 309-316
Number of pages: 8
ISBN (Electronic): 9798350392258
DOIs
Publication status: Published - 2024
Event: 2024 IEEE Spoken Language Technology Workshop, SLT 2024 - Macao, China
Duration: 2024 Dec 2 to 2024 Dec 5

Publication series

Name: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024

Conference

Conference: 2024 IEEE Spoken Language Technology Workshop, SLT 2024
Country/Territory: China
City: Macao
Period: 2024/12/02 to 2024/12/05

Keywords

  • data augmentation
  • data simulation
  • domain adaptation
  • speech enhancement

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Media Technology
  • Instrumentation
  • Linguistics and Language
