ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning

Kuan Hsun Ho, Jeih Weih Hung, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Speech separation has recently made significant progress thanks to the fine-grained view afforded by time-domain methods. However, several studies have shown that adopting the Short-Time Fourier Transform (STFT) for feature extraction can be beneficial under harsher conditions, such as noise or reverberation. We therefore propose ConSep, a magnitude-conditioned time-domain framework, to inherit these beneficial characteristics. Experiments show that ConSep improves performance in anechoic, noisy, and reverberant settings compared with two well-known methods, SepFormer and BiSep. Furthermore, we visualize the components of ConSep to illustrate its advantages and to show that they are consistent with the findings of our preliminary studies.

Original language: English
Title of host publication: 2023 24th International Conference on Digital Signal Processing, DSP 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350339598
Publication status: Published - 2023
Event: 24th International Conference on Digital Signal Processing, DSP 2023 - Rhodes, Greece
Duration: 2023 Jun 11 → 2023 Jun 13

Publication series

Name: International Conference on Digital Signal Processing, DSP
Volume: 2023-June

Conference

Conference: 24th International Conference on Digital Signal Processing, DSP 2023
Country/Territory: Greece
City: Rhodes
Period: 2023/06/11 → 2023/06/13

Keywords

  • conditioning
  • cross-domain
  • magnitude
  • multi-resolution
  • reverberation
  • speech separation

ASJC Scopus subject areas

  • Signal Processing
