CROSS-DOMAIN SINGLE-CHANNEL SPEECH ENHANCEMENT MODEL WITH BI-PROJECTION FUSION MODULE FOR NOISE-ROBUST ASR

Fu An Chao, Jeih Weih Hung, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

12 Citations (Scopus)

Abstract

In recent decades, many studies have suggested that phase information is crucial for speech enhancement (SE), and time-domain single-channel speech enhancement techniques have shown promise in noise suppression and robust automatic speech recognition (ASR). This paper continues these lines of research and explores two effective SE methods that consider phase information in the time domain and the frequency domain of speech signals, respectively. Going one step further, we put forward a novel cross-domain speech enhancement model and a bi-projection fusion (BPF) mechanism for noise-robust ASR. To evaluate the effectiveness of our proposed method, we conduct an extensive set of experiments on the publicly available Aishell-1 Mandarin benchmark speech corpus. The evaluation results confirm that our proposed method outperforms several current top-of-the-line time-domain and frequency-domain SE methods on both enhancement and ASR evaluation metrics, for test sets contaminated with seen noise and with unseen noise.

Original language: English
Title of host publication: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665438643
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021 - Shenzhen, China
Duration: 5 Jul 2021 to 9 Jul 2021

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Country/Territory: China
City: Shenzhen
Period: 2021/07/05 to 2021/07/09

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
