Abstract
Driven by rapid progress in deep learning, speech enhancement (SE) techniques have shown promising efficacy and play a pivotal role as a front end to automatic speech recognition (ASR) systems, where they mitigate the effects of noise. In this article, we put forward a novel cross-domain time-reversal enhancement network (CD-TENET). CD-TENET leverages the time-reversed version of a speech signal together with two features that capture the phase information of the signal in the time domain and the frequency domain, respectively, to improve SE performance for noise-robust ASR. Extensive experiments demonstrate that CD-TENET not only recovers the original speech effectively but also improves SE and ASR performance simultaneously. Notably, CD-TENET yields a marked relative word error rate reduction on test utterances contaminated with unseen noises when compared with a strong baseline trained under the multicondition setting.
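As a rough illustration of two ideas named in the abstract, the sketch below shows what a time-reversed waveform is and how a relative word error rate (WER) reduction is computed. It is not the CD-TENET implementation: the function names `time_reverse` and `relative_wer_reduction`, the 16 kHz sampling rate, and the random test signal are all assumptions made for this example. It also checks a standard signal-processing fact: reversing a real signal leaves its magnitude spectrum unchanged and alters only the phase, which is one reason phase-aware features are relevant when working with reversed speech.

```python
import numpy as np


def time_reverse(signal: np.ndarray) -> np.ndarray:
    """Return the waveform with its sample order flipped (played backwards)."""
    return signal[::-1].copy()


def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Relative WER reduction: (baseline - system) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical 1-second signal at an assumed 16 kHz rate (illustrative data only).
rng = np.random.default_rng(0)
noisy = rng.standard_normal(16000)
reversed_noisy = time_reverse(noisy)

# Time reversal preserves the magnitude spectrum; only the phase differs.
mag_forward = np.abs(np.fft.rfft(noisy))
mag_reversed = np.abs(np.fft.rfft(reversed_noisy))
same_magnitude = bool(np.allclose(mag_forward, mag_reversed))
```

Under these assumptions, `same_magnitude` should evaluate to `True`, and applying `time_reverse` twice returns the original samples. The WER helper expresses the improvement as a fraction of the baseline error, which is how a "relative" reduction is conventionally reported.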
| Original language | English |
|---|---|
| Pages (from-to) | 114-124 |
| Number of pages | 11 |
| Journal | IEEE Multimedia |
| Volume | 29 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 2022 |
ASJC Scopus subject areas
- Software
- Signal Processing
- Media Technology
- Hardware and Architecture
- Computer Science Applications
Title: Time-Reversal Enhancement Network With Cross-Domain Information for Noise-Robust Speech Recognition