Exploiting Discrete Cosine Transform Features in Speech Enhancement Technique FullSubNet+

Yu Sheng Tsao, Berlin Chen, Jeih Weih Hung

Research output: Chapter in book/report › Conference paper

Abstract

FullSubNet+ is a highly effective deep learning-based speech enhancement technique that employs a full-band and sub-band fusion model. It takes the short-time magnitude spectrogram and the real and imaginary parts of the complex-valued spectrogram as inputs to train a deep neural network that mainly comprises multi-scale time-sensitive channel attention (MulCA) modules and stacked temporal convolution network (TCN) blocks. To capture the phase information of the input time-domain signal more simply, we propose using the short-time discrete cosine transform (STDCT) spectrogram in place of the real and imaginary spectrograms as an input source for training the FullSubNet+ framework. Preliminary experiments on the VoiceBank-DEMAND task indicate that, on the test set, FullSubNet+ with STDCT spectrograms achieves higher objective speech quality and intelligibility, measured by PESQ and STOI scores respectively, than the original FullSubNet+ arrangement. In addition, the STDCT-based FullSubNet+ obtains a real-time factor (RTF) of 0.229, lower than the 0.260 of the original FullSubNet+.
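As a rough illustration of the input feature described in the abstract, the sketch below computes a short-time DCT spectrogram in plain Python: the waveform is cut into overlapping frames, each frame is windowed, and an orthonormal DCT-II is applied. Because the DCT is real-valued, every frame yields a single real vector whose signs carry phase-like information, so no separate real and imaginary parts are needed. The frame length, hop size, Hann window, DCT type, and the function names `dct_ii` and `stdct` are all assumptions made for this example; the abstract does not state the authors' settings.

```python
import math


def dct_ii(frame):
    """Orthonormal DCT-II of one real-valued frame (direct O(N^2) form)."""
    n = len(frame)
    coeffs = []
    for k in range(n):
        acc = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(frame))
        # Orthonormal scaling so that the transform preserves signal energy.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * acc)
    return coeffs


def stdct(signal, frame_len=8, hop=4):
    """Short-time DCT: window each frame, then apply the DCT-II.

    frame_len and hop are illustrative assumptions, not the paper's settings.
    Returns a list of frames, each holding frame_len real coefficients.
    """
    # Periodic Hann window (an assumed choice of analysis window).
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dct_ii(chunk))
    return frames


# Toy input: a short sinusoid standing in for a speech waveform.
toy = [math.sin(2.0 * math.pi * 0.1 * t) for t in range(32)]
spec = stdct(toy)
```

Here `spec` is a list of real-valued coefficient vectors, one per frame. A practical system would use a fast DCT routine from a numerical library and much longer frames; the direct summation is used only to keep the sketch dependency-free.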

Original language: English
Title of host publication: Proceedings - 2022 IET International Conference on Engineering Technologies and Applications, IET-ICETA 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665491389
Publication status: Published - 2022
Event: 2022 IET International Conference on Engineering Technologies and Applications, IET-ICETA 2022 - Changhua, Taiwan
Duration: 14 Oct 2022 to 16 Oct 2022

Publication series

Name: Proceedings - 2022 IET International Conference on Engineering Technologies and Applications, IET-ICETA 2022

Conference

Conference: 2022 IET International Conference on Engineering Technologies and Applications, IET-ICETA 2022
Country/Territory: Taiwan
City: Changhua
Period: 2022/10/14 to 2022/10/16

ASJC Scopus subject areas

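  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Engineering (miscellaneous)
  • Electrical and Electronic Engineering
  • Instrumentation
  • Transportation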
