Exploring Feature Enhancement in the Modulation Spectrum Domain via Ideal Ratio Mask for Robust Speech Recognition

Bi Cheng Yan, Meng Che Wu, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Development of robustness techniques is of paramount importance to the success of automatic speech recognition (ASR) systems. In this paper, we present a novel use of the ideal ratio mask (IRM) method to improve ASR robustness. IRM was originally proposed for time-frequency (T-F) masking-based speech enhancement and has shown considerable promise in preserving the intelligibility of a noisy mixture signal. Further, IRM is alternatively used to normalize the intermediate representations of speech feature vector sequences, in a holistic manner, for both training and test utterances. Finally, we instead treat IRM as a data augmentation method, conducted on speech feature vectors of training utterances or their intermediate representations, to generate additional augmented data for increasing the diversity of training data. A series of experiments carried out on the standard Aurora-4 database and task confirm the effectiveness of our methods.
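For readers unfamiliar with the ideal ratio mask mentioned in the abstract, the following is a minimal sketch of the commonly used time-frequency definition, IRM = (S / (S + N))^beta with beta = 0.5, where S and N are the clean-speech and noise power in each T-F unit. It is not the authors' modulation-spectrum variant or their augmentation pipeline, whose details are not given here. The function names, the `eps` guard, the random toy spectrograms, and the assumption that noisy power equals clean power plus noise power are all illustrative.

```python
import numpy as np

def ideal_ratio_mask(clean_power, noise_power, beta=0.5, eps=1e-12):
    """Common T-F IRM definition: (S / (S + N)) ** beta, element-wise.

    clean_power, noise_power: non-negative arrays of the same shape
    (frames x frequency bins). eps is an assumed guard against 0/0.
    """
    clean_power = np.asarray(clean_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return (clean_power / (clean_power + noise_power + eps)) ** beta

def apply_mask(noisy_magnitude, mask):
    """Scale each T-F unit of the noisy magnitude spectrogram by the mask."""
    return np.asarray(noisy_magnitude, dtype=float) * mask

# Toy data (hypothetical): 4 frames x 6 frequency bins of random power values.
rng = np.random.default_rng(0)
clean_power = rng.random((4, 6))
noise_power = rng.random((4, 6))

mask = ideal_ratio_mask(clean_power, noise_power)

# Assumption: speech and noise are uncorrelated, so their powers add.
noisy_magnitude = np.sqrt(clean_power + noise_power)
enhanced = apply_mask(noisy_magnitude, mask)
```

Under the additive-power assumption and beta = 0.5, the masked magnitude recovers the clean magnitude, which is why the mask is called "ideal": it requires access to the clean and noise signals and is therefore computable only on parallel training data or estimated by a model at test time.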

Original language: English
Title of host publication: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 759-763
Number of pages: 5
ISBN (Electronic): 9789881476883
Publication status: Published - 7 Dec 2020
Externally published
Event: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Virtual, Auckland, New Zealand
Duration: 7 Dec 2020 to 10 Dec 2020

Publication series

Name: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings

Conference

Conference: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020
Country/Territory: New Zealand
City: Virtual, Auckland
Period: 2020/12/07 to 2020/12/10

ASJC Scopus subject areas

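  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Signal Processing
  • Decision Sciences (miscellaneous)
  • Instrumentation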
