TENET: A Time-Reversal Enhancement Network for Noise-Robust ASR

Fu An Chao, Shao Wei Fan Jiang, Bi Cheng Yan, Jeih Weih Hung, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Citations (Scopus)

Abstract

Due to the unprecedented breakthroughs brought about by deep learning, speech enhancement (SE) techniques have been developed rapidly and play an important role prior to acoustic modeling so as to mitigate noise effects on speech. To increase the perceptual quality of speech, the current state-of-the-art in the realm of SE adopts adversarial training by connecting an objective metric to the discriminator. However, there is no guarantee that optimizing the perceptual quality of speech will necessarily lead to improved automatic speech recognition (ASR) performance. In this study, we present TENET†∗, a novel Time-reversal Enhancement NETwork, which leverages a transformation of the input noisy signal itself, i.e., its time-reversed version, in conjunction with a Siamese network and a complex dual-path Transformer to promote SE performance for noise-robust ASR. Extensive experiments conducted on the Voicebank-DEMAND dataset show that TENET can achieve stellar results compared to a few top-of-the-line methods in terms of both SE and ASR evaluation metrics. To demonstrate the model's generalization ability, we further evaluate TENET on a test set of scenarios contaminated with unseen noise, and the results also confirm the superiority of this promising method.

† Inspired by the movie TENET (Christopher Nolan, 2020).
∗ Some of the enhanced audio samples can be found at https://fuann.github.io/TENET
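The core idea in the abstract, feeding a signal and its time-reversed copy through one shared (Siamese) network, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `moving_average` is a hypothetical stand-in for the enhancement model, not the paper's complex dual-path Transformer, and `siamese_forward` is an invented helper name. How TENET compares or combines the two branch outputs during training is not stated in the abstract and is deliberately left out.

```python
from typing import Callable, List, Tuple

Signal = List[float]


def time_reverse(x: Signal) -> Signal:
    """Return the samples of x in reverse temporal order."""
    return x[::-1]


def moving_average(x: Signal, width: int = 3) -> Signal:
    """Hypothetical stand-in for an enhancement model: a causal moving average.

    It is direction-dependent, so it responds differently to a signal
    and to that signal's time-reversed copy.
    """
    out = []
    for n in range(len(x)):
        window = x[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out


def siamese_forward(
    noisy: Signal, model: Callable[[Signal], Signal]
) -> Tuple[Signal, Signal]:
    """Apply one shared model to the signal and to its time-reversed copy.

    The second output is flipped back so both outputs are time-aligned.
    """
    forward_out = model(noisy)
    reversed_out = time_reverse(model(time_reverse(noisy)))
    return forward_out, reversed_out


noisy = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
fwd, rev = siamese_forward(noisy, moving_average)
```

Because the same `model` object serves both branches, the two paths share all parameters, which is what makes the arrangement Siamese. Reversing twice recovers the original signal, so the second branch sees exactly the same content in the opposite temporal order.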

Original language: English
Title of host publication: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 55-61
Number of pages: 7
ISBN (Electronic): 9781665437394
Publication status: Published - 2021
Event: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Cartagena, Colombia
Duration: 2021 Dec 13 → 2021 Dec 17

Publication series

Name: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings

Conference

Conference: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021
Country/Territory: Colombia
City: Cartagena
Period: 2021/12/13 → 2021/12/17

Keywords

  • Automatic Speech Recognition
  • Deep Learning
  • Siamese Network
  • Speech Enhancement
  • Time Reversal

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Linguistics and Language
