Robust speech recognition using spatial-temporal feature distribution characteristics

Berlin Chen*, Wei Hau Chen, Shih Hsiang Lin, Wen Yi Chu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Histogram equalization (HEQ) is one of the most efficient and effective techniques that have been used to reduce the mismatch between training and test acoustic conditions. However, most of the current HEQ methods are merely performed in a dimension-wise manner and without allowing for the contextual relationships between consecutive speech frames. In this paper, we present several novel HEQ approaches that exploit spatial-temporal feature distribution characteristics for speech feature normalization. The automatic speech recognition (ASR) experiments were carried out on the Aurora-2 standard noise-robust ASR task. The performance of the presented approaches was thoroughly tested and verified by comparisons with the other popular HEQ methods. The experimental results show that for clean-condition training, our approaches yield a significant word error rate reduction over the baseline system, and also give competitive performance relative to the other HEQ methods compared in this paper.
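
For orientation, the conventional dimension-wise HEQ that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the spatial-temporal approaches proposed in the paper: the function name `heq_dimension_wise`, the rank-based CDF estimate, and the standard normal reference distribution are all assumptions made for the example.

```python
from statistics import NormalDist


def heq_dimension_wise(frames):
    """Equalize each feature dimension of one utterance independently.

    frames: list of feature vectors (one list of floats per speech frame).
    Returns new vectors whose per-dimension values follow a standard normal
    reference distribution (an assumption made for this sketch).
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_dims = len(frames[0])
    reference = NormalDist(mu=0.0, sigma=1.0)
    equalized = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        # Rank the frames by their value in this dimension only; neighbouring
        # frames and other dimensions are ignored, which is the limitation
        # the abstract points out. Tied values keep their original order.
        order = sorted(range(n_frames), key=lambda t: frames[t][d])
        for rank, t in enumerate(order):
            # Mid-rank empirical CDF estimate, always strictly inside (0, 1).
            cdf = (rank + 0.5) / n_frames
            # Map the CDF value through the inverse reference CDF.
            equalized[t][d] = reference.inv_cdf(cdf)
    return equalized
```

Calling `heq_dimension_wise([[1.0, 10.0], [3.0, -2.0], [2.0, 5.0]])` maps the median frame of each dimension to 0 and the extremes to symmetric values around it, regardless of the original scale or offset of that dimension.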

Original language: English
Pages (from-to): 919-926
Number of pages: 8
Journal: Pattern Recognition Letters
Volume: 32
Issue number: 7
Publication status: Published - 1 May 2011

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
