Histogram equalization of contextual statistics of speech features for robust speech recognition

Hsin Ju Hsieh, Berlin Chen*, Jeih weih Hung

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

In the recent past, we have witnessed a flurry of research activity aimed at the development of novel and ingenious robustness methods for automatic speech recognition (ASR). Among them, histogram equalization (HEQ) of speech features constitutes one of the most prominent and successful lines of research due to its inherently neat formulation and remarkable performance. In this paper, we adopt an effective modeling framework for joint equalization of spatial-temporal contextual statistics of speech features. On top of that, we explore various combinations of simple differencing and averaging operations to render the contextual relationships of feature vector components, not only between different dimensions but also between consecutive speech frames, in the HEQ process. Furthermore, several variants of HEQ are investigated and integrated into the proposed modeling framework to efficiently compensate for the effects of noise interference on the feature vector components. In addition, the utilities of the methods deduced from this framework and several existing robustness methods are analyzed and compared extensively. All experiments were carried out on the Aurora-2 database and task, and were further verified on the Aurora-4 database and task. Experimental results suggest that our proposed methods can offer substantial improvements over the baseline system and achieve performance competitive with or better than some of the existing noise robustness methods in speech recognition.
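
For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of generic rank-based histogram equalization applied to one feature dimension, together with a simple averaging and differencing of consecutive frames. It is not the method proposed in the paper: the function names `heq` and `temporal_context`, the standard normal reference distribution, and the choice to pair the first frame with itself are all assumptions made for illustration.

```python
from statistics import NormalDist


def heq(values):
    """Map one feature dimension onto a standard normal reference.

    Assumption: the empirical CDF is estimated from ranks within the
    utterance; ties are broken by original frame order.
    """
    n = len(values)
    # order[r] is the index of the frame holding the r-th smallest value
    order = sorted(range(n), key=lambda i: values[i])
    reference = NormalDist(0.0, 1.0)
    equalized = [0.0] * n
    for rank, i in enumerate(order):
        cdf = (rank + 0.5) / n  # strictly inside (0, 1)
        equalized[i] = reference.inv_cdf(cdf)
    return equalized


def temporal_context(values):
    """Average and difference of each frame with the previous frame.

    Assumption: the first frame is paired with itself, so its
    difference is zero.
    """
    previous = [values[0]] + values[:-1]
    avg = [(x + p) / 2 for x, p in zip(values, previous)]
    diff = [(x - p) / 2 for x, p in zip(values, previous)]
    return avg, diff


# Hypothetical values of one feature dimension over five frames
frames = [3.0, 1.0, 2.0, 5.0, 4.0]
equalized = heq(frames)
avg, diff = temporal_context(frames)
```

In this sketch the equalized values keep the rank order of the input, and each original frame value equals the sum of its average and difference components, so the two contextual streams together carry the same information as the original sequence.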

Original language: English
Pages (from-to): 6769-6795
Number of pages: 27
Journal: Multimedia Tools and Applications
Volume: 74
Issue number: 17
DOIs
Publication status: Published - 1 Sep 2015

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
