TY - JOUR
T1 - Robust speech recognition using spatial-temporal feature distribution characteristics
AU - Chen, Berlin
AU - Chen, Wei Hau
AU - Lin, Shih Hsiang
AU - Chu, Wen Yi
N1 - Funding Information:
This work was supported in part by National Science Council, Taiwan, under Grants NSC 99-2221-E-003-017-MY3, NSC 99-2515-S-003-004, and NSC 98-2631-S-003-002; and National Taiwan Normal University, under Grant 99T3060-1.
PY - 2011/5/1
Y1 - 2011/5/1
N2 - Histogram equalization (HEQ) is one of the most efficient and effective techniques that have been used to reduce the mismatch between training and test acoustic conditions. However, most of the current HEQ methods are merely performed in a dimension-wise manner and without allowing for the contextual relationships between consecutive speech frames. In this paper, we present several novel HEQ approaches that exploit spatial-temporal feature distribution characteristics for speech feature normalization. The automatic speech recognition (ASR) experiments were carried out on the Aurora-2 standard noise-robust ASR task. The performance of the presented approaches was thoroughly tested and verified by comparisons with the other popular HEQ methods. The experimental results show that for clean-condition training, our approaches yield a significant word error rate reduction over the baseline system, and also give competitive performance relative to the other HEQ methods compared in this paper.
AB - Histogram equalization (HEQ) is one of the most efficient and effective techniques that have been used to reduce the mismatch between training and test acoustic conditions. However, most of the current HEQ methods are merely performed in a dimension-wise manner and without allowing for the contextual relationships between consecutive speech frames. In this paper, we present several novel HEQ approaches that exploit spatial-temporal feature distribution characteristics for speech feature normalization. The automatic speech recognition (ASR) experiments were carried out on the Aurora-2 standard noise-robust ASR task. The performance of the presented approaches was thoroughly tested and verified by comparisons with the other popular HEQ methods. The experimental results show that for clean-condition training, our approaches yield a significant word error rate reduction over the baseline system, and also give competitive performance relative to the other HEQ methods compared in this paper.
KW - Aurora-2
KW - Histogram equalization
KW - Noise robustness
KW - Spatial-temporal distribution characteristics
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=79951960110&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79951960110&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2011.01.016
DO - 10.1016/j.patrec.2011.01.016
M3 - Article
AN - SCOPUS:79951960110
SN - 0167-8655
VL - 32
SP - 919
EP - 926
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
IS - 7
ER -