TY - GEN
T1 - Cluster-based polynomial-fit histogram equalization (CPHEQ) for robust speech recognition
AU - Lin, Shih Hsiang
AU - Yeh, Yao Ming
AU - Chen, Berlin
PY - 2007
Y1 - 2007
N2 - Noise robustness is one of the primary challenges facing most automatic speech recognition (ASR) systems. Considerable research effort has been devoted over the past several years to preventing the degradation of ASR performance in noisy environments. In this paper, we consider the use of histogram equalization (HEQ) for robust ASR. In contrast to conventional methods, a novel data-fitting method based on polynomial regression is presented to efficiently approximate the inverse of the cumulative density functions of speech feature vectors for HEQ. Moreover, a more elaborate approach is also proposed that uses such polynomial regression models to directly characterize the relationship between the speech feature vectors and their corresponding probability distributions under various noise conditions. All experiments were carried out on the Aurora-2 database and task. The performance of the presented methods was extensively tested and verified by comparison with other methods. Experimental results show that, for clean-condition training, our method achieved a considerable word error rate reduction over the baseline system and also significantly outperformed the other methods.
AB - Noise robustness is one of the primary challenges facing most automatic speech recognition (ASR) systems. Considerable research effort has been devoted over the past several years to preventing the degradation of ASR performance in noisy environments. In this paper, we consider the use of histogram equalization (HEQ) for robust ASR. In contrast to conventional methods, a novel data-fitting method based on polynomial regression is presented to efficiently approximate the inverse of the cumulative density functions of speech feature vectors for HEQ. Moreover, a more elaborate approach is also proposed that uses such polynomial regression models to directly characterize the relationship between the speech feature vectors and their corresponding probability distributions under various noise conditions. All experiments were carried out on the Aurora-2 database and task. The performance of the presented methods was extensively tested and verified by comparison with other methods. Experimental results show that, for clean-condition training, our method achieved a considerable word error rate reduction over the baseline system and also significantly outperformed the other methods.
KW - Histogram equalization
KW - Noise robustness
KW - Polynomial regression model
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=56149120978&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56149120978&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:56149120978
SN - 9781605603162
T3 - International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
SP - 197
EP - 200
BT - International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
PB - Unavailable
T2 - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
Y2 - 27 August 2007 through 31 August 2007
ER -