TY - GEN
T1 - Leveraging distributional characteristics of modulation spectra for robust speech recognition
AU - Kao, Yu Chen
AU - Chen, Berlin
PY - 2012
Y1 - 2012
AB - Modulation spectrum processing of speech features has recently become an active area of research in the speech recognition community. For normalizing modulation spectra, spectral histogram equalization (SHE) appears to be one of the most effective techniques for compensating for nonlinear distortion. In this paper, we investigate a novel use of polynomial-fitting techniques for modulation histogram equalization, which requires less storage and computation time than conventional SHE methods. We also investigate the possibility of combining our approach with other temporal feature normalization methods. Automatic speech recognition (ASR) experiments were carried out on the Aurora-2 standard noise-robust ASR task. The proposed approach was tested against other popular modulation spectrum normalization methods, and the comparisons suggest its utility.
KW - modulation spectrum
KW - robust speech recognition
KW - spectral histogram equalization
KW - temporal average
UR - http://www.scopus.com/inward/record.url?scp=84868600509&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84868600509&partnerID=8YFLogxK
U2 - 10.1109/ISSPA.2012.6310476
DO - 10.1109/ISSPA.2012.6310476
M3 - Conference contribution
AN - SCOPUS:84868600509
SN - 9781467303828
T3 - 2012 11th International Conference on Information Science, Signal Processing and their Applications, ISSPA 2012
SP - 120
EP - 125
BT - 2012 11th International Conference on Information Science, Signal Processing and their Applications, ISSPA 2012
T2 - 2012 11th International Conference on Information Science, Signal Processing and their Applications, ISSPA 2012
Y2 - 2 July 2012 through 5 July 2012
ER -