Histogram equalization of real and imaginary modulation spectra for noise-robust speech recognition

Hsin Ju Hsieh, Berlin Chen, Jeih Weih Hung

Research output: Contribution to journalArticle

7 Citations (Scopus)

Abstract

Histogram equalization (HEQ) of acoustic features has received considerable attention in the area of robust speech recognition because of its relative simplicity and good empirical performance. This paper presents a novel HEQbased feature extraction approach that performs equalization in both acoustic frequency and modulation frequency domains for obtaining better noise-robust features. In particular, the real and imaginary acoustic spectra are first individually transformed to the modulation domain via discrete Fourier transform (DFT). The HEQ process is then carried on the corresponding magnitude modulation spectra so as to compensate for the noise distortions. Finally, the equalized modulation spectra are converted back to form the real and imaginary acoustic spectra, respectively. By doing so, we can enhance not only the magnitude but also the phase components of the acoustic spectra, and thereby create more noise-robust cepstral features. The experiments conducted on the Aurora-2 clean-condition database and task reveal that the presented approach delivers superior recognition accuracy in comparison with some other HEQ-related methods and the well-known advanced front-end (AFE) extraction scheme, which supports the potential utility of this novel approach.

Original languageEnglish
Pages (from-to)2997-3001
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication statusPublished - 2013

Fingerprint

Robust Speech Recognition
Histogram Equalization
Speech recognition
Acoustic noise
Acoustics
Modulation
Discrete Fourier transform
Equalization
Frequency modulation
Discrete Fourier transforms
Feature Extraction
Frequency Domain
Feature extraction
Simplicity
Speech Recognition
Experiment
Experiments

Keywords

  • Automatic speech recognition
  • Feature extraction
  • Histogram equalization
  • Modulation spectrum
  • Noise robustness

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

@article{b9f1b2b8cbdd4646a8f9b6487fe49b90,
title = "Histogram equalization of real and imaginary modulation spectra for noise-robust speech recognition",
abstract = "Histogram equalization (HEQ) of acoustic features has received considerable attention in the area of robust speech recognition because of its relative simplicity and good empirical performance. This paper presents a novel HEQbased feature extraction approach that performs equalization in both acoustic frequency and modulation frequency domains for obtaining better noise-robust features. In particular, the real and imaginary acoustic spectra are first individually transformed to the modulation domain via discrete Fourier transform (DFT). The HEQ process is then carried on the corresponding magnitude modulation spectra so as to compensate for the noise distortions. Finally, the equalized modulation spectra are converted back to form the real and imaginary acoustic spectra, respectively. By doing so, we can enhance not only the magnitude but also the phase components of the acoustic spectra, and thereby create more noise-robust cepstral features. The experiments conducted on the Aurora-2 clean-condition database and task reveal that the presented approach delivers superior recognition accuracy in comparison with some other HEQ-related methods and the well-known advanced front-end (AFE) extraction scheme, which supports the potential utility of this novel approach.",
keywords = "Automatic speech recognition, Feature extraction, Histogram equalization, Modulation spectrum, Noise robustness",
author = "Hsieh, {Hsin Ju} and Berlin Chen and Hung, {Jeih Weih}",
year = "2013",
language = "English",
pages = "2997--3001",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",

}

TY - JOUR

T1 - Histogram equalization of real and imaginary modulation spectra for noise-robust speech recognition

AU - Hsieh, Hsin Ju

AU - Chen, Berlin

AU - Hung, Jeih Weih

PY - 2013

Y1 - 2013

N2 - Histogram equalization (HEQ) of acoustic features has received considerable attention in the area of robust speech recognition because of its relative simplicity and good empirical performance. This paper presents a novel HEQbased feature extraction approach that performs equalization in both acoustic frequency and modulation frequency domains for obtaining better noise-robust features. In particular, the real and imaginary acoustic spectra are first individually transformed to the modulation domain via discrete Fourier transform (DFT). The HEQ process is then carried on the corresponding magnitude modulation spectra so as to compensate for the noise distortions. Finally, the equalized modulation spectra are converted back to form the real and imaginary acoustic spectra, respectively. By doing so, we can enhance not only the magnitude but also the phase components of the acoustic spectra, and thereby create more noise-robust cepstral features. The experiments conducted on the Aurora-2 clean-condition database and task reveal that the presented approach delivers superior recognition accuracy in comparison with some other HEQ-related methods and the well-known advanced front-end (AFE) extraction scheme, which supports the potential utility of this novel approach.

AB - Histogram equalization (HEQ) of acoustic features has received considerable attention in the area of robust speech recognition because of its relative simplicity and good empirical performance. This paper presents a novel HEQbased feature extraction approach that performs equalization in both acoustic frequency and modulation frequency domains for obtaining better noise-robust features. In particular, the real and imaginary acoustic spectra are first individually transformed to the modulation domain via discrete Fourier transform (DFT). The HEQ process is then carried on the corresponding magnitude modulation spectra so as to compensate for the noise distortions. Finally, the equalized modulation spectra are converted back to form the real and imaginary acoustic spectra, respectively. By doing so, we can enhance not only the magnitude but also the phase components of the acoustic spectra, and thereby create more noise-robust cepstral features. The experiments conducted on the Aurora-2 clean-condition database and task reveal that the presented approach delivers superior recognition accuracy in comparison with some other HEQ-related methods and the well-known advanced front-end (AFE) extraction scheme, which supports the potential utility of this novel approach.

KW - Automatic speech recognition

KW - Feature extraction

KW - Histogram equalization

KW - Modulation spectrum

KW - Noise robustness

UR - http://www.scopus.com/inward/record.url?scp=84906269981&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906269981&partnerID=8YFLogxK

M3 - Article

SP - 2997

EP - 3001

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SN - 2308-457X

ER -