Exploring the use of speech features and their corresponding distribution characteristics for robust speech recognition

Shih Hsiang Lin, Berlin Chen, Yao Ming Yeh

    Research output: Contribution to journal › Article › peer-review

    17 Citations (Scopus)

    Abstract

    The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Several methods have been proposed to improve ASR robustness over the last few decades. The related literature can be generally classified into two categories according to whether the methods are directly based on the feature domain or consider some specific statistical feature characteristics. In this paper, we present a polynomial regression approach that has the merit of directly characterizing the relationship between speech features and their corresponding distribution characteristics to compensate for noise interference. The proposed approach and a variant were thoroughly investigated and compared with a few existing noise robustness approaches. All experiments were conducted using the Aurora-2 database and task. The results show that our approaches achieve considerable word error rate reductions over the baseline system and are comparable to most of the conventional robustness approaches discussed in this paper.
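The abstract does not spell out the regression itself, so the following is only a minimal sketch of one plausible reading of it, not the authors' formulation. It assumes that the "distribution characteristic" of a feature value is its rank-based cumulative distribution (CDF) estimate, and that a low-order polynomial fitted on clean training data maps that CDF value back to a feature value. The function names (`empirical_cdf`, `fit_polynomial`, `compensate`), the polynomial degree, and the synthetic Gaussian data are all hypothetical choices made for illustration.

```python
import random


def empirical_cdf(values):
    """Rank-based CDF estimate in (0, 1) for each value, in input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    cdf = [0.0] * n
    for rank, i in enumerate(order):
        cdf[i] = (rank + 0.5) / n
    return cdf


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients a_0 .. a_degree so that y ~ sum(a_k * x**k).
    """
    m = degree + 1
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    aty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, m):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, m):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(ata[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aty[r] - tail) / ata[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def compensate(noisy, coeffs):
    """Map each noisy value through its own CDF estimate and the clean-data polynomial."""
    return [evaluate(coeffs, u) for u in empirical_cdf(noisy)]


# Synthetic one-dimensional "feature" streams (assumed data, not Aurora-2).
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(2000)]
coeffs = fit_polynomial(empirical_cdf(clean), clean, degree=3)

# A shifted and compressed copy of the clean distribution stands in for noise.
noisy = [0.5 * rng.gauss(0.0, 1.0) + 2.0 for _ in range(200)]
restored = compensate(noisy, coeffs)
print(sum(noisy) / len(noisy), sum(restored) / len(restored))
```

In this toy setting the compensated values depend only on the rank of each noisy value, so a shift or scaling of the input distribution is undone by construction. A real front end would presumably apply such a mapping per feature dimension and per utterance, which is again an assumption rather than something the abstract states.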

    Original language: English
    Article number: 4740142
    Pages (from-to): 84-94
    Number of pages: 11
    Journal: IEEE Transactions on Audio, Speech and Language Processing
    Volume: 17
    Issue number: 1
    DOIs
    Publication status: Published - 2009 Jan 1

    Keywords

    • Clustering
    • Histogram equalization
    • Polynomial regression
    • Robustness
    • Speech recognition

    ASJC Scopus subject areas

    • Acoustics and Ultrasonics
    • Electrical and Electronic Engineering
