Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition

Shih Hsiang Lin, Yao Ming Yeh, Berlin Chen

Research output: Contribution to conference

Abstract

The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise. Quite a few techniques have been proposed over the last few decades to improve ASR robustness. Related work reported in the literature can generally be divided into two categories, according to whether the methods operate in the feature domain or on the corresponding probability distributions. In this paper, we present a polynomial regression approach which has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions in order to compensate for noise effects. Two variants of the proposed approach are also extensively investigated. All experiments are conducted on the Aurora-2 database and task. Experimental results show that, for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system and also significantly outperform other conventional methods.
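The abstract does not spell out the regression itself. As a rough illustration only, the sketch below assumes a histogram-equalization-style reading of "the relationship between the speech features and their corresponding probability distributions": a polynomial is fitted from the empirical cumulative distribution (CDF) value of each clean training sample to the sample's value, and a noisy sample is then restored by evaluating that polynomial at its own CDF value. The function names, the polynomial degree, the rank-based CDF estimate, and the synthetic data are all assumptions made for this example. It is not the authors' formulation, it does not cover the two variants, and it does not use Aurora-2 data.

```python
import numpy as np


def empirical_cdf(x):
    """Rank-based CDF estimate in (0, 1) for each sample of a 1-D array."""
    ranks = np.argsort(np.argsort(x))  # 0-based rank of each sample
    return (ranks + 0.5) / len(x)


def fit_cdf_to_feature_poly(clean, degree=5):
    """Least-squares polynomial mapping CDF value -> clean feature value.

    The degree is an arbitrary choice for this sketch.
    """
    return np.polyfit(empirical_cdf(clean), clean, degree)


def compensate(noisy, coeffs):
    """Map each noisy sample through its own CDF value onto the clean scale."""
    return np.polyval(coeffs, empirical_cdf(noisy))


# Synthetic stand-ins for one feature dimension (hypothetical data).
rng = np.random.default_rng(0)
clean_train = rng.normal(0.0, 1.0, 2000)
clean_test = rng.normal(0.0, 1.0, 500)
# Toy distortion: scaling, offset, and a little additive noise.
noisy_test = 0.5 * clean_test + 2.0 + rng.normal(0.0, 0.05, 500)

coeffs = fit_cdf_to_feature_poly(clean_train)
restored = compensate(noisy_test, coeffs)

print("mean before/after:", noisy_test.mean(), restored.mean())
print("std  before/after:", noisy_test.std(), restored.std())
```

Because the mapping depends only on the rank of each sample within its utterance, any monotonic distortion of the feature is undone in this toy setting; real noise is not monotonic, which is presumably where the refinements studied in the paper matter.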

Original language: English
Pages: 87-92
Number of pages: 6
Publication status: Published - 1 Dec 2007
Event: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007 - Kyoto, Japan
Duration: 9 Dec 2007 to 13 Dec 2007

Other

Other: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007
Country: Japan
City: Kyoto
Period: 9 Dec 2007 to 13 Dec 2007

Fingerprint

Speech recognition
Probability distributions
Polynomials
Experiments

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Software
  • Artificial Intelligence

Cite this

Lin, S. H., Yeh, Y. M., & Chen, B. (2007). Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition. 87-92. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan.

Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition. / Lin, Shih Hsiang; Yeh, Yao Ming; Chen, Berlin.

2007. 87-92. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan.

Lin, SH, Yeh, YM & Chen, B 2007, 'Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition', Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan, 07/12/9 - 07/12/13, pp. 87-92.
Lin SH, Yeh YM, Chen B. Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition. 2007. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan.
Lin, Shih Hsiang ; Yeh, Yao Ming ; Chen, Berlin. / Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan. 6 p.
@conference{2b2876070cb5469394130baf4eb37672,
title = "Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition",
abstract = "The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise. Quite a few techniques have been proposed over the last few decades to improve ASR robustness. Related work reported in the literature can generally be divided into two categories, according to whether the methods operate in the feature domain or on the corresponding probability distributions. In this paper, we present a polynomial regression approach which has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions in order to compensate for noise effects. Two variants of the proposed approach are also extensively investigated. All experiments are conducted on the Aurora-2 database and task. Experimental results show that, for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system and also significantly outperform other conventional methods.",
keywords = "Clustering, Histogram equalization, Polynomial regression, Robustness, Speech recognition",
author = "Lin, {Shih Hsiang} and Yeh, {Yao Ming} and Chen, Berlin",
year = "2007",
month = "12",
day = "1",
language = "English",
pages = "87--92",
note = "2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007 ; Conference date: 09-12-2007 Through 13-12-2007",

}

TY - CONF

T1 - Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition

AU - Lin, Shih Hsiang

AU - Yeh, Yao Ming

AU - Chen, Berlin

PY - 2007/12/1

Y1 - 2007/12/1

AB - The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise. Quite a few techniques have been proposed over the last few decades to improve ASR robustness. Related work reported in the literature can generally be divided into two categories, according to whether the methods operate in the feature domain or on the corresponding probability distributions. In this paper, we present a polynomial regression approach which has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions in order to compensate for noise effects. Two variants of the proposed approach are also extensively investigated. All experiments are conducted on the Aurora-2 database and task. Experimental results show that, for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system and also significantly outperform other conventional methods.

KW - Clustering

KW - Histogram equalization

KW - Polynomial regression

KW - Robustness

KW - Speech recognition

UR - http://www.scopus.com/inward/record.url?scp=44849096221&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=44849096221&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:44849096221

SP - 87

EP - 92

ER -