Discriminative language modeling for speech recognition with relevance information

Berlin Chen*, Jia Wen Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Discriminative language modeling (DLM) attempts to improve speech recognition performance by reranking the recognition hypotheses produced by a baseline system. Most existing DLM methods assume that the reranking task can be treated as a linear discrimination problem and that all test utterances share the same parameter vector for reranking hypotheses. However, the latter assumption sometimes results in a trained DLM model with weak generalizability and unsatisfactory performance. To address this problem, we propose a relevance-based DLM (RDLM) framework that can efficiently infer the DLM model parameters of each test utterance on the fly for better recognition performance. The structures and characteristics of the RDLM framework are extensively investigated, and its performance is thoroughly analyzed and verified by comparison with existing DLM methods.
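
As a rough illustration of the baseline setting the abstract describes (linear reranking of an N-best list with one shared parameter vector, trained with the perceptron method listed in the keywords), the following is a minimal sketch. It is not the RDLM framework and not the authors' implementation: the feature set (baseline score plus unigram counts), the function names, and the toy N-best list are all assumptions made for illustration.

```python
from collections import Counter


def features(hypothesis, baseline_score):
    """Hypothetical feature map: baseline recognizer score plus unigram counts."""
    feats = Counter({"__baseline__": baseline_score})
    for word in hypothesis.split():
        feats["w:" + word] += 1.0
    return feats


def score(weights, feats):
    """Linear score: dot product of the shared weight vector and the features."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def rerank(weights, nbest):
    """Return the hypothesis text with the highest linear score.

    nbest is a list of (hypothesis_text, baseline_score) pairs.
    """
    return max(nbest, key=lambda item: score(weights, features(*item)))[0]


def perceptron_update(weights, nbest, oracle, lr=1.0):
    """One perceptron step: shift weights toward the oracle hypothesis
    and away from the currently top-ranked one, if they differ."""
    predicted = rerank(weights, nbest)
    if predicted == oracle:
        return
    by_text = dict(nbest)
    good = features(oracle, by_text[oracle])
    bad = features(predicted, by_text[predicted])
    for name in set(good) | set(bad):
        delta = good.get(name, 0.0) - bad.get(name, 0.0)
        weights[name] = weights.get(name, 0.0) + lr * delta


# Toy N-best list (invented data): the baseline prefers the wrong hypothesis.
nbest = [("recognize speech", -10.0), ("wreck a nice beach", -9.5)]
weights = {"__baseline__": 1.0}

before = rerank(weights, nbest)
perceptron_update(weights, nbest, oracle="recognize speech")
after = rerank(weights, nbest)
```

Here a single `weights` dictionary is applied to every utterance, which is the shared-parameter assumption the abstract questions. The abstract does not say how RDLM infers per-utterance parameters, so that part is not sketched.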

Original language: English
Title of host publication: Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011
Publication status: Published - 2011
Event: 2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011 - Barcelona, Spain
Duration: 2011 Jul 11 → 2011 Jul 15

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Other

Other: 2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011
Country/Territory: Spain
City: Barcelona
Period: 2011/07/11 → 2011/07/15

Keywords

  • Discriminative Training
  • Language Modeling
  • Perceptron Method
  • Reranking
  • Speech Recognition

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
