Leveraging relevance cues for language modeling in speech recognition

Berlin Chen*, Kuan Yu Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

Language modeling (LM), which provides a principled mechanism for assigning quantitative scores to sequences of words or tokens, has long been an interesting yet challenging problem in speech and language processing. The n-gram model remains the predominant method, while a number of disparate LM methods, exploring either lexical co-occurrence or topic cues, have been developed to complement it with some success. In this paper, we explore a novel language modeling framework for speech recognition built on the notion of relevance, in which the relationship between a search history and the word being predicted is discovered through different granularities of semantic context for relevance modeling. Empirical experiments on a large vocabulary continuous speech recognition (LVCSR) task demonstrate that the various language models deduced from our framework are comparable to existing language models in terms of both perplexity and recognition error rate reductions.
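As a rough illustration of the two ideas the abstract relies on, scoring a word sequence with an n-gram model and comparing models by perplexity, the following is a minimal sketch of an add-one smoothed bigram model. It is not the relevance-based framework of the paper; the function names, the toy corpus, and the choice of add-one smoothing are assumptions made only for this example.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]  # sentence boundary markers
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def perplexity(sentences, unigrams, bigrams):
    """Perplexity = exp of the negative average log-probability per predicted token.

    Out-of-vocabulary handling is omitted: test words are assumed to be in the
    training vocabulary.
    """
    vocab_size = len(unigrams)
    log_prob, n_tokens = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_prob += math.log(bigram_prob(prev, word, unigrams, bigrams, vocab_size))
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)

# Toy corpus, invented for illustration only.
train = [["a", "b"], ["a", "c"]]
unigrams, bigrams = train_bigram(train)
seen_ppl = perplexity([["a", "b"]], unigrams, bigrams)
unseen_ppl = perplexity([["b", "a"]], unigrams, bigrams)
```

In this sketch a word order observed in training receives a lower perplexity than an unobserved one, which is the sense in which lower perplexity indicates a better fit to the data. Methods that complement the n-gram model with topic or relevance cues typically change how the conditional probability is estimated, not how perplexity is computed.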

Original language: English
Pages (from-to): 807-816
Number of pages: 10
Journal: Information Processing and Management
Volume: 49
Issue number: 4
Publication status: Published - 2013

Keywords

  • Information retrieval
  • Language model
  • Relevance model
  • Speech recognition
  • Topic model

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences
