Leveraging topical and positional cues for language modeling in speech recognition

Hsuan Sheng Chiu, Kuan Yu Chen, Berlin Chen

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

This paper investigates language modeling with topical and positional information for large vocabulary continuous speech recognition. We first compare several topic models, including document topic models and word topic models, both theoretically and empirically. In addition, for some spoken documents such as broadcast news stories, documents of the same style usually share similar composition and word usage. Such documents can therefore be separated into partitions with the same rhetorical or topical style according to their literary structure, such as introductory remarks, elucidations of methodology or affairs, conclusions of the articles, and references or footnotes of reporters. We therefore present two position-dependent language models for speech recognition that integrate word positional information into the existing n-gram and topic models. Experiments on broadcast news transcription seem to indicate that these position-dependent models obtain results comparable to those of the existing n-gram and topic models.

Original language: English
Pages (from-to): 1465-1481
Number of pages: 17
Journal: Multimedia Tools and Applications
Volume: 72
Issue number: 2
Publication status: Published - 2014 Sep

Keywords

  • Language model
  • Language model adaptation
  • Positional information
  • Speech recognition
  • Topical information

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
