Abstract
This paper investigates language modeling with topical and positional information for large vocabulary continuous speech recognition. We first compare several topic models, including document topic models and word topic models, both theoretically and empirically. We then observe that in some spoken documents, such as broadcast news stories, documents of the same style tend to share a similar composition and word usage. Such documents can therefore be separated into partitions of identical rhetorical or topical style according to their literary structure, e.g., introductory remarks, elucidations of methodology or affairs, conclusions of the articles, and references or footnotes of reporters. We hence present two position-dependent language models for speech recognition that integrate word positional information into the existing n-gram and topic models. Experiments conducted on broadcast news transcription seem to indicate that such position-dependent models obtain results comparable to those of the existing n-gram and topic models.
Original language | English |
---|---|
Pages (from-to) | 1465-1481 |
Number of pages | 17 |
Journal | Multimedia Tools and Applications |
Volume | 72 |
Issue number | 2 |
Publication status | Published - Sep 2014 |
Keywords
- Language model
- Language model adaptation
- Positional information
- Speech recognition
- Topical information
ASJC Scopus subject areas
- Software
- Media Technology
- Hardware and Architecture
- Computer Networks and Communications