Leveraging topical and positional cues for language modeling in speech recognition

Hsuan Sheng Chiu, Kuan Yu Chen, Berlin Chen*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

This paper investigates language modeling with topical and positional information for large vocabulary continuous speech recognition. We first compare several topic models, including document topic models and word topic models, both theoretically and empirically. We then observe that in some spoken documents, such as broadcast news stories, documents of the same style tend to share a similar composition and word usage. Such documents can therefore be divided by their literary structure into partitions that share a rhetorical or topical style, such as introductory remarks, elucidations of methodology or affairs, conclusions, and reporters' references or footnotes. Building on this observation, we present two position-dependent language models for speech recognition that integrate word positional information into the existing n-gram and topic models. Experiments on broadcast news transcription seem to indicate that these position-dependent models obtain results comparable to those of the existing n-gram and topic models.
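
The idea of conditioning word probabilities on position can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not give the number of partitions, the smoothing scheme, or the way positional and position-independent estimates are combined. The sketch assumes equal-width position bins, add-one smoothing, a unigram model, and linear interpolation with weight `lam`; the names `partition_index`, `PositionalUnigram`, `num_parts`, and `lam` are hypothetical.

```python
from collections import Counter


def partition_index(i, doc_len, num_parts):
    """Map word index i (0-based) in a doc_len-word document to a position bin.

    Assumption: bins are equal-width slices of the document.
    """
    return min(i * num_parts // doc_len, num_parts - 1)


class PositionalUnigram:
    """Toy position-dependent unigram model (illustrative only)."""

    def __init__(self, docs, num_parts=3, lam=0.5):
        self.num_parts = num_parts
        self.lam = lam  # assumed interpolation weight for the positional estimate
        self.global_counts = Counter()
        self.part_counts = [Counter() for _ in range(num_parts)]
        for doc in docs:
            for i, word in enumerate(doc):
                self.global_counts[word] += 1
                part = partition_index(i, len(doc), num_parts)
                self.part_counts[part][word] += 1
        self.vocab_size = len(self.global_counts)

    def _smoothed(self, counts, word):
        # Add-one smoothing over the training vocabulary (an assumption).
        return (counts[word] + 1) / (sum(counts.values()) + self.vocab_size)

    def prob(self, word, i, doc_len):
        """Interpolate the estimate for the word's position bin with the global one."""
        part = partition_index(i, doc_len, self.num_parts)
        positional = self._smoothed(self.part_counts[part], word)
        background = self._smoothed(self.global_counts, word)
        return self.lam * positional + (1 - self.lam) * background


# Usage: two toy "news stories" that open and close the same way.
docs = [
    ["good", "evening", "stocks", "fell", "reporting", "taipei"],
    ["good", "evening", "rain", "expected", "reporting", "taipei"],
]
model = PositionalUnigram(docs, num_parts=3, lam=0.5)
p_opening = model.prob("good", 0, 6)  # "good" in the opening bin
p_closing = model.prob("good", 4, 6)  # "good" in the closing bin
```

In this toy data "good" only occurs at the start of a story, so the model assigns it a higher probability in the opening bin than in the closing bin. For a fixed position, the probabilities sum to one over the training vocabulary.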

Original language: English
Pages (from-to): 1465-1481
Number of pages: 17
Journal: Multimedia Tools and Applications
Volume: 72
Issue number: 2
Publication status: Published - Sep 2014

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
