Latent topic modeling of word co-occurrence information for spoken document retrieval

Berlin Chen*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Citations (Scopus)

Abstract

In this paper, we present a word topic model (WTM) approach for spoken document retrieval (SDR) that discovers the co-occurrence relationships between words as well as long-span latent topic information. A given document as a whole is modeled as a composite WTM for generating an observed query. The underlying characteristics and different kinds of model structures are extensively investigated, and the performance of WTM is analyzed and verified by comparison with a few existing retrieval models on the TDT-2 SDR task. We also attempt to incorporate part-of-speech (POS) weighting into the representations of the query observations and the WTMs to obtain better retrieval performance.

Original language: English
Title of host publication: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009
Pages: 3961-3964
Number of pages: 4
Publication status: Published - 2009
Event: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009 - Taipei, Taiwan
Duration: 2009 Apr 19 - 2009 Apr 24

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
Country/Territory: Taiwan
City: Taipei
Period: 2009/04/19 - 2009/04/24

Keywords

  • Language model
  • Probabilistic latent semantic analysis
  • Spoken document retrieval
  • Word topic model

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
