A study of topic modeling techniques for spoken document retrieval

Kuan Yu Chen*, Berlin Chen

*Corresponding author for this work

Research output: Contribution to conference, conference paper, peer-reviewed

1 citation (Scopus)

Abstract

This paper compares two common categories of topic modeling techniques for spoken document retrieval (SDR): the document topic model (DTM) and the word topic model (WTM). Apart from the conventional unsupervised training strategy, we explore a supervised training strategy for estimating these topic models, assuming that user query logs along with click-through information of relevant documents can be utilized when building an SDR system. This approach has the potential to associate relevant documents with queries even if they do not share any of the query words. Moreover, to lessen the SDR performance degradation caused by imperfect speech recognition, we also leverage different levels of index features for topic modeling, including words, syllable-level units, and their combination. Experiments conducted on the TDT-2 SDR task show that the methods derived from our proposed modeling framework are very promising when compared with a few existing retrieval approaches.

Original language: English
Pages: 237-242
Number of pages: 6
Publication status: Published - 2010
Event: 2nd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2010 - Biopolis, Singapore
Duration: 14 Dec 2010 to 17 Dec 2010

Other

Other: 2nd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2010
Country/Territory: Singapore
City: Biopolis
Period: 14 Dec 2010 to 17 Dec 2010

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
