Weighted matrix factorization for spoken document retrieval

Kuan Yu Chen, Hsin Min Wang, Berlin Chen, Hsin Hsi Chen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

6 Citations (Scopus)

Abstract

As more and more multimedia data associated with spoken documents have been made available to the public, spoken document retrieval (SDR) has become an important research subject over the past two decades. Recently, topic models have been successfully used in SDR as well as in general information retrieval (IR). These models fall into two categories: probabilistic topic models (PTM) and non-probabilistic topic models (NPTM). One major difference between PTM and NPTM is that the former takes into account only the words occurring in a document, whereas the latter, such as latent semantic analysis (LSA), explicitly models all the words in the vocabulary (both occurring and non-occurring words). We believe that the non-occurring words can provide additional information that is also useful for SDR. However, to the best of our knowledge, there is a dearth of work investigating the effectiveness of the non-occurring words for SDR and IR. To make effective use of the non-occurring words of documents for semantic analysis, we propose a weighted matrix factorization (WMF) framework, in which the impact of the non-occurring words on the semantic analysis can be modulated properly. The results of SDR experiments conducted on the TDT-2 (Topic Detection and Tracking) collection highlight the performance merits of our proposed framework when compared to several existing topic models.
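The abstract does not state the WMF objective, so the following is only a minimal sketch of one common way to weight a matrix factorization, not the authors' formulation. It assumes a word-by-document matrix X, weight 1 for occurring words (X > 0), a smaller weight `w_missing` for non-occurring words, L2 regularization `lam`, and alternating least squares as the solver. All names (`weighted_mf`, `fold_in`, `weighted_loss`, `w_missing`, `lam`) and default values are hypothetical.

```python
import numpy as np

def weighted_mf(X, k=2, w_missing=0.01, lam=0.1, n_iters=20, seed=0):
    """Approximate X (words x documents) by P.T @ Q using per-cell weights.

    Assumed objective (not taken from the paper):
        sum_ij W_ij * (X_ij - p_i . q_j)^2 + lam * (||P||^2 + ||Q||^2)
    where W_ij = 1 for occurring words and w_missing otherwise.
    """
    n_words, n_docs = X.shape
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(k, n_words))  # word factors
    Q = rng.normal(scale=0.1, size=(k, n_docs))   # document factors
    W = np.where(X > 0, 1.0, w_missing)
    reg = lam * np.eye(k)
    for _ in range(n_iters):
        # Fix Q and solve a weighted ridge regression for each word vector.
        for i in range(n_words):
            Qw = Q * W[i]  # scale each document column by its weight
            P[:, i] = np.linalg.solve(Qw @ Q.T + reg, Qw @ X[i])
        # Fix P and solve the same kind of problem for each document vector.
        for j in range(n_docs):
            Pw = P * W[:, j]  # scale each word column by its weight
            Q[:, j] = np.linalg.solve(Pw @ P.T + reg, Pw @ X[:, j])
    return P, Q

def fold_in(x, P, w_missing=0.01, lam=0.1):
    """Project a new term vector x (for example, a query) into the latent space."""
    w = np.where(x > 0, 1.0, w_missing)
    Pw = P * w
    return np.linalg.solve(Pw @ P.T + lam * np.eye(P.shape[0]), Pw @ x)

def weighted_loss(X, P, Q, w_missing=0.01, lam=0.1):
    """Value of the assumed weighted, regularized objective."""
    W = np.where(X > 0, 1.0, w_missing)
    R = X - P.T @ Q
    return float(np.sum(W * R**2) + lam * (np.sum(P**2) + np.sum(Q**2)))
```

Under these assumptions, setting `w_missing` to 1 treats occurring and non-occurring words alike, while values close to 0 largely ignore the non-occurring words; intermediate values modulate their influence. A query could be ranked against documents by folding it in with `fold_in` and comparing the result to the columns of Q, for instance by cosine similarity.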

Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 8530-8534
Number of pages: 5
Publication status: Published - 18 Oct 2013
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada
Duration: 26 May 2013 to 31 May 2013

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country: Canada
City: Vancouver, BC
Period: 2013/05/26 to 2013/05/31

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

