Hierarchical topic organization and visual presentation of spoken documents using Probabilistic Latent Semantic Analysis (PLSA) for efficient retrieval/browsing applications

Te Hsuan Li*, Ming Han Lee, Berlin Chen, Lin Shan Lee

*Corresponding author for this work

Research output: Conference contribution; Conference paper; peer-reviewed

5 Citations (Scopus)

Abstract

The most attractive form of future network content will be multimedia that includes speech information, and such speech information usually carries the core concepts of the content. As a result, the spoken documents associated with multimedia content can very likely serve as the key for retrieval and browsing. This paper presents a new approach to hierarchical topic organization and visual presentation of spoken documents for this purpose, based on Probabilistic Latent Semantic Analysis (PLSA). With this approach, the spoken documents can be organized into a two-dimensional tree (or multi-layered map) of topic clusters, so that the user can efficiently retrieve or browse the network content or the associated spoken documents. Unlike conventional document clustering approaches, PLSA allows both the relationships among the topic clusters and appropriate terms for use as topic labels to be derived. An initial prototype system is also presented, using Chinese broadcast news as the example spoken documents and including automatic generation of titles and summaries as well as retrieval/browsing functionalities. Because of the special structure of the Chinese language, the system also considers units other than words as the terms used in processing.
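
For readers unfamiliar with PLSA, the following is a minimal sketch of the standard EM fitting procedure on a document-term count matrix. It is a generic illustration, not the authors' system: the function name `plsa`, the toy counts, the hard assignment of documents to their most probable topic, and the use of the highest-probability terms as candidate topic labels are all assumptions made for this example. The two-dimensional tree or map layout described in the abstract is not modeled here.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit a plain PLSA model by EM (illustrative sketch only).

    counts: (n_docs, n_terms) array of non-negative term counts.
    Returns P(z|d) with shape (n_docs, n_topics) and
    P(w|z) with shape (n_topics, n_terms).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_docs, n_terms = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation, normalised into valid distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_terms))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z|d,w), proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z

# Hypothetical toy data: 4 documents over a 4-term vocabulary.
toy_counts = np.array([
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
])
p_z_d, p_w_z = plsa(toy_counts, n_topics=2)

# Assumed post-processing: hard cluster per document, and the two most
# probable term indices per topic as candidate labels.
doc_topic = p_z_d.argmax(axis=1)
top_terms = p_w_z.argsort(axis=1)[:, ::-1][:, :2]
```

In a spoken-document setting the counts would come from speech recognition output, and for Chinese the "terms" could be units other than words, as the abstract notes; neither step is shown in this sketch.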

Original language: English
Pages: 625-628
Number of pages: 4
Publication status: Published - 1 Dec 2005
Event: 9th European Conference on Speech Communication and Technology - Lisbon, Portugal
Duration: 4 Sep 2005 to 8 Sep 2005

Other

Other: 9th European Conference on Speech Communication and Technology
Country/Region: Portugal
City: Lisbon
Period: 2005/09/04 to 2005/09/08

ASJC Scopus subject areas

  • Engineering (all)
