A probabilistic generative framework for extractive broadcast news speech summarization

Yi Ting Chen, Berlin Chen, Hsin Min Wang

Research output: Contribution to journal › Article › peer-review

30 Citations (Scopus)

Abstract

In this paper, we consider extractive summarization of broadcast news speech and propose a unified probabilistic generative framework that combines the sentence generative probability and the sentence prior probability for sentence ranking. Each sentence of a spoken document to be summarized is treated as a probabilistic generative model for predicting the document. Two matching strategies, namely literal term matching and concept matching, are thoroughly investigated. We explore the use of the language model (LM) and the relevance model (RM) for literal term matching, while the sentence topical mixture model (STMM) and the word topical mixture model (WTMM) are used for concept matching. In addition, the lexical and prosodic features, as well as the relevance information of spoken sentences, are properly incorporated for the estimation of the sentence prior probability. An elegant feature of our proposed framework is that both the sentence generative probability and the sentence prior probability can be estimated in an unsupervised manner, without the need for handcrafted document-summary pairs. The experiments were performed on Chinese broadcast news collected in Taiwan, and very encouraging results were obtained.
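The ranking rule described in the abstract, scoring each sentence S by the product of its generative probability P(D|S) and its prior P(S), can be illustrated with a minimal sketch of the literal term matching (LM) variant. The abstract does not give the estimation details, so the following are assumptions made only for illustration: a unigram sentence model, linear interpolation with a background model using weight `lam`, add-one smoothing of the background counts, and a uniform prior when none is supplied. The function name `rank_sentences` and its parameters are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def rank_sentences(sentences, background, lam=0.5, priors=None):
    """Rank sentences by log P(D|S) + log P(S), best first.

    sentences  : list of token lists; together they form the document D
    background : list of tokens used to estimate a background unigram model
    lam        : interpolation weight for the sentence model (assumed value)
    priors     : optional list of P(S) values; uniform if omitted (assumption)
    """
    doc_counts = Counter(w for s in sentences for w in s)
    bg_counts = Counter(background)
    bg_total = sum(bg_counts.values())
    vocab_size = len(set(bg_counts) | set(doc_counts))

    if priors is None:
        priors = [1.0 / len(sentences)] * len(sentences)

    scores = []
    for sent, prior in zip(sentences, priors):
        sent_counts = Counter(sent)
        log_lik = 0.0
        for w, c in doc_counts.items():
            # Maximum-likelihood estimate of P(w|S); zero for an empty sentence.
            p_sent = sent_counts[w] / len(sent) if sent else 0.0
            # Add-one smoothed background probability keeps the log finite.
            p_bg = (bg_counts[w] + 1) / (bg_total + vocab_size)
            log_lik += c * math.log(lam * p_sent + (1 - lam) * p_bg)
        scores.append(log_lik + math.log(prior))

    # Indices of sentences sorted from highest to lowest score.
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

Calling `rank_sentences(sentences, background)` returns sentence indices in descending score order, so a summary at a given ratio would take the first few indices. Passing a non-uniform `priors` list is where features such as the lexical, prosodic, or relevance information mentioned in the abstract would enter; how the paper actually estimates that prior is not shown here.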

Original language: English
Article number: 4717223
Pages (from-to): 95-106
Number of pages: 12
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 17
Issue number: 1
Publication status: Published - Jan 2009

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
