A comparative study of probabilistic ranking models for Chinese spoken document summarization

Shih Hsiang Lin, Berlin Chen*, Hsin Min Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

27 Citations (Scopus)

Abstract

Extractive document summarization automatically selects a number of indicative sentences, passages, or paragraphs from an original document according to a target summarization ratio, and sequences them to form a concise summary. In this article, we present a comparative study of various probabilistic ranking models for spoken document summarization, including supervised classification-based summarizers and unsupervised probabilistic generative summarizers. We also investigate the use of unsupervised summarizers to improve the performance of supervised summarizers when manual labels are not available for training the latter. A novel training data selection approach that leverages the relevance information of spoken sentences to select reliable document-summary pairs derived by the probabilistic generative summarizers is explored for training the classification-based summarizers. Encouraging initial results on Mandarin Chinese broadcast news data are demonstrated.
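The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that sentence scores are already available from some ranking model (the article's own supervised and generative summarizers are not reproduced here), that the summarization ratio is applied to the sentence count, and that selected sentences are sequenced in their original document order. The function name `extractive_summary` and the toy data are hypothetical.

```python
import math


def extractive_summary(sentences, scores, ratio):
    """Pick the top-scoring sentences up to a target ratio, in document order.

    `scores` stands in for the output of any sentence-ranking model; it is an
    assumption of this sketch, not the article's method.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    if not sentences:
        return []
    # Number of sentences to keep; always keep at least one.
    k = max(1, math.floor(ratio * len(sentences)))
    # Rank sentence indices by score, highest first (ties keep document order).
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    # Restore original order so the summary reads in document sequence.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


# Toy example with placeholder sentences and made-up scores.
doc = ["s1", "s2", "s3", "s4"]
scores = [0.1, 0.9, 0.3, 0.8]
summary = extractive_summary(doc, scores, ratio=0.5)
```

With a ratio of 0.5 over four sentences, the two highest-scoring sentences are kept and returned in their original order. In practice the ratio might instead be measured in words or characters, which would change only how `k` is computed.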

Original language: English
Article number: 3
Journal: ACM Transactions on Asian Language Information Processing
Volume: 8
Issue number: 1
Publication status: Published - 1 Mar 2009

ASJC Scopus subject areas

  • General Computer Science
