TY - GEN
T1 - Extractive Chinese spoken document summarization using probabilistic ranking models
AU - Chen, Yi Ting
AU - Yu, Suhan
AU - Wang, Hsin Min
AU - Chen, Berlin
PY - 2006
Y1 - 2006
AB - The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to address the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.
KW - Hidden Markov model
KW - Probabilistic ranking
KW - Relevance model
KW - Speech recognition
KW - Spoken document summarization
UR - http://www.scopus.com/inward/record.url?scp=77249109806&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77249109806&partnerID=8YFLogxK
U2 - 10.1007/11939993_67
DO - 10.1007/11939993_67
M3 - Conference contribution
AN - SCOPUS:77249109806
SN - 3540496653
SN - 9783540496656
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 660
EP - 671
BT - Chinese Spoken Language Processing - 5th International Symposium, ISCSLP 2006, Proceedings
T2 - 5th International Symposium on Chinese Spoken Language Processing, ISCSLP 2006
Y2 - 13 December 2006 through 16 December 2006
ER -