Extractive Chinese spoken document summarization using probabilistic ranking models

Yi Ting Chen, Suhan Yu, Hsin Min Wang, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM model to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.
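The ranking idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes each sentence model is a unigram distribution interpolated with a background collection model, which is one common reading of "each sentence is treated as an HMM for generating the document". The function names, the mixture weight `lam`, the add-one smoothing, and the sentence-count budget are all assumptions made for this example. The relevance-model component is left out.

```python
import math
from collections import Counter


def rank_sentences(sentences, background, lam=0.5):
    """Return sentence indices sorted by log P(D | S), best first.

    sentences:  list of token lists; the document D is their concatenation.
    background: word -> count mapping from a background collection (assumed).
    lam:        weight of the sentence model in the mixture (assumed value).
    """
    doc_counts = Counter(w for sent in sentences for w in sent)
    bg_total = sum(background.values())
    vocab_size = len(set(background) | set(doc_counts))

    scored = []
    for idx, sent in enumerate(sentences):
        sent_counts = Counter(sent)
        sent_len = len(sent)
        loglik = 0.0
        for word, count in doc_counts.items():
            # Maximum-likelihood sentence model P(w | S).
            p_sent = sent_counts.get(word, 0) / sent_len if sent_len else 0.0
            # Add-one smoothed background model P(w | C), never zero.
            p_bg = (background.get(word, 0) + 1) / (bg_total + vocab_size)
            loglik += count * math.log(lam * p_sent + (1 - lam) * p_bg)
        scored.append((loglik, idx))

    return [idx for _, idx in sorted(scored, key=lambda t: -t[0])]


def summarize(sentences, background, ratio=0.3, lam=0.5):
    """Keep the top-ranked sentences up to the ratio, in document order."""
    ranked = rank_sentences(sentences, background, lam)
    budget = max(1, round(ratio * len(sentences)))
    return [sentences[i] for i in sorted(ranked[:budget])]
```

Here the summarization ratio is applied to the number of sentences for simplicity; a word-based or duration-based budget would work the same way. Selected sentences are returned in their original order so that the summary reads in sequence.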

Original language: English
Title of host publication: Chinese Spoken Language Processing - 5th International Symposium, ISCSLP 2006, Proceedings
Pages: 660-671
Number of pages: 12
DOI: https://doi.org/10.1007/11939993_67
Publication status: Published - 2006 Dec 1
Event: 5th International Symposium on Chinese Spoken Language Processing, ISCSLP 2006 - Singapore, Singapore
Duration: 2006 Dec 13 to 2006 Dec 16

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4274 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Symposium on Chinese Spoken Language Processing, ISCSLP 2006
Country: Singapore
City: Singapore
Period: 2006 Dec 13 to 2006 Dec 16

Keywords

  • Hidden Markov model
  • Probabilistic ranking
  • Relevance model
  • Speech recognition
  • Spoken document summarization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Chen, Y. T., Yu, S., Wang, H. M., & Chen, B. (2006). Extractive Chinese spoken document summarization using probabilistic ranking models. In Chinese Spoken Language Processing - 5th International Symposium, ISCSLP 2006, Proceedings (pp. 660-671). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4274 LNAI). https://doi.org/10.1007/11939993_67

@inproceedings{1f0de2a2079040ef9bd8ea08a5a1191c,
title = "Extractive Chinese spoken document summarization using probabilistic ranking models",
abstract = "The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM model to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.",
keywords = "Hidden Markov model, Probabilistic ranking, Relevance model, Speech recognition, Spoken document summarization",
author = "Chen, {Yi Ting} and Yu, Suhan and Wang, {Hsin Min} and Chen, Berlin",
year = "2006",
month = "12",
day = "1",
doi = "10.1007/11939993_67",
language = "English",
isbn = "3540496653",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "660--671",
booktitle = "Chinese Spoken Language Processing - 5th International Symposium, ISCSLP 2006, Proceedings",

}

TY - GEN

T1 - Extractive Chinese spoken document summarization using probabilistic ranking models

AU - Chen, Yi Ting

AU - Yu, Suhan

AU - Wang, Hsin Min

AU - Chen, Berlin

PY - 2006/12/1

Y1 - 2006/12/1

N2 - The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM model to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.

KW - Hidden Markov model

KW - Probabilistic ranking

KW - Relevance model

KW - Speech recognition

KW - Spoken document summarization

UR - http://www.scopus.com/inward/record.url?scp=77249109806&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77249109806&partnerID=8YFLogxK

U2 - 10.1007/11939993_67

DO - 10.1007/11939993_67

M3 - Conference contribution

AN - SCOPUS:77249109806

SN - 3540496653

SN - 9783540496656

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 660

EP - 671

BT - Chinese Spoken Language Processing - 5th International Symposium, ISCSLP 2006, Proceedings

ER -