Spoken document summarization using relevant information

Yi Ting Chen, Shih Hsiang Lin, Hsin Min Wang, Berlin Chen

Research output: Contribution to conference (Paper)

2 Citations (Scopus)

Abstract

Extractive summarization automatically selects indicative sentences from a document according to a target summarization ratio and then sequences them to form a summary. In this paper, we work within a probabilistic generative framework for extractive spoken document summarization and investigate the use of relevant documents, retrieved from a contemporary text collection for each sentence of the spoken document to be summarized. In the proposed methods, the probability of the document being generated by a sentence is modeled by a hidden Markov model (HMM), and the retrieved relevant text documents are used to estimate both the HMM's parameters and the sentence's prior probability. Experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.
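
The ranking idea in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the HMM is reduced to a two-state word mixture (a sentence model interpolated with a background model using a hypothetical weight `lam`), the sentence prior is taken as uniform, and "using relevant documents" is approximated by pooling each sentence's words with the words of its retrieved documents. The function names are hypothetical and this is not the authors' implementation.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def log_p_doc_given_sentence(doc_tokens, sent_model, bg_model, lam=0.5):
    """log P(D|S): each document word is emitted either by the sentence
    model (weight lam) or by the background model (weight 1 - lam)."""
    score = 0.0
    for w in doc_tokens:
        # Small floor for words missing from the background (assumption).
        p = lam * sent_model.get(w, 0.0) + (1 - lam) * bg_model.get(w, 1e-9)
        score += math.log(p)
    return score


def rank_sentences(sentences, relevant_docs, bg_tokens, lam=0.5):
    """Return sentence indices ordered from most to least indicative.

    sentences:     list of token lists (the document to summarize)
    relevant_docs: relevant_docs[i] is a list of token lists retrieved
                   for sentence i (may be empty)
    bg_tokens:     tokens of a background collection
    A uniform sentence prior is assumed, so it does not affect the order.
    """
    doc_tokens = [w for sent in sentences for w in sent]
    bg_model = unigram(bg_tokens)
    scored = []
    for i, sent in enumerate(sentences):
        # Assumption: estimate the sentence model from the sentence
        # pooled with the text of its retrieved relevant documents.
        pooled = list(sent) + [w for doc in relevant_docs[i] for w in doc]
        model = unigram(pooled)
        scored.append((log_p_doc_given_sentence(doc_tokens, model, bg_model, lam), i))
    return [i for _, i in sorted(scored, reverse=True)]
```

A summary would then take sentences from the top of the ranking until the target summarization ratio is reached and present them in their original order. A non-uniform prior, which the abstract says is also estimated from the relevant documents, would enter as an extra log-prior term added to each score.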

Original language: English
Pages: 189-194
Number of pages: 6
Publication status: Published - 2007 Dec 1
Event: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007 - Kyoto, Japan
Duration: 2007 Dec 9 to 2007 Dec 13

Other

Other: 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007
Country: Japan
City: Kyoto
Period: 07/12/9 to 07/12/13

Fingerprint

Hidden Markov models
Experiments

Keywords

  • Extractive summarization
  • Hidden Markov model
  • Probabilistic generative model
  • Relevance model
  • Relevant document
  • Spoken document summarization

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Software
  • Artificial Intelligence

Cite this

Chen, Y. T., Lin, S. H., Wang, H. M., & Chen, B. (2007). Spoken document summarization using relevant information. 189-194. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan.

Spoken document summarization using relevant information. / Chen, Yi Ting; Lin, Shih Hsiang; Wang, Hsin Min; Chen, Berlin.

2007. 189-194 Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan.

Chen, YT, Lin, SH, Wang, HM & Chen, B 2007, 'Spoken document summarization using relevant information' Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan, 07/12/9 - 07/12/13, pp. 189-194.
Chen YT, Lin SH, Wang HM, Chen B. Spoken document summarization using relevant information. 2007. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan.
Chen, Yi Ting ; Lin, Shih Hsiang ; Wang, Hsin Min ; Chen, Berlin. / Spoken document summarization using relevant information. Paper presented at 2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007, Kyoto, Japan. 6 p.
@conference{cc2729f69dbf43f0a3b215d114682296,
title = "Spoken document summarization using relevant information",
abstract = "Extractive summarization usually automatically selects indicative sentences from a document according to a certain target summarization ratio, and then sequences them to form a summary. In this paper, we investigate the use of information from relevant documents retrieved from a contemporary text collection for each sentence of a spoken document to be summarized in a probabilistic generative framework for extractive spoken document summarization. In the proposed methods, the probability of a document being generated by a sentence is modeled by a hidden Markov model (HMM), while the retrieved relevant text documents are used to estimate the HMM's parameters and the sentence's prior probability. The results of experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.",
keywords = "Extractive summarization, Hidden Markov model, Probabilistic generative model, Relevance model, Relevant document, Spoken document summarization",
author = "Chen, {Yi Ting} and Lin, {Shih Hsiang} and Wang, {Hsin Min} and Chen, Berlin",
year = "2007",
month = "12",
day = "1",
language = "English",
pages = "189--194",
note = "2007 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2007 ; Conference date: 09-12-2007 Through 13-12-2007",

}

TY - CONF

T1 - Spoken document summarization using relevant information

AU - Chen, Yi Ting

AU - Lin, Shih Hsiang

AU - Wang, Hsin Min

AU - Chen, Berlin

PY - 2007/12/1

Y1 - 2007/12/1

AB - Extractive summarization usually automatically selects indicative sentences from a document according to a certain target summarization ratio, and then sequences them to form a summary. In this paper, we investigate the use of information from relevant documents retrieved from a contemporary text collection for each sentence of a spoken document to be summarized in a probabilistic generative framework for extractive spoken document summarization. In the proposed methods, the probability of a document being generated by a sentence is modeled by a hidden Markov model (HMM), while the retrieved relevant text documents are used to estimate the HMM's parameters and the sentence's prior probability. The results of experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.

KW - Extractive summarization

KW - Hidden Markov model

KW - Probabilistic generative model

KW - Relevance model

KW - Relevant document

KW - Spoken document summarization

UR - http://www.scopus.com/inward/record.url?scp=44849126739&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=44849126739&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:44849126739

SP - 189

EP - 194

ER -