TY - JOUR
T1 - Positional language modeling for extractive broadcast news speech summarization
AU - Liu, Shih Hung
AU - Chen, Kuan Yu
AU - Chen, Berlin
AU - Wang, Hsin Min
AU - Yen, Hsu Chun
AU - Hsu, Wen Lian
N1 - Publisher Copyright:
Copyright © 2015 ISCA.
PY - 2015
Y1 - 2015
N2 - Extractive summarization, with the intention of automatically selecting a set of representative sentences from a text (or spoken) document so as to concisely express the most important theme of the document, has been an active area of experimentation and development. A recent trend of research is to employ the language modeling (LM) approach for important sentence selection, which has proven to be effective for performing extractive summarization in an unsupervised fashion. However, one of the major challenges facing the LM approach is how to formulate the sentence models and estimate their parameters more accurately for each text (or spoken) document to be summarized. This paper extends this line of research, and its contributions are three-fold. First, we propose a positional language modeling framework using different granularities of position-specific information to better estimate the sentence models involved in summarization. Second, we also explore integrating the positional cues into relevance modeling through a pseudo-relevance feedback procedure. Third, the utilities of the various methods originating from our proposed framework and several well-established unsupervised methods are analyzed and compared extensively. Empirical evaluations conducted on a broadcast news summarization task seem to demonstrate the performance merits of our summarization methods.
AB - Extractive summarization, with the intention of automatically selecting a set of representative sentences from a text (or spoken) document so as to concisely express the most important theme of the document, has been an active area of experimentation and development. A recent trend of research is to employ the language modeling (LM) approach for important sentence selection, which has proven to be effective for performing extractive summarization in an unsupervised fashion. However, one of the major challenges facing the LM approach is how to formulate the sentence models and estimate their parameters more accurately for each text (or spoken) document to be summarized. This paper extends this line of research, and its contributions are three-fold. First, we propose a positional language modeling framework using different granularities of position-specific information to better estimate the sentence models involved in summarization. Second, we also explore integrating the positional cues into relevance modeling through a pseudo-relevance feedback procedure. Third, the utilities of the various methods originating from our proposed framework and several well-established unsupervised methods are analyzed and compared extensively. Empirical evaluations conducted on a broadcast news summarization task seem to demonstrate the performance merits of our summarization methods.
KW - Extractive broadcast news summarization
KW - Positional language modeling
KW - Relevance modeling
UR - http://www.scopus.com/inward/record.url?scp=84959107792&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959107792&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84959107792
SN - 2308-457X
VL - 2015-January
SP - 2729
EP - 2733
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Y2 - 6 September 2015 through 10 September 2015
ER -