A position-aware language modeling framework for extractive broadcast news speech summarization

Shih Hung Liu, Kuan Yu Chen, Yu Lun Hsieh, Berlin Chen*, Hsin Min Wang, Hsu Chun Yen, Wen Lian Hsu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Extractive summarization, a process that automatically picks exemplary sentences from a text (or spoken) document with the goal of concisely conveying the key information therein, has recently seen a surge of attention from scholars and practitioners. Using a language modeling (LM) approach for sentence selection has proven effective for unsupervised extractive summarization. However, a major difficulty facing the LM approach is modeling sentences and estimating their parameters accurately for each text (or spoken) document. We extend this line of research and make the following contributions in this work. First, we propose a position-aware language modeling framework that uses various granularities of position-specific information to better estimate the sentence models involved in the summarization process. Second, we explore disparate ways to integrate the positional cues into relevance models through a pseudo-relevance feedback procedure. Third, we extensively evaluate various models derived from our proposed framework alongside several well-established unsupervised methods. Empirical evaluation conducted on a broadcast news summarization task demonstrates the performance merits of the proposed summarization methods.
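
As an illustration of the baseline LM approach to sentence selection mentioned in the abstract, the following minimal sketch ranks each sentence S of a document D by the log-likelihood log P(D | S) under a smoothed unigram sentence model. It is not the authors' implementation: the function name `rank_sentences`, the Jelinek-Mercer weight `lam`, and the use of the document itself as the background model are assumptions made for illustration, and the positional cues and relevance models proposed in the paper are not modeled here.

```python
import math
from collections import Counter


def rank_sentences(sentences, lam=0.5):
    """Return sentence indices ordered by log P(D | S), best first.

    sentences: list of token lists that together make up one document D.
    lam: interpolation weight, 0 <= lam < 1 (hypothetical default).
    """
    doc_counts = Counter(w for sent in sentences for w in sent)
    doc_total = sum(doc_counts.values())
    # Assumption: the document itself stands in for a background corpus.
    background = {w: c / doc_total for w, c in doc_counts.items()}

    scores = []
    for sent in sentences:
        sent_counts = Counter(sent)
        sent_total = len(sent)
        log_likelihood = 0.0
        for w, c in doc_counts.items():
            # Maximum-likelihood estimate of P(w | S), then smoothing
            # with the background model so no probability is zero.
            p_ml = sent_counts[w] / sent_total if sent_total else 0.0
            p = lam * p_ml + (1.0 - lam) * background[w]
            log_likelihood += c * math.log(p)
        scores.append(log_likelihood)

    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


# Toy document of three tokenized sentences (made-up data).
doc = [["news", "summary", "speech"], ["weather"], ["news", "speech"]]
ranking = rank_sentences(doc)
```

A summary would then be formed by taking sentences from the front of `ranking` until a length budget is reached. In practice the background model would normally be estimated from a large external corpus rather than from the document alone.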

Original language: English
Article number: 27
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Volume: 16
Issue number: 4
Publication status: Published - 2017 Aug

Keywords

  • Extractive summarization
  • Positional language modeling
  • Relevance modeling
  • Speech information

ASJC Scopus subject areas

  • General Computer Science
