Extractive speech summarization using evaluation metric-related training criteria

Berlin Chen, Shih Hsiang Lin, Yu Mei Chang, Jia Wen Liu

Research output: Contribution to journal (Article)

18 Citations (Scopus)

Abstract

The purpose of extractive speech summarization is to automatically select a number of indicative sentences or paragraphs (or audio segments) from the original spoken document according to a target summarization ratio and then concatenate them to form a concise summary. Much work on extractive summarization has focused on developing machine-learning approaches, which usually cast important-sentence selection as a two-class classification problem and have been applied with some success to a number of speech summarization tasks. However, the imbalanced-data problem sometimes results in a trained speech summarizer with unsatisfactory performance. Furthermore, training the summarizer to improve classification accuracy does not always lead to better summarization evaluation performance. In view of these problems, this paper presents an empirical investigation of the merits of two schools of training criteria that aim to alleviate their negative effects and to boost summarization performance. The first learns the classification capability of a summarizer from the pair-wise ordering of sentences in a training document according to their degree of importance. The second trains the summarizer by directly maximizing the associated evaluation score or by optimizing an objective that is linked to the ultimate evaluation. Experimental results on a broadcast news summarization task suggest that these training criteria can give substantial improvements over a few existing summarization methods.
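The pair-wise ordering criterion mentioned in the abstract can be illustrated with a small sketch. This is not the authors' model or training procedure: it assumes a linear scoring function trained with a hinge loss on sentence pairs, and the feature vectors, importance degrees, and the function names `train_pairwise_ranker` and `summarize` are all hypothetical, chosen only for illustration.

```python
import random


def train_pairwise_ranker(features, importance, epochs=50, lr=0.1, seed=0):
    """Learn linear weights from pairwise sentence orderings (illustrative only).

    features   : one feature vector per sentence (hypothetical features)
    importance : one importance degree per sentence, higher = more summary-worthy
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    n = len(features)
    # Every pair (i, j) in which sentence i should be ranked above sentence j.
    pairs = [(i, j) for i in range(n) for j in range(n)
             if importance[i] > importance[j]]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for i, j in pairs:
            diff = [a - b for a, b in zip(features[i], features[j])]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            # Hinge loss max(0, 1 - margin): update only on violated pairs.
            if margin < 1.0:
                w = [wk + lr * dk for wk, dk in zip(w, diff)]
    return w


def summarize(features, w, ratio):
    """Pick the top-scoring sentences for a target ratio, in document order."""
    scores = [sum(wk * fk for wk, fk in zip(w, f)) for f in features]
    k = max(1, round(ratio * len(features)))
    top = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy data (made up): five sentences, two features each.
feats = [[1.0, 0.0], [0.8, 0.1], [0.1, 0.9], [0.0, 1.0], [0.2, 0.7]]
degrees = [3, 2, 0, 0, 1]
weights = train_pairwise_ranker(feats, degrees)
selected = summarize(feats, weights, ratio=0.4)
print(selected)
```

Because training uses sentence pairs rather than per-sentence labels, every sentence takes part in several comparisons even when only a few sentences belong in the summary, which is one intuition for why such a criterion may be less sensitive to imbalanced data than two-class classification.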

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: Information Processing and Management
Volume: 49
Issue number: 1
Publication status: Published - 2013 Jan

Keywords

  • Discriminative training
  • Evaluation metric
  • Imbalanced-data
  • Sentence ranking
  • Speech summarization

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences
