Extractive speech summarization using evaluation metric-related training criteria

Berlin Chen, Shih Hsiang Lin, Yu Mei Chang, Jia Wen Liu

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

The purpose of extractive speech summarization is to automatically select a number of indicative sentences or paragraphs (or audio segments) from the original spoken document according to a target summarization ratio, and then concatenate them to form a concise summary. Much work on extractive summarization has centered on machine-learning approaches that cast important sentence selection as a two-class classification problem; these have been applied with some success to a number of speech summarization tasks. However, the imbalanced-data problem sometimes yields a trained speech summarizer with unsatisfactory performance. Furthermore, training the summarizer to improve classification accuracy does not always lead to better summarization evaluation performance. In view of these phenomena, this paper presents an empirical investigation of two families of training criteria that alleviate the negative effects of the aforementioned problems and boost summarization performance. The first learns the classification capability of the summarizer from the pair-wise ordering of sentences in a training document according to their degree of importance. The second trains the summarizer by directly maximizing the associated evaluation score, or by optimizing an objective linked to the ultimate evaluation. Experimental results on a broadcast news summarization task suggest that these training criteria give substantial improvements over several existing summarization methods.
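
The pair-wise training criterion can be illustrated with a minimal sketch: a linear sentence scorer whose weights are updated whenever a less important sentence outscores a more important one, after which the top-scoring sentences are selected up to the target summarization ratio. The features, importance labels, and hyperparameters below are hypothetical placeholders rather than the authors' setup; the metric-oriented criterion would instead optimize an objective tied directly to the evaluation score (e.g., ROUGE) of the selected summary.

    import numpy as np

    def pairwise_hinge_pass(w, X, importance, lr=0.1, margin=1.0):
        """One sub-gradient pass over all ordered sentence pairs of a document.

        X          : (n_sentences, n_features) sentence feature matrix
        importance : (n_sentences,) reference importance scores (higher = better)
        """
        n = X.shape[0]
        for i in range(n):
            for j in range(n):
                if importance[i] > importance[j]:      # sentence i should outrank j
                    diff = X[i] - X[j]
                    if margin - w @ diff > 0:          # pairwise margin violated
                        w = w + lr * diff              # push score(i) above score(j)
        return w

    def extract_summary(w, X, ratio=0.3):
        """Pick the top-scoring sentences up to the target summarization ratio."""
        k = max(1, int(round(ratio * X.shape[0])))
        scores = X @ w
        return sorted(np.argsort(-scores)[:k].tolist())  # indices in document order

    # Toy data: 5 sentences with 4 made-up features each.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 4))
    importance = np.array([0.9, 0.1, 0.4, 0.0, 0.7])     # hypothetical reference labels
    w = np.zeros(4)
    for _ in range(20):
        w = pairwise_hinge_pass(w, X, importance)
    print("selected sentence indices:", extract_summary(w, X, ratio=0.4))

The nested pair loop makes the ordering information explicit; a practical ranker would restrict pairs to summary versus non-summary sentences within the same document and add regularization.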

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: Information Processing and Management
Volume: 49
Issue number: 1
DOIs: https://doi.org/10.1016/j.ipm.2011.12.002
Publication status: Published - 2013 Jan 1

Keywords

  • Discriminative training
  • Evaluation metric
  • Imbalanced-data
  • Sentence ranking
  • Speech summarization

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences

Cite this

Chen, Berlin; Lin, Shih Hsiang; Chang, Yu Mei; Liu, Jia Wen. Extractive speech summarization using evaluation metric-related training criteria. In: Information Processing and Management, Vol. 49, No. 1, 2013, pp. 1-12. DOI: 10.1016/j.ipm.2011.12.002
