Exploring word mover's distance and semantic-aware embedding techniques for extractive broadcast news summarization

Shih Hung Liu, Kuan Yu Chen, Yu Lun Hsieh, Berlin Chen, Hsin Min Wang, Hsu Chun Yen, Wen Lian Hsu

Research output: Contribution to journal › Conference article › peer-review

6 Citations (Scopus)

Abstract

Extractive summarization selects the most salient sentences from a document (or a set of documents) and assembles them into an informative summary, helping users browse and assimilate the main theme of the document efficiently. Our work in this paper continues this general line of research, and its main contributions are twofold. First, we explore the recently proposed word mover's distance (WMD) metric, in conjunction with semantic-aware continuous space representations of words, to capture finer-grained sentence-to-document and/or sentence-to-sentence semantic relatedness for effective use in the summarization process. Second, we investigate combining our proposed approach with several state-of-the-art summarization methods, which originally adopted conventional term-overlap or bag-of-words (BOW) approaches for similarity calculation. A series of experiments conducted on a typical broadcast news summarization task suggests the performance merits of our proposed approach in comparison to the mainstream methods.
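
As a rough illustration of the sentence-to-document relatedness described in the abstract, the sketch below ranks candidate sentences by their distance to a document. It is not the authors' implementation: it uses the relaxed word mover's distance (a known lower bound on WMD in which each word sends all of its mass to its nearest counterpart) rather than the exact transport problem, and the two-dimensional word vectors, the names `TOY_VECTORS`, `nbow`, `one_sided`, and `relaxed_wmd`, and the example sentences are all invented for illustration.

```python
import math
from collections import Counter

# Toy 2-D word vectors, invented for illustration only (not trained embeddings).
TOY_VECTORS = {
    "president": (1.0, 0.0),
    "leader": (0.9, 0.1),
    "speech": (0.0, 1.0),
    "talk": (0.1, 0.9),
    "weather": (-1.0, -1.0),
}


def nbow(tokens):
    """Normalized bag-of-words weights over tokens that have a vector."""
    counts = Counter(t for t in tokens if t in TOY_VECTORS)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided(src, dst):
    """Cost when each source word sends all of its mass to its nearest target word."""
    return sum(
        weight * min(math.dist(TOY_VECTORS[w], TOY_VECTORS[v]) for v in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    if not a or not b:
        return math.inf  # no known words on one side, so nothing to compare
    return max(one_sided(a, b), one_sided(b, a))


# Hypothetical document and candidate sentences, already tokenized.
document = ["president", "speech"]
sentences = {
    "s1": ["leader", "talk"],
    "s2": ["weather"],
}

# Smaller distance means the sentence is semantically closer to the document.
ranked = sorted(sentences, key=lambda name: relaxed_wmd(sentences[name], document))
print(ranked)
```

In this toy setting, sentence `s1` shares no word with the document, so a pure term-overlap measure would score it zero, yet it ranks ahead of `s2` because its words lie close to the document's words in the embedding space.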

Original language: English
Pages (from-to): 670-674
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 08-12-September-2016
Publication status: Published - 2016
Event: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration: 2016 Sept 8 to 2016 Sept 16

Keywords

  • Extractive summarization
  • Markov random walk
  • Word mover's distance
  • Word representation

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
