Extractive speech summarization leveraging convolutional neural network techniques

Chun I. Tsai, Hsiao Tsung Hung, Kuan Yu Chen, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)


Extractive text or speech summarization selects representative sentences from a source document and assembles them into a concise summary, helping people browse and assimilate the main theme of the document efficiently. Recent years have seen a surge of interest in supervised methods for extractive text summarization based on deep learning or deep neural networks. This paper continues this line of research for speech summarization, and its contributions are three-fold. First, we exploit an effective framework that integrates two convolutional neural networks (CNNs) and a multilayer perceptron (MLP) for summary sentence selection. Specifically, the CNNs separately encode the document and the sentence of a given document-sentence pair into two discriminative vector embeddings, while the MLP takes the two embeddings and their similarity measure as input to induce a ranking score for each sentence. Second, the input of the MLP is augmented with a rich set of prosodic and lexical features in addition to those derived from the CNNs. Third, the utility of our proposed summarization methods and several widely used methods is extensively analyzed and compared. The empirical results seem to demonstrate the effectiveness of our summarization method in relation to several state-of-the-art methods.
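The scoring pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, since the abstract does not specify it: the filter width, tanh activations, max-over-time pooling, cosine similarity as the similarity measure, a bias-free one-hidden-layer MLP, and the hypothetical helper names `conv_encode`, `mlp_score`, `rank_sentences` and `init_params`. The weights are random and untrained, so the ranking is meaningless; only the data flow (two CNN encoders, a similarity value, extra features, an MLP score per sentence) follows the text.

```python
import math
import random

def conv_encode(word_vecs, filters, width):
    """Convolve each filter over the word-vector sequence, then max-pool over time."""
    dim = len(word_vecs[0])
    # Zero-pad short inputs so that at least one full window exists.
    padded = word_vecs + [[0.0] * dim] * max(0, width - len(word_vecs))
    windows = [
        [x for vec in padded[i:i + width] for x in vec]
        for i in range(len(padded) - width + 1)
    ]
    return [
        max(math.tanh(sum(w * x for w, x in zip(f, win))) for win in windows)
        for f in filters
    ]

def cosine(u, v):
    """Cosine similarity (assumed; the abstract only says 'similarity measure')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mlp_score(x, w_hidden, w_out):
    """One tanh hidden layer and a linear output (biases omitted for brevity)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))

def rank_sentences(doc_vecs, sent_vecs_list, extra_feats, params):
    """Score every sentence against the document; return (ranking, scores)."""
    doc_emb = conv_encode(doc_vecs, params["doc_filters"], params["width"])
    scores = []
    for sent_vecs, feats in zip(sent_vecs_list, extra_feats):
        sent_emb = conv_encode(sent_vecs, params["sent_filters"], params["width"])
        # MLP input: both embeddings, their similarity, and the extra features.
        x = doc_emb + sent_emb + [cosine(doc_emb, sent_emb)] + feats
        scores.append(mlp_score(x, params["w_hidden"], params["w_out"]))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

def init_params(dim, width, n_filters, n_feats, n_hidden, seed=0):
    """Random, untrained weights; a real system would learn these."""
    rng = random.Random(seed)
    rand = lambda n: [rng.uniform(-0.5, 0.5) for _ in range(n)]
    mlp_in = 2 * n_filters + 1 + n_feats
    return {
        "width": width,
        "doc_filters": [rand(width * dim) for _ in range(n_filters)],
        "sent_filters": [rand(width * dim) for _ in range(n_filters)],
        "w_hidden": [rand(mlp_in) for _ in range(n_hidden)],
        "w_out": rand(n_hidden),
    }

# Toy data: three sentences of random 4-dimensional "word embeddings".
rng = random.Random(1)
dim = 4
sentences = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
             for n in (5, 3, 7)]
document = [vec for sent in sentences for vec in sent]
# Placeholder values standing in for prosodic and lexical features.
feats = [[rng.random(), rng.random()] for _ in sentences]
params = init_params(dim=dim, width=3, n_filters=6, n_feats=2, n_hidden=8)
order, scores = rank_sentences(document, sentences, feats, params)
print(order, [round(s, 3) for s in scores])
```

In this sketch the document and sentence encoders have separate filters, mirroring the "two CNNs" in the abstract, and a summary would be formed by taking the top-ranked sentences in `order` up to a length budget.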

Original language: English
Title of host publication: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 7
ISBN (Electronic): 9781509049035
Publication status: Published - 2017 Feb 7
Event: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - San Diego, United States
Duration: 2016 Dec 13 → 2016 Dec 16

Publication series

Name: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings


Other: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016
Country/Territory: United States
City: San Diego


  • Convolutional neural network
  • Deep learning
  • Deep neural network
  • Speech summarization

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Artificial Intelligence
  • Language and Linguistics
  • Computer Vision and Pattern Recognition
  • Computer Science Applications

