An effective contextual language modeling framework for speech summarization with augmented features

Shi Yan Weng, Tien Hong Lo, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)


The tremendous amount of multimedia content containing speech is driving an urgent need for efficient and effective automatic summarization methods. To this end, supervised deep neural network-based methods for extractive speech summarization have progressed rapidly. More recently, the Bidirectional Encoder Representations from Transformers (BERT) model was proposed and has achieved record-breaking success on many natural language processing (NLP) tasks, such as question answering and language understanding. In view of this, in this paper we contextualize and enhance the state-of-the-art BERT-based model for speech summarization; our contributions are at least three-fold. First, we explore incorporating confidence scores into sentence representations to see whether doing so helps alleviate the negative effects of imperfect automatic speech recognition (ASR). Second, we augment the sentence embeddings obtained from BERT with extra structural and linguistic features, such as sentence position and inverse document frequency (IDF) statistics. Finally, we validate the effectiveness of the proposed method on a benchmark dataset, in comparison with several classic and celebrated speech summarization methods.
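The abstract does not say how the extra features are combined with the BERT sentence embeddings. As a rough illustration only, the sketch below assumes simple concatenation: a sentence vector is extended with a relative sentence position, a mean IDF value, and a mean ASR word-confidence score. The function names `compute_idf` and `augment_embedding`, the averaging scheme, and the toy data are all hypothetical and are not taken from the paper; the four-number embedding is a placeholder for a real BERT output.

```python
import math
from collections import Counter


def compute_idf(documents):
    """Return IDF per term as log(N / df), with df = number of documents containing the term."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def augment_embedding(embedding, tokens, confidences, position, n_sentences, idf):
    """Append relative position, mean IDF, and mean ASR confidence to a sentence vector.

    Assumes `tokens` and `confidences` are non-empty and aligned word by word.
    Concatenation is an assumption made for this sketch, not the paper's stated method.
    """
    rel_position = position / max(n_sentences - 1, 1)  # 0.0 for the first sentence
    mean_idf = sum(idf.get(tok, 0.0) for tok in tokens) / len(tokens)
    mean_conf = sum(confidences) / len(confidences)
    return list(embedding) + [rel_position, mean_idf, mean_conf]


# Toy data: two "documents" of recognized words (hypothetical).
docs = [["stocks", "fell", "today"], ["stocks", "rose"]]
idf = compute_idf(docs)

embedding = [0.1, -0.3, 0.7, 0.2]      # placeholder for a BERT sentence embedding
word_confidences = [0.9, 0.6, 0.8]     # hypothetical ASR confidence per word
features = augment_embedding(embedding, docs[0], word_confidences, 0, 2, idf)
print(len(features))
```

The augmented vector would then be passed to whatever sentence-scoring layer the summarizer uses. Other designs, such as weighting token embeddings by confidence before pooling, are equally compatible with the abstract's description.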

Original language: English
Title of host publication: 28th European Signal Processing Conference, EUSIPCO 2020 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Number of pages: 5
ISBN (Electronic): 9789082797053
Publication status: Published - 2021 Jan 24
Event: 28th European Signal Processing Conference, EUSIPCO 2020 - Amsterdam, Netherlands
Duration: 2020 Aug 24 → 2020 Aug 28

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491


Conference: 28th European Signal Processing Conference, EUSIPCO 2020


  • BERT
  • Confidence score
  • Extractive speech summarization
  • Speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering

