TY - GEN
T1 - Extractive speech summarization leveraging convolutional neural network techniques
AU - Tsai, Chun I.
AU - Hung, Hsiao Tsung
AU - Chen, Kuan Yu
AU - Chen, Berlin
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/2/7
Y1 - 2017/2/7
AB - Extractive text or speech summarization endeavors to select representative sentences from a source document and assemble them into a concise summary, so as to help people to browse and assimilate the main theme of the document efficiently. The recent past has seen a surge of interest in developing deep learning- or deep neural network-based supervised methods for extractive text summarization. This paper presents a continuation of this line of research for speech summarization and its contributions are three-fold. First, we exploit an effective framework that integrates two convolutional neural networks (CNNs) and a multilayer perceptron (MLP) for summary sentence selection. Specifically, CNNs encode a given document-sentence pair into two discriminative vector embeddings separately, while MLP in turn takes the two embeddings of a document-sentence pair and their similarity measure as the input to induce a ranking score for each sentence. Second, the input of MLP is augmented by a rich set of prosodic and lexical features apart from those derived from CNNs. Third, the utility of our proposed summarization methods and several widely-used methods are extensively analyzed and compared. The empirical results seem to demonstrate the effectiveness of our summarization method in relation to several state-of-the-art methods.
KW - Convolutional neural network
KW - Deep learning
KW - Deep neural network
KW - Speech summarization
UR - http://www.scopus.com/inward/record.url?scp=85016007335&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85016007335&partnerID=8YFLogxK
U2 - 10.1109/SLT.2016.7846259
DO - 10.1109/SLT.2016.7846259
M3 - Conference contribution
AN - SCOPUS:85016007335
T3 - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
SP - 158
EP - 164
BT - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016
Y2 - 13 December 2016 through 16 December 2016
ER -