TY - JOUR
T1 - Exploring word mover's distance and semantic-aware embedding techniques for extractive broadcast news summarization
AU - Liu, Shih Hung
AU - Chen, Kuan Yu
AU - Hsieh, Yu Lun
AU - Chen, Berlin
AU - Wang, Hsin Min
AU - Yen, Hsu Chun
AU - Hsu, Wen Lian
N1 - Publisher Copyright: Copyright © 2016 ISCA.
PY - 2016
Y1 - 2016
AB - Extractive summarization selects the most salient sentences from a document (or a set of documents) and assembles them into an informative summary, helping users browse and assimilate the main theme of the document efficiently. Our work in this paper continues this general line of research, and its main contributions are twofold. First, we explore the recently proposed word mover's distance (WMD) metric, in conjunction with semantic-aware continuous space representations of words, to capture finer-grained sentence-to-document and/or sentence-to-sentence semantic relatedness for effective use in the summarization process. Second, we investigate combining our proposed approach with several state-of-the-art summarization methods, which originally adopted the conventional term-overlap or bag-of-words (BOW) approaches for similarity calculation. A series of experiments conducted on a typical broadcast news summarization task suggests the performance merits of our proposed approach in comparison with the mainstream methods.
KW - Extractive summarization
KW - Markov random walk
KW - Word mover's distance
KW - Word representation
UR - http://www.scopus.com/inward/record.url?scp=84994242295&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994242295&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2016-710
DO - 10.21437/Interspeech.2016-710
M3 - Conference article
AN - SCOPUS:84994242295
SN - 2308-457X
VL - 08-12-September-2016
SP - 670
EP - 674
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
Y2 - 8 September 2016 through 12 September 2016
ER -