TY - JOUR
T1 - iSpreadRank
T2 - Ranking sentences for extraction-based summarization using feature weight propagation in the sentence similarity network
AU - Yeh, Jen Yuan
AU - Ke, Hao Ren
AU - Yang, Wei Pang
N1 - Funding Information:
This work was supported by the National Science Council (Grant Number: NSC-92-2213-E-009-126). Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors only, and do not necessarily reflect the viewpoints of the National Science Council.
PY - 2008/10
Y1 - 2008/10
N2 - Sentence extraction is a widely adopted text summarization technique where the most important sentences are extracted from document(s) and presented as a summary. The first step towards sentence extraction is to rank sentences in order of importance as in the summary. This paper proposes a novel graph-based ranking method, iSpreadRank, to perform this task. iSpreadRank models a set of topic-related documents into a sentence similarity network. Based on such a network model, iSpreadRank exploits the spreading activation theory to formulate a general concept from social network analysis: the importance of a node in a network (i.e., a sentence in this paper) is determined not only by the number of nodes to which it connects, but also by the importance of its connected nodes. The algorithm recursively re-weights the importance of sentences by spreading their sentence-specific feature scores throughout the network to adjust the importance of other sentences. Consequently, a ranking of sentences indicating the relative importance of sentences is reasoned. This paper also develops an approach to produce a generic extractive summary according to the inferred sentence ranking. The proposed summarization method is evaluated using the DUC 2004 data set, and found to perform well. Experimental results show that the proposed method obtains a ROUGE-1 score of 0.38068, which represents a slight difference of 0.00156, when compared with the best participant in the DUC 2004 evaluation.
AB - Sentence extraction is a widely adopted text summarization technique where the most important sentences are extracted from document(s) and presented as a summary. The first step towards sentence extraction is to rank sentences in order of importance as in the summary. This paper proposes a novel graph-based ranking method, iSpreadRank, to perform this task. iSpreadRank models a set of topic-related documents into a sentence similarity network. Based on such a network model, iSpreadRank exploits the spreading activation theory to formulate a general concept from social network analysis: the importance of a node in a network (i.e., a sentence in this paper) is determined not only by the number of nodes to which it connects, but also by the importance of its connected nodes. The algorithm recursively re-weights the importance of sentences by spreading their sentence-specific feature scores throughout the network to adjust the importance of other sentences. Consequently, a ranking of sentences indicating the relative importance of sentences is reasoned. This paper also develops an approach to produce a generic extractive summary according to the inferred sentence ranking. The proposed summarization method is evaluated using the DUC 2004 data set, and found to perform well. Experimental results show that the proposed method obtains a ROUGE-1 score of 0.38068, which represents a slight difference of 0.00156, when compared with the best participant in the DUC 2004 evaluation.
KW - Feature weight propagation
KW - Multidocument summarization
KW - Sentence extraction
KW - Sentence similarity network
KW - Social network analysis
KW - Spreading activation
UR - http://www.scopus.com/inward/record.url?scp=44949265290&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=44949265290&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2007.08.037
DO - 10.1016/j.eswa.2007.08.037
M3 - Article
AN - SCOPUS:44949265290
SN - 0957-4174
VL - 35
SP - 1451
EP - 1462
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - 3
ER -