TY - GEN
T1 - Word topical mixture models for extractive spoken document summarization
AU - Chen, Berlin
AU - Chen, Yi Ting
PY - 2007
Y1 - 2007
AB - This paper considers extractive summarization of Chinese spoken documents. In contrast to conventional approaches, we attempt to deal with the extractive summarization problem under a probabilistic generative framework. A word topical mixture model (w-TMM) was proposed to explore the cooccurrence relationship between words of the language. Each sentence of the spoken document to be summarized was treated as a composite word TMM model for generating the document, and sentences were ranked and selected according to their likelihoods. Various kinds of modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with the other conventional summarization approaches. The experiments were performed on the Chinese broadcast news collected in Taiwan. Noticeable performance gains were obtained. The proposed summarization technique has also been properly integrated into our prototype system for voice retrieval of broadcast news via mobile devices.
UR - http://www.scopus.com/inward/record.url?scp=46449134158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=46449134158&partnerID=8YFLogxK
U2 - 10.1109/icme.2007.4284584
DO - 10.1109/icme.2007.4284584
M3 - Conference contribution
AN - SCOPUS:46449134158
SN - 1424410177
SN - 9781424410170
T3 - Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007
SP - 52
EP - 55
BT - Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007
PB - IEEE Computer Society
T2 - IEEE International Conference on Multimedia and Expo, ICME 2007
Y2 - 2 July 2007 through 5 July 2007
ER -