TY - GEN
T1 - Effective pseudo-relevance feedback for spoken document retrieval
AU - Chen, Yi Wen
AU - Chen, Kuan Yu
AU - Wang, Hsin Min
AU - Chen, Berlin
PY - 2013/10/18
Y1 - 2013/10/18
N2 - With the exponential proliferation of multimedia associated with spoken documents, research on spoken document retrieval (SDR) has emerged and attracted much attention in the past two decades. Apart from much effort devoted to developing robust indexing and modeling techniques for representing spoken documents, a recent line of thought targets the improvement of query modeling to better reflect the user's information need. Pseudo-relevance feedback is by far the most commonly used paradigm for query reformulation; it assumes that a small number of top-ranked feedback documents obtained from the initial round of retrieval are relevant and can be utilized for this purpose. Nevertheless, simply taking all of the top-ranked feedback documents obtained from the initial retrieval for query modeling (reformulation) does not always work well, especially when the top-ranked documents contain much redundant or non-relevant information. In view of this, we explore in this paper an interesting problem of how to effectively glean useful cues from the top-ranked documents so as to achieve more accurate query modeling. To do this, different kinds of information cues are considered and integrated into the process of feedback document selection so as to improve query effectiveness. Experiments conducted on the TDT (Topic Detection and Tracking) task show the advantages of our retrieval methods for SDR.
AB - With the exponential proliferation of multimedia associated with spoken documents, research on spoken document retrieval (SDR) has emerged and attracted much attention in the past two decades. Apart from much effort devoted to developing robust indexing and modeling techniques for representing spoken documents, a recent line of thought targets the improvement of query modeling to better reflect the user's information need. Pseudo-relevance feedback is by far the most commonly used paradigm for query reformulation; it assumes that a small number of top-ranked feedback documents obtained from the initial round of retrieval are relevant and can be utilized for this purpose. Nevertheless, simply taking all of the top-ranked feedback documents obtained from the initial retrieval for query modeling (reformulation) does not always work well, especially when the top-ranked documents contain much redundant or non-relevant information. In view of this, we explore in this paper an interesting problem of how to effectively glean useful cues from the top-ranked documents so as to achieve more accurate query modeling. To do this, different kinds of information cues are considered and integrated into the process of feedback document selection so as to improve query effectiveness. Experiments conducted on the TDT (Topic Detection and Tracking) task show the advantages of our retrieval methods for SDR.
KW - Kullback-Leibler (KL)-divergence
KW - Spoken document retrieval
KW - pseudo-relevance feedback
KW - query modeling
UR - http://www.scopus.com/inward/record.url?scp=84890461938&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890461938&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2013.6639331
DO - 10.1109/ICASSP.2013.6639331
M3 - Conference contribution
AN - SCOPUS:84890461938
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 8535
EP - 8539
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -