TY - GEN
T1 - Video copy detection by fast sequence matching
AU - Yeh, Mei-Chen
AU - Cheng, Kwang-Ting
PY - 2009
Y1 - 2009
AB - Sequence matching techniques are effective for comparing two videos. However, existing approaches suffer from demanding computational costs and thus are not scalable for large-scale applications. In this paper we view video copy detection as a local alignment problem between two frame sequences and propose a two-level filtration approach that significantly accelerates the matching process. First, we propose to use an adaptive vocabulary tree to index all frame descriptors extracted from the video database. In this step, each video is treated as a "bag of frames." Such an indexing structure not only provides a rich vocabulary for representing videos, but also enables efficient computation of a pyramid matching kernel between videos. This vocabulary tree filters out videos that are dissimilar to the query based on their histogram pyramid representations. Second, we propose a fast edit-distance-based sequence matching method that avoids unnecessary comparisons between dissimilar frame pairs. This step reduces the quadratic runtime to linear time with respect to the lengths of the sequences under comparison. Experiments on the MUSCLE VCD benchmark demonstrate that our approach is effective and efficient: it is 18X faster than the original sequence matching algorithms. This technique can also be applied to several other visual retrieval tasks, including shape retrieval. We demonstrate that the proposed method achieves a significant speedup for the shape retrieval task on the MPEG-7 shape dataset.
KW - Local alignment
KW - Similarity measure
KW - Video copy detection
KW - Vocabulary tree
UR - http://www.scopus.com/inward/record.url?scp=74049160269&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74049160269&partnerID=8YFLogxK
U2 - 10.1145/1646396.1646449
DO - 10.1145/1646396.1646449
M3 - Conference contribution
AN - SCOPUS:74049160269
SN - 9781605584805
T3 - CIVR 2009 - Proceedings of the ACM International Conference on Image and Video Retrieval
SP - 344
EP - 350
BT - CIVR 2009 - Proceedings of the ACM International Conference on Image and Video Retrieval
T2 - ACM International Conference on Image and Video Retrieval, CIVR 2009
Y2 - 8 July 2009 through 10 July 2009
ER -