Abstract
We present an approach to represent, match, and index various types of visual data, with the primary goal of enabling effective and computationally efficient searches. In this approach, an image or video is represented by an ordered list of feature descriptors, and similarities between such representations are measured using approximate string matching. This approach unifies visual appearance and ordering information in a holistic manner, jointly considering visual and order consistency between the query and the reference instances, and it can automatically identify local alignments between two pieces of visual data. This capability is essential for tasks such as video copy detection, where only small portions of the query and the reference videos are similar. To handle large volumes of data, we further show that this approach can be significantly accelerated with a dedicated indexing structure. Extensive experiments on various visual retrieval and classification tasks demonstrate the superior performance of the proposed techniques compared to existing solutions.
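As an illustration of the matching idea described above, the minimal sketch below computes a Smith-Waterman-style local alignment between two sequences of quantized feature descriptors (visual-word IDs). The scoring constants, the simple equality test, and the function name are illustrative assumptions, not the paper's actual scoring, descriptor representation, or indexing scheme.

```python
# Sketch: local alignment between descriptor sequences (assumed scoring).
from typing import List, Tuple

MATCH, MISMATCH, GAP = 2, -1, -1  # assumed scoring parameters

def local_alignment_score(query: List[int], reference: List[int]) -> Tuple[int, Tuple[int, int]]:
    """Return the best local alignment score and its end position
    (1-based indices into query and reference)."""
    n, m = len(query), len(reference)
    # H[i][j] = best score of an alignment ending at query[i-1], reference[j-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = MATCH if query[i - 1] == reference[j - 1] else MISMATCH
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # match / substitution
                          H[i - 1][j] + GAP,      # gap in reference
                          H[i][j - 1] + GAP)      # gap in query
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return best, best_pos

# Example: frames quantized to visual-word IDs; only a short segment overlaps.
query = [7, 3, 3, 9, 12, 5]
reference = [1, 4, 7, 3, 9, 12, 5, 8, 2]
score, end = local_alignment_score(query, reference)
print(score, end)  # a high score indicates a locally similar segment
```

A local (rather than global) alignment is used here because, as noted in the abstract, tasks such as copy detection only require small, contiguous portions of the query and reference to match.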
| Original language | English |
|---|---|
| Article number | 5643930 |
| Pages (from-to) | 320-329 |
| Number of pages | 10 |
| Journal | IEEE Transactions on Multimedia |
| Volume | 13 |
| Issue number | 2 |
| DOIs | |
| Publication status | Published - Apr 2011 |
ASJC Scopus subject areas
- Signal Processing
- Media Technology
- Computer Science Applications
- Electrical and Electronic Engineering