Metric for finding similar images in a database

渐次进展 2021-02-04 17:56

There are a lot of different algorithms for computing the similarity between two images, but I can't find anything on how you would store this information in a database such tha

1 answer
  • 2021-02-04 18:18

    There's one slightly confusing thing in your question: the "fingerprint" you linked to is explicitly not meant to find similar images (quote):

    TinEye does not typically find similar images (i.e. a different image with the same subject matter); it finds exact matches including those that have been cropped, edited or resized.

    Now, that said, I'm just going to assume you know what you are asking, and that you actually want to be able to find all similar images, not just edited exact copies.

    If you want to get into it in detail, I would suggest looking up papers by Sivic and Zisserman, and by Nister and Stewenius. The idea behind these two papers (and quite a few others lately) is to apply text-searching techniques to image databases, searching the image database in the same manner Google would search its document (web-page) database.

    The first paper I have linked to is a good starting point for this kind of approach, since it mainly addresses the big question: what are the "words" in the images? Text-searching techniques all focus on words and base their similarity measures on calculations involving word counts. Successfully representing images as collections of visual words is thus the first step to applying text-searching techniques to image databases.
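To make the "visual words" idea concrete, here is a minimal sketch in plain Python. It assumes you already have local feature descriptors for each image and a vocabulary of cluster centres (for example from k-means); both are toy values here. All function names are made up for illustration, and the tf-idf weighting with cosine similarity is just the standard text-retrieval recipe, not necessarily the exact scoring used in either paper.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, vocabulary[i])),
    )


def word_counts(descriptors, vocabulary):
    """Quantize every local descriptor to a visual word and count occurrences."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)


def idf_weights(all_counts, vocab_size):
    """Inverse document frequency per visual word over the whole image set."""
    n_images = len(all_counts)
    idf = []
    for word in range(vocab_size):
        df = sum(1 for counts in all_counts if counts[word] > 0)
        idf.append(math.log(n_images / df) if df else 0.0)
    return idf


def tfidf_vector(counts, idf):
    """Term frequency times idf, as a dense vector over the vocabulary."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(idf)
    return [counts[word] / total * idf[word] for word in range(len(idf))]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data (hypothetical): a 3-word vocabulary of 2-D centres and three images.
vocabulary = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
images = {
    "a": [[0.1, 0.2], [0.3, -0.1], [9.8, 10.1]],
    "b": [[0.2, 0.0], [10.2, 9.9], [9.9, 10.0]],
    "c": [[0.1, 9.8], [-0.2, 10.1], [0.0, 10.3]],
}

counts = {name: word_counts(desc, vocabulary) for name, desc in images.items()}
idf = idf_weights(list(counts.values()), len(vocabulary))
vectors = {name: tfidf_vector(c, idf) for name, c in counts.items()}

print(cosine_similarity(vectors["a"], vectors["b"]))
print(cosine_similarity(vectors["a"], vectors["c"]))
```

With real images the descriptors would be high-dimensional and the vocabulary much larger, so the brute-force nearest-centre search above would be the first thing to replace.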

    The second paper then expands on the idea of using text techniques, presenting a more suitable search structure. This allows for faster image retrieval and larger image databases. The authors also propose how to construct an image descriptor based on the underlying search structure.
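As a rough illustration of what a text-style search structure buys you, here is a sketch of an inverted index over visual-word ids. This is the generic structure text search engines use, not the specific structure from the paper; the class name, the histogram-intersection score, and the normalisation are all assumptions chosen to keep the example short. The point is that a query only touches images sharing at least one word with it, rather than comparing against every image.

```python
from collections import Counter, defaultdict


class InvertedIndex:
    """Hypothetical inverted index: visual word id -> {image id: count}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.total_words = {}  # image id -> number of visual words in it

    def add(self, image_id, words):
        """Register an image given the list of visual-word ids found in it."""
        for word in words:
            entry = self.postings[word]
            entry[image_id] = entry.get(image_id, 0) + 1
        self.total_words[image_id] = len(words)

    def query(self, words, top_k=5):
        """Rank indexed images by normalised histogram intersection."""
        scores = defaultdict(float)
        for word, query_count in Counter(words).items():
            # Only images that contain this word are visited.
            for image_id, count in self.postings.get(word, {}).items():
                scores[image_id] += min(query_count, count)
        ranked = [
            (image_id, score / max(len(words), self.total_words[image_id]))
            for image_id, score in scores.items()
        ]
        # Highest score first; ties broken by image id for stable output.
        ranked.sort(key=lambda pair: (-pair[1], pair[0]))
        return ranked[:top_k]


index = InvertedIndex()
index.add("a", [0, 0, 1])
index.add("b", [0, 1, 1])
index.add("c", [2, 2, 2])

print(index.query([0, 0, 1]))
```

In a relational database the postings could plausibly live in a table of (word id, image id, count) rows indexed on the word id, so the same lookup becomes a single indexed query per word.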

    The features used as visual words in both papers should satisfy your invariance constraints, and the second paper should definitely be able to work with your required database size (maybe even the approach from the first paper would work).

    Finally, I recommend looking up newer papers from the same authors (I'm positive Nister did something new, it's just that the approach from the linked paper has been enough for me until now), looking up some of their references, and generally searching for papers on content-based image (indexing and) retrieval (CBIR). It is a very popular subject right now, so there should be plenty.
