Identify images with same content in Java

遥遥无期 2021-02-04 13:14

A while ago, I spent some time searching for ways to determine whether two images are identical in order to answer this question. I now face a slightly different problem: I have

2 Answers
  • 2021-02-04 13:30

    I think the general answer to this question calls for an unsupervised machine learning approach: generate local invariant features (basically, a fancy way of finding hashes that don't change with scaling or rotation) and then run a clustering algorithm on them. Here are some papers that might be relevant:

    • Clustering Near-Duplicate Images in Large Collections
    • A Novel Duplicate Images Detection Method Based on PLSA Model
    • Efficient image duplicate detection based on image analysis - Tons of stuff in here, since it's some dude's entire PhD thesis
  • 2021-02-04 13:35

    Well, I think dHash is what you need for this. You just have to extend dHash to account for rotation, for example by hashing each image in all four 90-degree orientations, which means 2000 images will be treated as 8000 images.
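
    To make the idea concrete, here is a minimal sketch of a difference hash with rotation handling. It is not the code of the library linked below; the names `DHashSketch`, `rotationHashes`, and `minDistance` are made up for illustration, and it assumes only 90-degree rotations need to be covered. It uses AWT's `BufferedImage`, so like the library it won't run on Android.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.awt.image.Raster;

// Illustrative sketch only; class and method names are hypothetical.
final class DHashSketch {

    // 64-bit difference hash: shrink to 9x8 grayscale, then record for each
    // row whether a pixel is brighter than its right-hand neighbour.
    static long dHash(BufferedImage src) {
        BufferedImage small = new BufferedImage(9, 8, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = small.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, 9, 8, null);
        g.dispose();

        Raster raster = small.getRaster();
        long hash = 0L;
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                int left = raster.getSample(x, y, 0);
                int right = raster.getSample(x + 1, y, 0);
                hash = (hash << 1) | (left > right ? 1L : 0L);
            }
        }
        return hash;
    }

    // Rotate 90 degrees clockwise by copying pixels.
    static BufferedImage rotate90(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(h, w, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out.setRGB(h - 1 - y, x, src.getRGB(x, y));
            }
        }
        return out;
    }

    // Hashes of the image at 0, 90, 180 and 270 degrees.
    static long[] rotationHashes(BufferedImage src) {
        long[] hashes = new long[4];
        BufferedImage current = src;
        for (int i = 0; i < 4; i++) {
            hashes[i] = dHash(current);
            current = rotate90(current);
        }
        return hashes;
    }

    // Number of differing bits between two hashes.
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    // Smallest distance between a's upright hash and any rotation of b.
    static int minDistance(BufferedImage a, BufferedImage b) {
        long ha = dHash(a);
        int best = 64;
        for (long hb : rotationHashes(b)) {
            best = Math.min(best, hammingDistance(ha, hb));
        }
        return best;
    }
}
```

    Images would typically be loaded with `ImageIO.read(new File(path))` and treated as duplicates when `minDistance` falls under some cutoff. The cutoff is a tuning choice (something around 10 of the 64 bits is a common starting point, but that is an assumption to validate on your own data).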

    I wrote a pure Java library for exactly this a few days back. You can feed it a directory path (subdirectories included), and it will list the duplicate images you may want to delete, with their absolute paths. Alternatively, you can use it to find all the unique images in a directory.

    It uses the AWT API internally, so it can't be used on Android. Since ImageIO has trouble reading a lot of newer image types, the library uses the TwelveMonkeys jar internally.

    https://github.com/srch07/Duplicate-Image-Finder-API

    A jar with the dependencies bundled can be downloaded from https://github.com/srch07/Duplicate-Image-Finder-API/blob/master/archives/duplicate_image_finder_1.0.jar

    The API can also find duplicates among images of different sizes.
