Image comparison - fast algorithm

谎友^ 2020-11-21 15:02

I'm looking to create a base table of images and then compare any new images against that to determine if the new image is an exact (or close) duplicate of the base.

9 answers
  •  礼貌的吻别
    2020-11-21 15:30

    My company receives about 24 million images from manufacturers every month. I was looking for a fast solution to ensure that the images we upload to our catalog are new images.

    I searched the internet far and wide for an ideal solution, and even developed my own edge detection algorithm.
    I have evaluated the speed and accuracy of multiple models. My images, which have white backgrounds, work extremely well with phashing. Like redcalx said, I recommend phash or ahash. Do NOT use MD5 or any other cryptographic hash unless you want only EXACT image matches: any resizing or manipulation that occurs between images will yield a different hash.
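
    To make the difference concrete, here is a minimal, standard-library-only sketch. It is a toy version of the average-hash idea (one bit per pixel, set when the pixel is brighter than the mean), not the imagehash implementation, and the 8x8 pixel grids and helper names are made up for illustration.

```python
import hashlib


def average_hash_bits(pixels):
    """Toy aHash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of positions where the two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 8x8 grayscale "image": dark left half, bright right half.
original = [[20] * 4 + [230] * 4 for _ in range(8)]
# The same picture after a slight brightness shift, as re-encoding might cause.
tweaked = [[p + 3 for p in row] for row in original]

md5_original = hashlib.md5(bytes(p for row in original for p in row)).hexdigest()
md5_tweaked = hashlib.md5(bytes(p for row in tweaked for p in row)).hexdigest()

print(md5_original == md5_tweaked)  # False: the raw bytes changed
print(hamming(average_hash_bits(original), average_hash_bits(tweaked)))  # 0
```

    The cryptographic digest changes as soon as any byte changes, while the perceptual bits stay the same because the overall light/dark layout is unchanged.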

    For phash/ahash, check this out: imagehash

    I wanted to extend redcalx's post by posting my code and my accuracy.

    What I do:

    from PIL import Image, ImageFilter
    import imagehash
    
    # Load the two images to compare
    img1 = Image.open(r"C:\yourlocation")
    img2 = Image.open(r"C:\yourlocation")
    if img1.width

    Here are some of my results (a lower total means the two images are more similar):

    item1  item2  totalsimilarity
    desk1  desk1       3
    desk1  phone1     22
    chair1 desk1      17
    phone1 chair1     34
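
    A minimal sketch of how scores like these could be turned into a duplicate check against a base table. It is simplified to a single 64-bit hash per image stored as a plain integer. The hash values, the cutoff of 10, and the function names are hypothetical, and a real cutoff would need tuning on your own images.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bits that differ between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def find_close_matches(base_table, new_hash, max_distance):
    """Return (name, distance) pairs within max_distance, closest first."""
    scored = ((name, hamming_distance(h, new_hash)) for name, h in base_table.items())
    return sorted((pair for pair in scored if pair[1] <= max_distance),
                  key=lambda pair: pair[1])


# Hypothetical base table: image name -> 64-bit perceptual hash.
base_table = {
    "desk1": 0xFFFF0000FFFF0000,
    "phone1": 0x0F0F0F0F0F0F0F0F,
}

# Hypothetical hash of an incoming image; differs from desk1 in 2 bits.
new_hash = 0xFFFF0000FFFF0003

print(find_close_matches(base_table, new_hash, max_distance=10))
# [('desk1', 2)]
```

    A linear scan like this is fine for small tables. At millions of images an index over the hashes would be needed, which is outside this sketch.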
    

    Hope this helps!
