Java - Image Recognition

孤独总比滥情好 2021-02-06 14:33

I have about 5000 images with watermarks on them and 5000 otherwise identical images with no watermarks. The file names in the two sets are not correlated with each other in any way.

2 answers
  •  清酒与你
    2021-02-06 15:11

    I think this is more about performance than about the image comparison itself, and the answer is written accordingly. If you need help with the comparison itself, leave me a comment ...

    1. Create a simplified histogram for each image

      Say 8 bins per channel, with each bin value limited to 4 bits. That gives 3*8*4 = 96 bits per image.

    2. Sort the images

      Treat the histogram above as a single number and sort the images of group A by it. It does not matter whether the order is ascending or descending.

    3. Match the images of groups A and B

      Corresponding images should now have similar histograms. So take an image from the unsorted group B (watermarked), binary-search group A (originals) for the closest matches, and then run a more robust comparison against only those selected images instead of all 5000.

    4. Flag each image from group A once it has been matched

      That way you can skip already matched images in step 3 and gain more speed.
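The four steps above could be sketched in Java roughly as follows. This is only a minimal sketch under assumptions, not a tuned implementation: the class and method names (`HistogramMatcher`, `signature`, `sortedOrder`, `candidates`) and the `window` parameter are made up for illustration, and `Arrays.compare` needs Java 9 or later. Each bin is stored as the share of pixels scaled to 0..15, and the 24 values are compared lexicographically, which is the same as comparing them as one 96-bit number.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HistogramMatcher {

    /** Step 1: 8 bins per channel, each scaled to 0..15 (4 bits), 24 values in total. */
    public static int[] signature(BufferedImage img) {
        int w = img.getWidth(), h = img.getHeight();
        int[] bins = new int[24]; // R: 0..7, G: 8..15, B: 16..23
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                bins[((rgb >> 16) & 0xFF) >> 5]++;
                bins[8 + (((rgb >> 8) & 0xFF) >> 5)]++;
                bins[16 + ((rgb & 0xFF) >> 5)]++;
            }
        }
        long pixels = (long) w * h;
        for (int i = 0; i < bins.length; i++) {
            bins[i] = (int) (bins[i] * 15L / pixels); // share of pixels, quantized to 4 bits
        }
        return bins;
    }

    /** Step 2: indices of group A, ordered as if each signature were one 96-bit number. */
    public static Integer[] sortedOrder(int[][] sigsA) {
        Integer[] order = new Integer[sigsA.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (i, j) -> Arrays.compare(sigsA[i], sigsA[j]));
        return order;
    }

    /** Steps 3 and 4: unmatched images of A whose signatures sort next to sigB. */
    public static List<Integer> candidates(int[][] sigsA, Integer[] order,
                                           boolean[] matched, int[] sigB, int window) {
        int lo = 0, hi = order.length;
        while (lo < hi) { // binary search for the first signature >= sigB
            int mid = (lo + hi) >>> 1;
            if (Arrays.compare(sigsA[order[mid]], sigB) < 0) lo = mid + 1; else hi = mid;
        }
        List<Integer> result = new ArrayList<>();
        int from = Math.max(0, lo - window);
        int to = Math.min(order.length - 1, lo + window);
        for (int p = from; p <= to; p++) {
            if (!matched[order[p]]) result.add(order[p]); // step 4: skip matched images
        }
        return result;
    }
}
```

The returned indices are only candidates; each one still has to go through the more robust comparison, and the caller sets `matched[i] = true` once a pair is confirmed. Note that a lexicographic order is dominated by the first bins, so the neighbours in the sorted list are a heuristic and `window` may need to be fairly generous.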

    [Notes]

    There are other ways to improve this, such as using perceptual hash algorithms.
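The answer does not say which perceptual hash it has in mind. As one illustration, here is a minimal sketch of an average hash, one of the simplest perceptual hashes: the image is reduced to an 8x8 grid of mean gray values, and each cell contributes one bit depending on whether it is brighter than the overall mean. The class name `AverageHash` and its methods are hypothetical, and the sketch assumes images of at least 8x8 pixels.

```java
import java.awt.image.BufferedImage;

class AverageHash {

    /** 64-bit average hash: one bit per cell of an 8x8 grid laid over the image. */
    public static long hash(BufferedImage img) {
        int w = img.getWidth(), h = img.getHeight();
        long[] sum = new long[64];
        int[] count = new int[64];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                int cell = (y * 8 / h) * 8 + (x * 8 / w); // grid cell this pixel falls into
                sum[cell] += gray;
                count[cell]++;
            }
        }
        double[] avg = new double[64];
        double mean = 0;
        for (int i = 0; i < 64; i++) {
            avg[i] = count[i] == 0 ? 0 : (double) sum[i] / count[i];
            mean += avg[i] / 64;
        }
        long bits = 0L;
        for (int i = 0; i < 64; i++) {
            if (avg[i] > mean) bits |= 1L << i; // bit set when the cell is brighter than average
        }
        return bits;
    }

    /** Number of differing bits between two hashes (Hamming distance). */
    public static int distance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```

An original and its watermarked copy should then differ in only a few bits, while unrelated images usually differ in many. Where to draw the line between "few" and "many" is an assumption that would have to be tuned on the actual images.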
