I need to compare two images that are screenshots of a software application. I want to check whether the two images are identical, including the numbers and letters displayed in them.
I'm maintaining a Python library called pyssim that uses the Structural Similarity (SSIM) method to compare two images.
It doesn't have Python bindings, but the perceptualdiff program is also very good at comparing two images, and quite fast.
I can't give a ready-to-use answer, but I can point you in what I think is the right direction. A simple way to compare two images is to hash their binary representations and check whether the hashes match. There are two problems with this. First, you need a hash function with a low chance of collisions. Second, an image file usually carries metadata alongside the pixel data, so you have to strip that metadata and compare the pixel data only. Also, I'm not certain, but the binary representation of an image encoded as JPEG is most likely different from that of the same image encoded as PNG, so keep that in mind.
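As a rough sketch of the hashing idea, the function below digests decoded pixel data rather than the file on disk, which sidesteps both the metadata and the JPEG-versus-PNG container issue. The name pixel_digest and the choice of SHA-256 are my own assumptions, not something from the question. The function itself only needs the standard library; with Pillow you would feed it im.mode, im.size and im.tobytes().

```python
import hashlib


def pixel_digest(mode, size, data):
    """Return a SHA-256 hex digest of decoded pixel data (hypothetical helper).

    mode and size are mixed into the hash so that the same bytes laid out
    with different dimensions or channel layouts do not collide.
    """
    h = hashlib.sha256()
    h.update(f"{mode}:{size[0]}x{size[1]}:".encode("ascii"))
    h.update(data)
    return h.hexdigest()


# Assumed usage with Pillow (not run here):
#   from PIL import Image
#   im = Image.open("a.png")
#   digest = pixel_digest(im.mode, im.size, im.tobytes())
```

Equal digests mean the decoded pixels match byte for byte. Keep in mind that a lossy format such as JPEG changes pixel values when saving, so this only makes sense for exact comparisons of losslessly stored screenshots.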
Here are two ways to do a proper comparison.
To get a measure of how similar two images are, you can calculate the root-mean-square (RMS) value of the difference between the images. If the images are exactly identical, this value is zero. The following function uses the difference function, and then calculates the RMS value from the histogram of the resulting image.
# Example: File: imagediff.py
import math
from PIL import ImageChops

def rmsdiff(im1, im2):
    "Calculate the root-mean-square difference between two images"
    h = ImageChops.difference(im1, im2).histogram()
    # calculate rms; the histogram holds 256 bins per band, one band after
    # another, so i % 256 recovers the pixel value for each bin
    return math.sqrt(sum(count * ((i % 256) ** 2) for i, count in enumerate(h))
                     / (float(im1.size[0]) * im1.size[1]))
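To make the arithmetic concrete, here is the same histogram calculation on a hand-built difference histogram, with no imaging library involved. The helper name rms_from_histogram is made up for this illustration. It assumes the histogram layout Pillow uses: 256 bins per band, concatenated.

```python
import math


def rms_from_histogram(hist, num_pixels):
    """RMS of difference values given their histogram (illustrative helper).

    hist[i] is the number of samples whose absolute difference is i % 256;
    the modulo handles multi-band histograms laid out as 256 bins per band.
    """
    total = sum(count * (i % 256) ** 2 for i, count in enumerate(hist))
    return math.sqrt(total / num_pixels)


# A 2x2 single-band difference image: three pixels match (difference 0),
# one pixel is off by 2.
hist = [0] * 256
hist[0] = 3
hist[2] = 1
print(rms_from_histogram(hist, 4))  # sqrt((3*0 + 1*4) / 4) = 1.0
```

A result of 0.0 means every difference value is zero, i.e. the images are identical over the compared area.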
The quickest way to determine if two images have exactly the same contents is to get the difference between the two images, and then calculate the bounding box of the non-zero regions in this image. If the images are identical, all pixels in the difference image are zero, and the bounding box function returns None.
from PIL import ImageChops

def equal(im1, im2):
    # Identical images give an all-zero difference image, whose bounding box is None
    return ImageChops.difference(im1, im2).getbbox() is None
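For screenshots, a slightly more defensive variant may be worth considering. This is only a sketch with a function name of my own choosing (screenshots_equal). It assumes transparency is irrelevant, so both images are converted to RGB before comparing, and it treats images of different sizes as different up front instead of relying on how the difference function handles them.

```python
from PIL import Image, ImageChops


def screenshots_equal(im1, im2):
    """Return True if two images have the same size and identical RGB pixels.

    Assumption: the alpha channel does not matter, so both images are
    converted to RGB first. This also avoids a mode mismatch error.
    """
    if im1.size != im2.size:
        return False
    diff = ImageChops.difference(im1.convert("RGB"), im2.convert("RGB"))
    return diff.getbbox() is None


if __name__ == "__main__":
    a = Image.new("RGB", (4, 4), (10, 20, 30))
    b = a.copy()
    print(screenshots_equal(a, b))
    b.putpixel((1, 1), (10, 20, 31))  # change a single channel by one step
    print(screenshots_equal(a, b))
```

If you also need to know where the images differ, the bounding box returned by getbbox() on the difference image gives the rectangle that encloses all changed pixels.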