A very simple approach is the following:
Convert the image to greyscale in memory, so every pixel is just a single number between 0 (black) and 255 (white).
Scale the image to a fixed size. Finding the right size is important; play around with different sizes. For example, you could scale each image to 64x64 pixels, but you may get better or worse results with smaller or bigger pictures. A minimal sketch of these two steps follows below.
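To make this concrete, here is a minimal sketch of the greyscale-and-scale step using the Pillow library (the function name preprocess and the 64-pixel default are just my own choices, not part of the approach):

    from PIL import Image

    def preprocess(path, size=64):
        """Load an image, convert it to greyscale, scale it to size x size."""
        img = Image.open(path).convert("L")  # "L" mode: one 0..255 value per pixel
        return img.resize((size, size))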
Once you've done this for all images (yes, that will take a while), load two images into memory at a time and subtract them from each other: subtract the value of pixel (0,0) in image B from the value of pixel (0,0) in image A, then do the same for (0,1) in both, and so on. The resulting value might be positive or negative; always store the absolute value (so 5 stays 5, while -8 becomes 8).
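As a sketch, the pixel-wise subtraction could look like this (building on the hypothetical preprocess() above; Pillow's ImageChops.difference would do the same in one call):

    def delta_image(img_a, img_b):
        """Pixel-wise absolute difference of two equally sized greyscale images."""
        pixels_a = img_a.getdata()  # flat sequence of 0..255 greyscale values
        pixels_b = img_b.getdata()
        return [abs(a - b) for a, b in zip(pixels_a, pixels_b)]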
Now you have a third image, the "difference image" (delta image) of image A and B. If they were identical, the delta image is all black (all values subtract to zero). The "less black" it is, the less identical the images are. You need to find a good threshold, because even if the images are in fact identical (to your eyes), scaling, altered brightness and so on mean the delta image will never be totally black; it will just have very dark greytones. So you need a threshold that says "if the average error (delta image brightness) is below a certain value, there is still a good chance they are identical; if it is above that value, they are most likely not." Finding the right threshold is as hard as finding the right scaling size. You will always have false positives (images deemed identical, though they are not at all) and false negatives (images deemed not identical, although they are).
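Expressed in code, the threshold test might look like this; the default of 10 is an arbitrary starting point that you would have to tune as described above:

    def probably_identical(img_a, img_b, threshold=10):
        """Average delta brightness below the threshold -> likely duplicates."""
        delta = delta_image(img_a, img_b)
        mean_error = sum(delta) / len(delta)  # 0 means pixel-for-pixel identical
        return mean_error < threshold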
This algorithm is ultra slow. Just creating the greyscale images takes tons of time, and then you need to compare each greyscale image to every other one: for n images that is n*(n-1)/2 comparisons, again tons of time. Storing all the greyscale images also takes a lot of disk space. So this algorithm is very bad, but the results are better than I had initially expected, considering how simple it is.
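The quadratic cost is easy to see when you write the comparison down as a naive all-pairs loop (again just a sketch, reusing the hypothetical helpers above; caching the thumbnails in a dict trades the disk space mentioned above for memory):

    import itertools

    def find_duplicate_candidates(paths, threshold=10):
        """Naive all-pairs comparison: n*(n-1)/2 calls of probably_identical()."""
        thumbs = {p: preprocess(p) for p in paths}  # cache greyscale thumbnails
        return [(a, b)
                for a, b in itertools.combinations(paths, 2)
                if probably_identical(thumbs[a], thumbs[b], threshold)]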
The only way to get even better results is to use advanced image processing, and here it starts getting really complicated. It involves a lot of math (a whole lot of it); there are good applications (dupe finders) for many systems that have this implemented, so unless you must program it yourself, you are probably better off using one of those solutions. I read a lot of papers on this topic, but I'm afraid most of them go beyond my horizon. Even the algorithms I might be able to implement according to these papers are beyond it; that means I understand what needs to be done, but I have no idea why it works or how it actually works, it's just magic ;-)