I have about 5000 images with water marks on them and 5000 identical images with no watermarks. The file names of each set of images are not correlated to each other in any way.
You can use the OpenCV library. It can be used in Java. Please follow http://docs.opencv.org/doc/tutorials/introduction/desktop_java/java_dev_intro.html
Regarding image compare, you can see another useful answer here: Checking images for similarity with OpenCV