Try Sikuli
Sikuli is a technology born in the MIT User Interface Design Group that automates computer operations using computer vision. It recognizes patterns in screenshots of graphical user interfaces (GUIs), and scripts written in Jython take actions on them. The scripts include graphical elements and are best edited with the IDE that comes with the software. Sikuli can be used for automated software testing: just as Selenium controls a web page, Sikuli can control pretty much any interface it can recognize and then click or type into. That covers desktop applications on Windows, Mac OS X, and Linux, and even iPhone or Android applications running in a simulator or accessed via VNC.
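To make the idea of "find a pattern on the screen, then act on it" concrete, here is a minimal sketch in plain Python. It is not Sikuli's matcher or its API. The names `find_pattern`, `click_point`, `SCREEN`, and `BUTTON` are hypothetical, the "screenshot" is a tiny grid of characters standing in for pixels, and the match is exact, whereas a real vision-based tool has to tolerate small visual differences.

```python
# Toy illustration of image-based GUI automation: locate a small pattern
# inside a larger "screenshot" and compute where a click would land.
# Assumption: both images are lists of equal-length strings, one character per pixel.

def find_pattern(screen, pattern):
    """Return (row, col) of the top-left corner of the first exact match, or None."""
    screen_h, screen_w = len(screen), len(screen[0])
    pat_h, pat_w = len(pattern), len(pattern[0])
    # Slide the pattern over every position where it fits entirely on the screen.
    for top in range(screen_h - pat_h + 1):
        for left in range(screen_w - pat_w + 1):
            if all(
                screen[top + dy][left:left + pat_w] == pattern[dy]
                for dy in range(pat_h)
            ):
                return top, left
    return None


def click_point(match, pattern):
    """Return the (row, col) near the center of a matched region."""
    top, left = match
    return top + len(pattern) // 2, left + len(pattern[0]) // 2


# Hypothetical 4x8 "screenshot" containing a 2x2 "button".
SCREEN = [
    "........",
    "..##....",
    "..#o....",
    "........",
]
BUTTON = ["##", "#o"]

if __name__ == "__main__":
    match = find_pattern(SCREEN, BUTTON)
    if match is None:
        print("pattern not found on screen")
    else:
        print("match at", match, "-> click at", click_point(match, BUTTON))
```

In an actual Sikuli script, a screenshot captured in the IDE would play the role of `BUTTON` and the live display the role of `SCREEN`, with the click delivered to the real mouse instead of being printed.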
You can also use the Sikuli Java API
The Sikuli API for Java provides image-based GUI automation functionality to Java programmers. It was created, and is to be actively maintained, by Sikuli Lab. This new Java library has a redesigned API and includes several functions that were not available in the original Sikuli Script, such as the ability to match colors, handle events, and find geometric patterns such as rectangular buttons. It also has a greatly simplified build process based on Maven.