I know similar questions have already been asked, but I think they are about simulating touch within the asker's own application. I want to make an agent that can "use
Have you tried Selendroid? http://selendroid.io/
I haven't tried it myself; I only know of it because I use Selenium for web applications.
Selenium can simulate a series of input events, targeting either coordinates or DOM elements (divs, buttons, text fields, etc.).
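To make the coordinates-versus-elements distinction a bit more concrete, here is a rough sketch of what a simulated tap looks like as data. It builds the JSON body that the W3C WebDriver "Perform Actions" command expects, which is what current Selenium versions send under the hood. This is an assumption on my part as far as Selendroid goes: it is an older project and may only speak the earlier JSON Wire Protocol, so treat this as an illustration of the idea rather than something Selendroid is known to accept. The function name `tap_actions`, the pointer id `finger1`, and the element id in the usage lines are all made up for the example.

```python
import json

# Key the W3C WebDriver spec uses to mark a JSON object as an element reference.
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def tap_actions(x, y, element_id=None, hold_ms=100):
    """Build a W3C "Perform Actions" payload for a single-finger tap.

    With element_id=None, (x, y) are viewport coordinates.
    Otherwise they are offsets from that element's in-view center point.
    (tap_actions is a hypothetical helper, not part of any library.)
    """
    move = {"type": "pointerMove", "duration": 0, "x": x, "y": y}
    if element_id is not None:
        # Target a DOM element: the driver resolves its position itself.
        move["origin"] = {W3C_ELEMENT_KEY: element_id}
    else:
        # Target raw coordinates in the viewport.
        move["origin"] = "viewport"

    return {
        "actions": [
            {
                "type": "pointer",
                "id": "finger1",  # arbitrary name for this input source
                "parameters": {"pointerType": "touch"},
                "actions": [
                    move,
                    {"type": "pointerDown", "button": 0},
                    {"type": "pause", "duration": hold_ms},
                    {"type": "pointerUp", "button": 0},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Tap at viewport coordinates (120, 340).
    print(json.dumps(tap_actions(120, 340), indent=2))
    # Tap the center of an element; the id is a placeholder, a real one
    # would come from the driver's "find element" response.
    print(json.dumps(tap_actions(0, 0, element_id="hypothetical-element-id"), indent=2))
```

In practice you would not build this by hand; a client library does it for you. The point is only that a "touch" is a small sequence of move, down, pause, and up events aimed either at a point or at an element.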
Your use case should be exactly what Selendroid was made for.