With so many implementations available, what is the fastest executing (least CPU intensive, smallest binary), cross-platform (Linux, Mac, Windows, iPhone) A* implementation for
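I suggest implementing the algorithm yourself. Follow the pseudocode at: A* Search Algorithm and it should be straightforward. The "openset" should be implemented as a min-heap, which is also simple to write; alternatively, you can use std::priority_queue from the STL. Note that std::priority_queue is a max-heap by default, so pass std::greater as the comparator to pop the lowest-cost node first.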
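To make the suggestion concrete, here is a minimal sketch of A* with the open set kept in a std::priority_queue. Everything specific in it is an assumption for illustration and not part of the question: the `Grid` type, the `astar_grid` function name, a rectangular grid with 4-connected unit-cost moves, and the Manhattan-distance heuristic. Because std::priority_queue has no decrease-key operation, the sketch pushes a fresh entry whenever a cheaper route is found and skips outdated entries when they are popped.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical grid type: grid[y][x] == 0 is free, anything else is a wall.
// The grid is assumed to be rectangular.
using Grid = std::vector<std::vector<int>>;

// Returns the number of steps on a shortest 4-connected path from
// (sx, sy) to (gx, gy), or -1 if the goal cannot be reached.
int astar_grid(const Grid& grid, int sx, int sy, int gx, int gy) {
    const int h = static_cast<int>(grid.size());
    assert(h > 0);
    const int w = static_cast<int>(grid[0].size());
    assert(sx >= 0 && sx < w && sy >= 0 && sy < h);
    assert(gx >= 0 && gx < w && gy >= 0 && gy < h);
    if (grid[sy][sx] != 0 || grid[gy][gx] != 0) return -1;

    // Manhattan distance never overestimates the cost of unit-cost 4-connected moves.
    auto heuristic = [gx, gy](int x, int y) {
        return std::abs(x - gx) + std::abs(y - gy);
    };

    const int INF = std::numeric_limits<int>::max();
    std::vector<int> g(static_cast<std::size_t>(w) * h, INF);  // best known cost per cell

    // Entries are (f = g + h, cell index). std::greater turns the default
    // max-heap into a min-heap, so top() is the entry with the lowest f.
    using Entry = std::pair<int, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    const int start = sy * w + sx;
    g[start] = 0;
    open.push({heuristic(sx, sy), start});

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};

    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const int f = top.first;
        const int idx = top.second;
        const int x = idx % w;
        const int y = idx / w;

        // Skip entries made obsolete by a cheaper route found after they were pushed.
        if (f > g[idx] + heuristic(x, y)) continue;
        if (x == gx && y == gy) return g[idx];

        for (int d = 0; d < 4; ++d) {
            const int nx = x + dx[d];
            const int ny = y + dy[d];
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;  // off the grid
            if (grid[ny][nx] != 0) continue;                       // wall
            const int nidx = ny * w + nx;
            const int ng = g[idx] + 1;
            if (ng < g[nidx]) {
                g[nidx] = ng;
                open.push({ng + heuristic(nx, ny), nidx});
            }
        }
    }
    return -1;  // open set exhausted without reaching the goal
}
```

Calling `astar_grid(grid, 0, 0, 0, 2)` on a 3x3 grid whose middle row is walled except for the rightmost cell returns the length of the detour around the wall. For weighted edges or a different graph representation, the same loop applies with the step cost and heuristic swapped out.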