With so many implementations available, what is the fastest executing (least CPU intensive, smallest binary), cross-platform (Linux, Mac, Windows, iPhone) A* implementation for
If your domain is restricted to a grid, you may find better results by searching for "pathfinding" rather than the more generic "A*".
If your domain is not strictly searching paths along a surface, you will likely get more for your effort by improving your heuristics than by trying to optimise the algorithm itself.
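To illustrate how much the heuristic matters, here is a minimal sketch in Python (not any particular library's implementation). It runs the same A* loop twice on an open 10x10 grid, once with a zero heuristic (which degenerates to Dijkstra) and once with Manhattan distance, and counts how many cells each run expands. The grid, the 4-connected unit-cost moves, and the names `astar`, `zero` and `manhattan` are all assumptions made for the example.

```python
import heapq


def astar(grid, start, goal, heuristic):
    """Return (path_cost, cells_expanded) on a 4-connected grid of 0 (free) / 1 (wall).

    path_cost is None if the goal is unreachable. Every move costs 1.
    """
    rows, cols = len(grid), len(grid[0])
    # Heap entries are (f, g, cell); f = g + heuristic estimate.
    open_heap = [(heuristic(start, goal), 0, start)]
    best_g = {start: 0}
    closed = set()
    expanded = 0

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale entry left over from an earlier, worse path
        closed.add(cell)
        expanded += 1
        if cell == goal:
            return g, expanded

        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    f = ng + heuristic((nr, nc), goal)
                    heapq.heappush(open_heap, (f, ng, (nr, nc)))

    return None, expanded


def zero(a, b):
    # No guidance at all: A* behaves like Dijkstra's algorithm.
    return 0


def manhattan(a, b):
    # Admissible for 4-connected grids with unit move cost.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


grid = [[0] * 10 for _ in range(10)]  # open grid, no walls
start, goal = (5, 0), (5, 9)          # same row, opposite edges

cost_dijkstra, n_dijkstra = astar(grid, start, goal, zero)
cost_astar, n_astar = astar(grid, start, goal, manhattan)

print("zero heuristic:     cost", cost_dijkstra, "expanded", n_dijkstra)
print("manhattan distance: cost", cost_astar, "expanded", n_astar)
```

Both runs find the same shortest path cost, but with the Manhattan heuristic the search expands only the cells along the straight row between start and goal, while the zero heuristic floods outward in every direction. The search loop is identical in both cases; only the heuristic changed.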