I want to profile my code on an ARM9. Is there a profiler that can give me function call timings and the total cycles taken by each function? I would prefer a free one.
If you've got some way to interrupt the code, this is free and surprisingly effective.
gprof would be the obvious choice if you are using gcc, and I suppose valgrind should work too. Caveat: I am not familiar with kcachegrind.
I see now that kcachegrind IS using the valgrind framework, so I would imagine you would be able to run it from your development machine.
I don't know any free ARM profilers.
You can try ARM RVDS 4.0 Pro. It has a good profiler, and you can use it with an emulator instead of real hardware. That simplifies some things, but you won't get information about cache misses or memory latency, and the results may differ from tests on real hardware.
RVDS is expensive. You can try the trial for 30 or 45 days; maybe that will be enough to profile everything you want.