So like in Linux on Intel processor, we have a large amount of hardware performance counters to access. Like previously, using a user-space software called perfmon2, I could get
In addition to Benny's answer, I'll add the following:
The linux-tools-perf project is integrated with the AOSP (I'm using 4.2). It is under $AOSP/external/linux-tools-perf. The ./linux-tools-perf/Documentation contains a lot of info on how to run everything. The perf tool is built in $PRODUCT_OUT/system/bin/perf as part of the AOSP but not packaged; you need to explicitly push it to the target.
Check the Android (linux) kernel to confirm it supports PM: PERF_EVENTS. Google how to get .config but easiest is running "extract-ikconfig zImage".
One needs to have root access on the target and then run something like "./perf stat -- testprog1" to see results. There are a lot of arguments to the command to record/retrieve different PM counters.
Remember most ARM implementations have multiple cores and A LOT of pipelining. Cortex-A9 has out-of-order execution. So the numbers can be a little freaky - almost meaningless sometimes.
I use the ARM PMCCNTR register, the ARM PM Cycle Counter, occasionally to test code performance. PMUSERENR.EN and PMINTENCLR.C must be set at PL1+ level and then PMCCNTR can be managed at PL0 level. See the ARM ARM for what all this means and the perf subsystem for example usage!
Note: All the shell vars are set up in $AOSP/build/envsetup.sh