Question
I want to count the clock cycles taken by a specific function in my C code, which will be compiled and run on a BeagleBone Black. I have no idea how to do this. I searched the web and found these instructions:
Clock Read method on Arndale board:
Step-1: Inserting kernel module to enable user space access to PMU counters.
Untar the attached file “arndale_clockread.tar.bz2”, which contains a Makefile and enableccnt.c. In the Makefile, change “KERNELDIR” to your kernel source directory, e.g. /usr/src/linux-kernel-version, then run the command:
linaro@linaro-server:~/enableccnt$ make
The above command should produce enableccnt.ko, which is the kernel module that enables user-space access to the PMU counters. Then run the command:
linaro@linaro-server:~/enableccnt$ sudo insmod enableccnt.ko
The following command should show that the enableccnt module has been inserted into the running kernel:
linaro@linaro-server:~/enableccnt$ lsmod
Step-2: Reading the counter from user-space applications. Once the kernel module is set up, the following function can be used to read the counter:
#include <sys/time.h>

static int enabled;  /* set once the counters have been programmed */

/* result must point to an array of at least 3 elements */
static void readticks(unsigned int *result)
{
    struct timeval t;
    unsigned int cc;

    if (!enabled) {
        // program the performance-counter control-register:
        asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(17));
        // enable all counters.
        asm volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(0x8000000f));
        // clear overflow of counters
        asm volatile("mcr p15, 0, %0, c9, c12, 3" :: "r"(0x8000000f));
        enabled = 1;
    }
    // read the counter value.
    asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(cc));
    gettimeofday(&t, (struct timezone *) 0);
    result[0] = cc;
    result[1] = t.tv_usec;
    result[2] = t.tv_sec;
}
I believe these instructions should work on any ARMv7 platform, so I followed them and changed the kernel source directory. This is what the Makefile looks like:
KERNELDIR := /usr/src/linux-headers-3.8.13-bone70
obj-m := enableccnt.o
CROSS=arm-linux-gnueabihf-
all:
	CC=arm-cortex_a15-linux-gnueabihf-gcc $(MAKE) ARCH=arm -C $(KERNELDIR) M=`pwd` CROSS_COMPILE=$(CROSS) -I/lib/arm-linux-gnueabihf/lib
Now, when I run make, I get this error, which complains about arm-linux-gnueabihf-ar:
CC=arm-cortex_a08-linux-gnueabihf-gcc make ARCH=arm -C /usr/src/linux-headers-3.8.13-bone70 M=`pwd` CROSS_COMPILE=arm-linux-gnueabihf- -I/lib/arm-linux-gnueabihf/
make[1]: Entering directory `/usr/src/linux-headers-3.8.13-bone70'
LD /root/crypto_project/Arndale_enableccnt/built-in.o
/bin/sh: 1: arm-linux-gnueabihf-ar: not found
make[2]: *** [/root/crypto_project/Arndale_enableccnt/built-in.o] Error 127
make[1]: *** [_module_/root/crypto_project/Arndale_enableccnt] Error 2
make[1]: Leaving directory `/usr/src/linux-headers-3.8.13-bone70'
make: *** [all] Error 2
I tried to install arm-linux-gnueabihf-ar, but it didn't work, so I have no clue what to do now!
EDIT1: As mentioned in the comments, I added my toolchain path to the PATH environment variable using:
export PATH=/path/to/mytoolchain/bin:$PATH
Now I no longer get the previous error. However, I get this syntax error, which I think is related to the kernel header files:
CC=arm-cortex_a15-linux-gnueabihf-gcc make ARCH=arm -C /usr/src/linux-headers-3.8.13-bone70 M=`pwd` CROSS_COMPILE=arm-linux-gnueabihf- -I/lib/arm-linux-gnueabihf/bin
/root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-gcc: 1: /root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-gcc: Syntax error: "(" unexpected
make[1]: Entering directory `/usr/src/linux-headers-3.8.13-bone70'
LD /root/crypto_project/Arndale_enableccnt/built-in.o
/root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-ar: 1: /root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-ar: Syntax error: "(" unexpected
make[2]: *** [/root/crypto_project/Arndale_enableccnt/built-in.o] Error 2
make[1]: *** [_module_/root/crypto_project/Arndale_enableccnt] Error 2
make[1]: Leaving directory `/usr/src/linux-headers-3.8.13-bone70'
make: *** [all] Error 2
The only reasonable solution that comes to my mind is to download the kernel source code with its header files and try to run make again. Does anyone have any idea how to resolve this issue?
Answer 1:
As there can be many obstacles along the way, below is a complete guide to building that kernel module and the user-space application.
Toolchain
First of all, you need to download and install 2 toolchains:
- Toolchain for building kernel (and kernel modules): bare-metal (EABI) toolchain
- Toolchain for building user-space application: GNU/Linux toolchain
I recommend using the Linaro ARM toolchains, as they are free, reliable and well optimized for ARM. Here you can choose the desired toolchains (in the "Linaro Toolchain" section). The BeagleBone Black uses a little-endian architecture by default (like most ARMv7 processors), so download the following two archives:
- linaro-toolchain-binaries (little-endian) Bare Metal
- linaro-toolchain-binaries (little-endian) Linux
Once downloaded, extract those archives into the /opt directory.
Kernel sources
First of all, you need to find out exactly which kernel sources were used to build the kernel that was flashed to your board. You can try to figure that out (by your board revision) from here. Or you can build your own kernel and flash it to your board, so that you know exactly which kernel version is in use.
Anyway, you need to download the correct kernel sources (those that correspond to the kernel on your board). These sources will be used later to build the kernel module. If the kernel version is incorrect, you will get a "magic mismatch" error or something similar when loading the module.
I will use the stable kernel sources from kernel.org just for reference (they should be sufficient at least to build the module).
Build kernel
Run the following commands in your terminal to configure the shell environment (bare-metal toolchain) for building the kernel:
$ export PATH=/opt/gcc-linaro-5.1-2015.08-x86_64_arm-eabi/bin:$PATH
$ export CROSS_COMPILE=arm-eabi-
$ export ARCH=arm
Configure the kernel using the defconfig for your board (from arch/arm/configs/). I will use omap2plus_defconfig as an example:
$ make omap2plus_defconfig
Now either build the whole kernel:
$ make -j4
or prepare only the kernel files needed for building an external module:
$ make prepare
$ make modules_prepare
In the second case the module will not have a dependency list, and you will probably need to use the "force" option when loading it. So the preferred option is building the whole kernel.
Kernel module
NOTE: the code I'm gonna use further is from this answer.
First you need to enable the ARM performance counter for user-space access (details are here). This can be done only in kernel space. Here are the module code and the Makefile you can use to do so:
perfcnt_enable.c:
#include <linux/module.h>

static int __init perfcnt_enable_init(void)
{
    /* Enable user-mode access to the performance counter */
    asm ("mcr p15, 0, %0, C9, C14, 0\n\t" :: "r"(1));

    /* Disable counter overflow interrupts (just in case) */
    asm ("mcr p15, 0, %0, C9, C14, 2\n\t" :: "r"(0x8000000f));

    pr_debug("### perfcnt_enable module is loaded\n");
    return 0;
}

static void __exit perfcnt_enable_exit(void)
{
}

module_init(perfcnt_enable_init);
module_exit(perfcnt_enable_exit);

MODULE_AUTHOR("Sam Protsenko");
MODULE_DESCRIPTION("Module for enabling performance counter on ARMv7");
MODULE_LICENSE("GPL");
Makefile:
ifneq ($(KERNELRELEASE),)

# kbuild part of makefile
CFLAGS_perfcnt_enable.o := -DDEBUG
obj-m := perfcnt_enable.o

else

# normal makefile
KDIR ?= /lib/modules/$(shell uname -r)/build

module:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

.PHONY: module clean

endif
Build kernel module
Using the shell environment configured in the previous step, export one more environment variable:
$ export KDIR=/path/to/your/kernel/sources/dir
Now just run:
$ make
The module is now built (the perfcnt_enable.ko file).
User-space application
Once the ARM performance counter is enabled in kernel space (by the kernel module), you can read its value in a user-space application. Here is an example of such an application.
perfcnt_test.c:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static unsigned int get_cyclecount(void)
{
    unsigned int value;

    /* Read CCNT Register */
    asm volatile ("mrc p15, 0, %0, c9, c13, 0\t\n": "=r"(value));
    return value;
}

static void init_perfcounters(int32_t do_reset, int32_t enable_divider)
{
    /* In general enable all counters (including cycle counter) */
    int32_t value = 1;

    /* Perform reset */
    if (do_reset) {
        value |= 2; /* reset all counters to zero */
        value |= 4; /* reset cycle counter to zero */
    }

    if (enable_divider)
        value |= 8; /* enable "by 64" divider for CCNT */

    value |= 16; /* enable export of events (X bit) */

    /* Program the performance-counter control-register */
    asm volatile ("mcr p15, 0, %0, c9, c12, 0\t\n" :: "r"(value));

    /* Enable all counters */
    asm volatile ("mcr p15, 0, %0, c9, c12, 1\t\n" :: "r"(0x8000000f));

    /* Clear overflows */
    asm volatile ("mcr p15, 0, %0, c9, c12, 3\t\n" :: "r"(0x8000000f));
}

int main(void)
{
    unsigned int overhead;
    unsigned int t;

    /* Init counters */
    init_perfcounters(1, 0);

    /* Measure the counting overhead */
    overhead = get_cyclecount();
    overhead = get_cyclecount() - overhead;

    /* Measure ticks for some operation */
    t = get_cyclecount();
    sleep(1);
    t = get_cyclecount() - t;

    /* Both values are unsigned, so print with %u */
    printf("function took exactly %u cycles (including function call)\n",
           t - overhead);

    return EXIT_SUCCESS;
}
Makefile:
CC = gcc
APP = perfcnt_test
SOURCES = perfcnt_test.c
CFLAGS = -Wall -O2 -static

default:
	$(CROSS_COMPILE)$(CC) $(CFLAGS) $(SOURCES) -o $(APP)

clean:
	-rm -f $(APP)

.PHONY: default clean
Notice that I added the -static option just in case you are using Android etc. If your distro has a regular libc, you can probably remove that flag to reduce the size of the resulting binary.
Build user-space application
Prepare the shell environment (Linux toolchain):
$ export PATH=/opt/gcc-linaro-5.1-2015.08-x86_64_arm-linux-gnueabihf/bin:$PATH
$ export CROSS_COMPILE=arm-linux-gnueabihf-
Build the application:
$ make
The output binary is perfcnt_test.
Testing
Upload both the kernel module and the user-space application to your board.
Load the module:
# insmod perfcnt_enable.ko
Run the application:
# ./perfcnt_test
Source: https://stackoverflow.com/questions/34081183/compute-clock-cycle-count-on-arm-cortex-a8-beaglebone-black