Measuring amount of CPU time taken by a piece of code, in C on Unix/Linux


Can clock() be used as a dependable API to measure the CPU time taken to execute a snippet of code? When verified using times() / clock(), both do not seem to measure the CPU

6 Answers

离开以前

2021-01-05 13:46

The OS updates the resource usage of a process/thread only periodically. It is entirely possible for a code snippet to complete before the next update, producing resource-usage differences of zero. I can't say anything about HP or AIX; for Sun, I would refer you to the Solaris Performance and Tools book. For Linux, look at oprofile and the newer perf tool. On the profiling side, valgrind would be of much help.
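To make the granularity point concrete, here is a minimal sketch that reports the length of one accounting tick and measures a snippet repeated many times, so that the total has a chance to span several ticks. The names tick_length_usec, cpu_ticks_for and clear_page are hypothetical and only for illustration; times() and sysconf(_SC_CLK_TCK) are the standard POSIX calls.

```c
#include <string.h>
#include <sys/times.h>
#include <unistd.h>

/* Length of one accounting tick in microseconds, as reported by sysconf().
   Returns -1 if the tick rate is unavailable. */
long tick_length_usec(void)
{
    long hz = sysconf(_SC_CLK_TCK);
    return hz > 0 ? 1000000L / hz : -1;
}

/* CPU ticks (user + system) consumed while calling work() `reps` times.
   Repeating the snippet lets a short piece of code span several ticks. */
long cpu_ticks_for(void (*work)(void), long reps)
{
    struct tms before, after;
    long i;

    times(&before);
    for (i = 0; i < reps; i++)
        work();
    times(&after);
    return (long)((after.tms_utime - before.tms_utime) +
                  (after.tms_stime - before.tms_stime));
}

/* Hypothetical workload, used only for illustration. */
char page[4096];
void clear_page(void)
{
    memset(page, 0, sizeof page);
}
```

Calling cpu_ticks_for(clear_page, 1) will very likely return 0, because a single 4 KB clear finishes well within one tick; dividing the result of a large repetition count by that count gives a more useful per-call estimate.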

我寻月下人不归

2021-01-05 13:50

I would give a try with getrusage and check system and user time.

Also check with gettimeofday to compare with wall clock time.
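A minimal sketch of that comparison, assuming the code under test can be wrapped in a function. The struct timing type and the names measure, tv_seconds and busy_loop are hypothetical; getrusage and gettimeofday are the standard calls.

```c
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Convert a struct timeval to seconds as a double. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Hypothetical result type: user CPU, system CPU and wall-clock seconds. */
struct timing { double user, sys, wall; };

/* Run work() once and report how much CPU and wall-clock time it took. */
struct timing measure(void (*work)(void))
{
    struct rusage r0, r1;
    struct timeval w0, w1;
    struct timing t;

    getrusage(RUSAGE_SELF, &r0);
    gettimeofday(&w0, NULL);
    work();
    gettimeofday(&w1, NULL);
    getrusage(RUSAGE_SELF, &r1);

    t.user = tv_seconds(r1.ru_utime) - tv_seconds(r0.ru_utime);
    t.sys  = tv_seconds(r1.ru_stime) - tv_seconds(r0.ru_stime);
    t.wall = tv_seconds(w1) - tv_seconds(w0);
    return t;
}

/* Hypothetical workload, used only for illustration. */
void busy_loop(void)
{
    volatile unsigned long x = 0;
    unsigned long i;

    for (i = 0; i < 1000000UL; i++)
        x += i;
}
```

If user + sys is close to wall, the snippet was CPU-bound; a wall time much larger than the CPU time suggests the process was waiting or descheduled.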

星月不相逢

2021-01-05 13:56

I would try to correlate the time with the shell's time command, as a sanity check.

You should also consider that the compiler may be optimizing the loop. Since the memset does not depend on the loop variable, the compiler will certainly be tempted to apply an optimization known as loop-invariant code motion, which moves the call out of the loop so that it no longer runs on every iteration.
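One way to guard against that, sketched here under the assumption of a GCC- or Clang-compatible compiler, is to put a compiler barrier after each memset so the optimizer has to treat every clear as observable. The inline-asm syntax is a compiler extension, and the names keep_alive and clear_repeatedly are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Compiler barrier (GCC/Clang extension, an assumption about the toolchain):
   tells the optimizer that memory reachable through p may be read or modified
   here, so earlier stores to it cannot be dropped or merged across this point. */
static inline void keep_alive(void *p)
{
    __asm__ __volatile__("" : : "r"(p) : "memory");
}

/* Clear buf `reps` times; the barrier after each memset keeps the compiler
   from hoisting the call out of the loop or collapsing the iterations. */
void clear_repeatedly(char *buf, size_t len, long reps)
{
    long i;

    for (i = 0; i < reps; i++) {
        memset(buf, 0, len);
        keep_alive(buf);
    }
}
```

The barrier emits no instructions of its own, so it should not distort the timing of the loop body.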

I would also caution that a 10MB clear, possibly in-cache, will really be 1.25 or 2.5 million CPU operations, as memset certainly writes in 4-byte or 8-byte quantities. While I rather doubt that this could be done in less than a microsecond, as stores are a bit expensive and 100K adds some L1 cache pressure, you are talking about not much more than one operation per nanosecond, which is not that hard to sustain for a multi-GHz CPU.

One imagines that 600 ns would round off to 1 clock tick, but I would worry about that as well.

长情又很酷

2021-01-05 14:01

You can use clock(), which returns a clock_t holding the processor time the program has used since it started; divide by CLOCKS_PER_SEC to get seconds.

Or you can use the Linux time command, e.g.: time [program] [arguments]
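A minimal sketch of the clock() approach, assuming the code under test is wrapped in a function. The names cpu_seconds_clock and sample_work are hypothetical.

```c
#include <time.h>

/* CPU seconds used by the process while running work() once, via clock().
   Returns -1.0 if the processor time is not available. */
double cpu_seconds_clock(void (*work)(void))
{
    clock_t start, end;

    start = clock();
    if (start == (clock_t)-1)
        return -1.0;
    work();
    end = clock();
    if (end == (clock_t)-1)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Hypothetical workload, used only for illustration. */
void sample_work(void)
{
    volatile unsigned long x = 0;
    unsigned long i;

    for (i = 0; i < 1000000UL; i++)
        x += i;
}
```

Note that CLOCKS_PER_SEC only fixes the unit of the returned value; the underlying counter may advance in much coarser steps, so a very short snippet can still measure as zero.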

既然无缘

2021-01-05 14:07

On recent Linux systems (*), you can get this information from the /proc filesystem. In the file /proc/PID/stat, the 14th entry has the number of jiffies used in userland code and the 15th entry has the number of jiffies used in system code.

If you want to see the data on a per-thread basis, you should reference the file /proc/PID/task/TID/stat instead.

To convert jiffies to microseconds, you can use the following:
```
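#include <unistd.h>   /* sysconf, _SC_CLK_TCK */

#define USEC_PER_SEC         1000000UL

/* Convert a jiffy (clock tick) count to microseconds; returns -1 on error. */
long long jiffies_to_microsecond(long long jiffies)
{
    long hz = sysconf(_SC_CLK_TCK);   /* ticks per second */
    if (hz <= 0)
    {
        return -1;                    /* tick rate unavailable */
    }
    if (hz <= USEC_PER_SEC && !(USEC_PER_SEC % hz))
    {
        /* one tick is a whole number of microseconds */
        return (USEC_PER_SEC / hz) * jiffies;
    }
    else if (hz > USEC_PER_SEC && !(hz % USEC_PER_SEC))
    {
        /* several ticks per microsecond: divide, rounding up */
        return (jiffies + (hz / USEC_PER_SEC) - 1) / (hz / USEC_PER_SEC);
    }
    else
    {
        /* general case */
        return (jiffies * USEC_PER_SEC) / hz;
    }
}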
```
If all you care about is the per-process statistics, getrusage is easier. But if you want to be prepared to do this on a per-thread basis, this technique is better because, other than the file name, the code would be identical for getting the data per-process or per-thread.

(*) I'm not sure exactly when the stat file was introduced. You will need to verify that your system has it.
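For reading the two fields, a minimal sketch might look like the following. It assumes the stat file has the field layout documented in proc(5) and fits in a 1024-byte line; the function name read_cpu_jiffies is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Read utime (field 14) and stime (field 15) from a stat file such as
   "/proc/self/stat" or "/proc/PID/task/TID/stat". Returns 0 on success,
   -1 if the file cannot be read or parsed. */
int read_cpu_jiffies(const char *path, unsigned long *utime, unsigned long *stime)
{
    char line[1024];
    char *p;
    int n;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Field 2 (the command name) is in parentheses and may itself contain
       spaces or parentheses, so start parsing after the last ')'. */
    p = strrchr(line, ')');
    if (p == NULL)
        return -1;

    /* After ')' come fields 3..13 (one char and ten numbers), which are
       skipped, followed by fields 14 and 15, which are stored. */
    n = sscanf(p + 1,
               " %*c %*d %*d %*d %*d %*d %*u %*lu %*lu %*lu %*lu %lu %lu",
               utime, stime);
    return n == 2 ? 0 : -1;
}
```

Passing "/proc/self/stat" reads the calling process; the returned counts can then be fed to jiffies_to_microsecond.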

遥遥无期

2021-01-05 14:07

There is some information on HP's page about high-resolution timers. The same trick, _Asm_mov_from_ar (_AREG_ITC);, is also used in http://www.fftw.org/cycle.h.

I still have to confirm whether this can really be the solution.

Sample program, as tested on HP-UX 11.31:

bbb@m_001/tmp/prof > ./perf_ticks 1024
ticks-memset {func [1401.000000] inline [30.000000]} noop [9.000000]
bbb@m_001/tmp/prof > cat perf_ticks.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include "cycle.h" /* one from http://www.fftw.org/cycle.h */
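/* Clear the buffer through a function call, to compare with a direct memset. */
void test_ticks(char* sbuf, int* len){
    memset(sbuf,0,*len);
}
int main(int argc,char* argv[]){
        int len;
        char *sbuf;
        ticks t1,t2,t3,t4,t5,t6;
        if (argc<2) { fprintf(stderr,"usage: %s <bytes>\n",argv[0]); return 1; }
        len=atoi(argv[1]);
        sbuf=(char*)malloc(len);
        if (sbuf==NULL) { perror("malloc"); return 1; }
        t1=getticks(); test_ticks(sbuf,&len); t2=getticks(); /* via function */
        t3=getticks(); memset(sbuf,0,len); t4=getticks();    /* direct call */
        t5=getticks(); t6=getticks();                        /* back-to-back reads */
        /* elapsed() returns a double, so %f is the matching conversion */
        printf("ticks-memset {func [%f] inline [%f]} noop [%f]\n",
                          elapsed(t2,t1),elapsed(t4,t3),elapsed(t6,t5));
        free(sbuf); return 0;
}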
bbb@m_001/tmp/prof >
