valgrind, profiling timer expired?

误落风尘 2021-01-11 11:10

I am trying to profile a simple C program using valgrind:

[zsun@nel6005001 ~]$ valgrind --tool=memcheck ./fl.out
==2238== Memcheck, a memory error dete

3 answers
  • 2021-01-11 11:33

    The problem is that you are running valgrind on a program compiled with -pg. You cannot use valgrind and gprof together, so rebuild the program without -pg before running it under valgrind. The valgrind manual suggests using OProfile if you are on Linux and need to profile the actual emulation of the program under valgrind.

  • 2021-01-11 11:37

    You are not going to be able to compute 10000! like that. You will need some sort of bignum implementation for computing factorials. This is because an int is "usually" 4 bytes long, which means it can "usually" hold at most 2^32 - 1 if unsigned (2^31 - 1 if signed), and 13! is more than that. Even if you used an unsigned long ("usually" 8 bytes), you'd overflow by the time you reached 21!.
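
    To make those limits concrete, here is a minimal sketch of a factorial that refuses to overflow. The name factorial_bounded and its limit parameter are hypothetical, introduced only for illustration; passing UINT32_MAX, INT32_MAX or UINT64_MAX as the limit mimics the "usual" widths mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: computes n! into *out as long as every
 * intermediate product stays <= limit.
 * Returns 1 on success, 0 if n! would exceed limit (*out is untouched). */
int factorial_bounded(unsigned n, uint64_t limit, uint64_t *out)
{
    uint64_t acc = 1;
    for (unsigned i = 2; i <= n; i++) {
        /* acc * i <= limit holds exactly when acc <= limit / i,
         * so this check never performs an overflowing multiply. */
        if (acc > limit / i)
            return 0;
        acc *= i;
    }
    *out = acc;
    return 1;
}
```

    With a 32-bit limit this succeeds for 12! and fails for 13!; with a 64-bit limit it succeeds for 20! and fails for 21!. Anything larger, such as 10000!, needs a bignum representation rather than a wider built-in type.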

    As for what "profiling timer expired" means, it means valgrind received the signal SIGPROF (see http://en.wikipedia.org/wiki/SIGPROF), which probably means your program took too long.

  • 2021-01-11 11:49

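    By the way, the posted code isn't computing a factorial: the inner loop multiplies fac by i count times for each i, so it produces (count!)^count instead.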

    If you're really trying to find out where the time goes, you could try stackshots. I put an infinite loop around your code and took 10 of them. Here's the code:

     6: void forloop(void){ 
     7:   int fac=1; 
     8:   int count=5; 
     9:   int i,k; 
    10:
    11:   for (i = 1; i <= count; i++){ 
    12:       for(k=1;k<=count;k++){ 
    13:           fac = fac * i; 
    14:       } 
    15:   } 
    16: } 
    17:
    18: int main(int argc, char* argv[])
    19: {
    20: int i;
    21: for (;;){
    22:     forloop();
    23: }
    24: return 0;
    25: }
    

    And here are the stackshots, re-ordered with the most frequent at the top:

    forloop() line 12
    main() line 23
    
    forloop() line 12 + 21 bytes
    main() line 23
    
    forloop() line 12 + 21 bytes
    main() line 23
    
    forloop() line 12 + 9 bytes
    main() line 23
    
    forloop() line 13 + 7 bytes
    main() line 23
    
    forloop() line 13 + 3 bytes
    main() line 23
    
    forloop() line 6 + 22 bytes
    main() line 23
    
    forloop() line 14
    main() line 23
    
    forloop() line 7
    main() line 23
    
    forloop() line 11 + 9 bytes
    main() line 23
    

    What does this tell you? It says that line 12 consumes about 40% of the time, and line 13 consumes about 20% of the time. It also tells you that line 23 consumes nearly 100% of the time.

    That means unrolling the loop at line 12 could give you a speedup factor of up to roughly 100/(100-40) = 100/60 = 1.67x. Of course there are other ways to speed up this code as well, such as eliminating the inner loop, if you're really trying to compute a factorial.
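
    The arithmetic above can be sketched in a few lines of C. The function names are hypothetical, and the input is simply the top-frame line number of each of the ten stackshots listed earlier (12, 12, 12, 12, 13, 13, 6, 14, 7, 11); the speedup figure is an upper bound that assumes the sampled line's cost is removed entirely.

```c
/* Hypothetical helper: how many of the n samples landed on `line`. */
int count_hits(const int *sample_lines, int n, int line)
{
    int hits = 0;
    for (int i = 0; i < n; i++) {
        if (sample_lines[i] == line)
            hits++;
    }
    return hits;
}

/* Fraction of samples on a line, i.e. the estimated share of run time. */
double sample_fraction(int hits, int total)
{
    return total > 0 ? (double)hits / (double)total : 0.0;
}

/* Best-case speedup if a share `fraction` of the run time is removed:
 * 1 / (1 - fraction). Returns 0.0 for fractions outside [0, 1). */
double max_speedup(double fraction)
{
    if (fraction < 0.0 || fraction >= 1.0)
        return 0.0;
    return 1.0 / (1.0 - fraction);
}
```

    For the samples above, count_hits reports 4 of 10 for line 12, sample_fraction gives 0.4, and max_speedup(0.4) comes out at about 1.67, matching the estimate. With only ten samples these percentages are coarse, so treat them as a rough guide.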

    I'm just pointing this out because it's a bone-simple way to do profiling.
