问题
I have a program written in Ruby and C. The C portion is a shared library, which is an extension for the Ruby program. I want to profile the C shared library I wrote, using gprof. I compile the shared library like this:
gcc -I. -I/usr/lib/ruby/1.8/i486-linux -I/usr/lib/ruby/1.8/i486-linux -I. -D_FILE_OFFSET_BITS=64 -fPIC -fno-strict-aliasing -g -march=i686 -O2 -ggdb -pg -fPIC -c extension.c
gcc -shared -o extension.so extension.o -L. -L/usr/lib -L. -Wl,-Bsymbolic-functions -rdynamic -Wl,-export-dynamic -lruby1.8 -lpthread -lrt -ldl -lcrypt -lm -lc
Then I execute the ruby program, which loads this shared library, and I expect a gmon.out file in the current directory, but for some reason the file gmon.out does not get created. How do I do this?
I googled for this but couldn't find a satisfactory answer (which worked).
P.S. - As a workaround I can have a modified version of the extension which is a pure C program (instead of being created as a shared library) that I can use to profile, but it becomes tedious to maintain two versions of the same C extension (large number of differences between the two).
I tried writing a C program which uses the shared library directly too. I immediately get a page fault in one of the ruby library functions which get called during the initialization of the shared library. I think it's really expecting to be loaded from a ruby program, which may internally be doing some magic.
(gdb) bt
#0 0x0091556e in st_lookup () from /usr/lib/libruby1.8.so.1.8
#1 0x008e87c2 in rb_intern () from /usr/lib/libruby1.8.so.1.8
#2 0x008984a5 in rb_define_module () from /usr/lib/libruby1.8.so.1.8
#3 0x08048dd0 in Init_SimilarStr () at extension.c:542
#4 0x0804933e in main (argc=2, argv=0xbffff454) at extension.c:564
Update: Never mind. I used #ifdef to compile out Ruby portions of the extension and get a profile. Closing.
回答1:
I found oprofile to be a lot better option than gprof for profiling, in this situation. The reports from oprofile are much more comprehensive. I compiled out the ruby portions which were causing the seg-faults (not all of them were), from the C extension using #ifndef PROFILE, and replaced them with non-ruby code. I wrote a main() routine within the extension itself, to call the functions in the extension. Then I set up a makefile to compile the extension as a C program with PROFILE defined. Then I installed oprofile on Ubuntu. Wrote this script.
#!/bin/bash
sudo opcontrol --reset
sudo opcontrol --start
./a.out Rome Damascus NewYork Delhi Bangalore
sudo opcontrol --shutdown
opreport -lt1
Compiled my program, and executed the above script, which gives output like this from the "opreport" command:
...
...
Killing daemon.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:10675 range:0x920000-0x921000) could not be found.
warning: [vdso] (tgid:1270 range:0xba1000-0xba2000) could not be found.
warning: [vdso] (tgid:1675 range:0x973000-0x974000) could not be found.
warning: [vdso] (tgid:1711 range:0x264000-0x265000) could not be found.
warning: [vdso] (tgid:1737 range:0x990000-0x991000) could not be found.
warning: [vdso] (tgid:2477 range:0xa53000-0xa54000) could not be found.
warning: [vdso] (tgid:5658 range:0x7ae000-0x7af000) could not be found.
CPU: Core Solo / Duo, speed 1000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Unhalted clock cycles) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples % app name symbol name
12731 32.8949 a.out levenshtein
11958 30.8976 a.out corpora_pass2
5231 13.5161 no-vmlinux /no-vmlinux
4021 10.3896 a.out corpora_pass1
1733 4.4778 libc-2.10.1.so /lib/tls/i686/cmov/libc-2.10.1.so
542 1.4004 ld-2.10.1.so /lib/ld-2.10.1.so
398 1.0284 a.out method_top_matches
There it is: the top consumer is the function levenshtein(). I followed this by another command to generate disassembled output annotated with source code and execution count/time of each line. This looks like this (counts/times are on the left of each executed line):
> opannotate --source --assembly ./a.out > report.as.handcoded.1
> cat report.as.handcoded.1
...
...
...
: __asm__ (
2 0.0069 : 804918a: mov -0x50(%ebp),%ecx
4 0.0137 : 804918d: mov -0x54(%ebp),%ebx
: 8049190: mov -0x4c(%ebp),%eax
12 0.0412 : 8049193: cmp %eax,%ecx
10 0.0344 : 8049195: cmovbe %ecx,%eax
8 0.0275 : 8049198: cmp %eax,%ebx
11 0.0378 : 804919a: cmovbe %ebx,%eax
16 0.0550 : 804919d: mov %eax,-0x4c(%ebp)
: "cmp %0, %2\n\t"
: "cmovbe %2, %0\n\t"
: : "+r"(a) :
: "%r"(b), "r"(c)
: );
: return a;
...
...
...
回答2:
You can do better than gprof. Consider stackshots. You can do it using pstack, lsstack (if you can get it), or by manually pausing under the debugger. Here's a short intro to the technique.
回答3:
You can run the ruby interpreter itself through the profiler. If that is too much, write a small C program that loads the shared library and calls its exported functions. Then profile that C program. It saves you from maintaining two versions of the library.
来源:https://stackoverflow.com/questions/2047211/profile-a-c-shared-library-called-by-ruby-program