Tracing/profiling instructions

前端 未结 5 1822
渐次进展
渐次进展 2020-12-28 20:46

I\'d like to statistically profile my C code at the instruction level. I need to know how many additions, multiplications, divisions, etc I\'m performing.

This is no

5条回答
  •  隐瞒了意图╮
    2020-12-28 21:15

    QEMU user mode -d in_asm

    This is another simple thing you can do to get an instruction trace:

    sudo apt-get install qemu-user
    qemu-x86_64 -d in_asm main.out
    

    Let's test it with an x86_64 triple hello world:

    main.S

    .text
    .global _start
    _start:
    asm_main_after_prologue:
        mov $3, %rbx
    write:
        mov $1, %rax    /* syscall number */
        mov $1, %rdi    /* stdout */
        mov $msg, %rsi  /* buffer */
        mov $len, %rdx  /* len */
        syscall
        dec %rbx
        jne write
    exit:
        mov $60, %rax   /* syscall number */
        mov $0, %rdi    /* exit status */
        syscall
    msg:
        .ascii "hello\n"
    len = . - msg
    

    adapted from GitHub upstream.

    Assemble and run with:

    as -o main.o main.S 
    ld -o main.out main.o
    ./main.out
    

    Stdout output:

    hello
    hello
    hello
    

    Running it through QEMU outputs the instruction trace to stderr:

    warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
    host mmap_min_addr=0x10000
    Reserved 0x1000 bytes of guest address space
    Relocating guest address space from 0x0000000000400000 to 0x400000
    guest_base  0x0
    start            end              size             prot
    0000000000400000-0000000000401000 0000000000001000 r-x
    0000004000000000-0000004000001000 0000000000001000 ---
    0000004000001000-0000004000801000 0000000000800000 rw-
    start_brk   0x0000000000000000
    end_code    0x00000000004000b8
    start_code  0x0000000000400000
    start_data  0x00000000004000b8
    end_data    0x00000000004000b8
    start_stack 0x00000040007fed70
    brk         0x00000000004000b8
    entry       0x0000000000400078
    ----------------
    IN: 
    0x0000000000400078:  mov    $0x3,%rbx
    0x000000000040007f:  mov    $0x1,%rax
    0x0000000000400086:  mov    $0x1,%rdi
    0x000000000040008d:  mov    $0x4000b2,%rsi
    0x0000000000400094:  mov    $0x6,%rdx
    0x000000000040009b:  syscall 
    
    ----------------
    IN: 
    0x000000000040009d:  dec    %rbx
    0x00000000004000a0:  jne    0x40007f
    
    ----------------
    IN: 
    0x000000000040007f:  mov    $0x1,%rax
    0x0000000000400086:  mov    $0x1,%rdi
    0x000000000040008d:  mov    $0x4000b2,%rsi
    0x0000000000400094:  mov    $0x6,%rdx
    0x000000000040009b:  syscall 
    
    ----------------
    IN: 
    0x00000000004000a2:  mov    $0x3c,%rax
    0x00000000004000a9:  mov    $0x0,%rdi
    0x00000000004000b0:  syscall 
    

    I expect this method to be relatively fast. It works by reading input instructions and producing output instructions that the host can run, much like cachegrind which was mentioned at: https://stackoverflow.com/a/2971979/895245

    One cool thing about this is that you can also trivially trace executables of other architectures, see for example aarch64: How does native android code written for ARM run on x86?

    This method also displays the current symbol of unstripped executables, e.g. the trace of:

    main.c

    #include 
    
    int say_hello() {
        puts("hello");
    }
    
    int main(void) {
        say_hello();
    }
    

    compile and run:

    gcc -ggdb3 -O0 -o main.out main.c
    qemu-x86_64 -d in_asm ./main.out
    

    contains:

    ----------------
    IN: main
    0x0000000000400537:  push   %rbp
    0x0000000000400538:  mov    %rsp,%rbp
    0x000000000040053b:  mov    $0x0,%eax
    0x0000000000400540:  callq  0x400526
    
    ----------------
    IN: say_hello
    0x0000000000400526:  push   %rbp
    0x0000000000400527:  mov    %rsp,%rbp
    0x000000000040052a:  mov    $0x4005d4,%edi
    0x000000000040052f:  callq  0x400400
    
    ----------------
    IN: 
    0x0000000000400400:  jmpq   *0x200c12(%rip)        # 0x601018
    

    It does not however show symbols within shared libraries such as puts.

    But you can see them if you compile with -static:

    ----------------
    IN: main
    0x00000000004009bf:  push   %rbp
    0x00000000004009c0:  mov    %rsp,%rbp
    0x00000000004009c3:  mov    $0x0,%eax
    0x00000000004009c8:  callq  0x4009ae
    
    ----------------
    IN: say_hello
    0x00000000004009ae:  push   %rbp
    0x00000000004009af:  mov    %rsp,%rbp
    0x00000000004009b2:  mov    $0x4a1064,%edi
    0x00000000004009b7:  callq  0x40faa0
    
    ----------------
    IN: puts
    0x000000000040faa0:  push   %r12
    0x000000000040faa2:  push   %rbp
    0x000000000040faa3:  mov    %rdi,%r12
    0x000000000040faa6:  push   %rbx
    0x000000000040faa7:  callq  0x423830
    

    Related: https://unix.stackexchange.com/questions/147343/how-to-determine-what-instructions-a-process-is-executing

    Tested in Ubuntu 16.04, QEMU 2.5.0.

提交回复
热议问题