How to disassemble a binary executable in Linux to get the assembly code?

情歌与酒 2020-11-28 02:57

I was told to use a disassembler. Does gcc have anything built in? What is the easiest way to do this?

9 answers
  • 2020-11-28 03:15

    I don't think gcc has a flag for it, since it's primarily a compiler, but another of the GNU development tools does. objdump takes a -d/--disassemble flag:

    $ objdump -d /path/to/binary
    

    The disassembly looks like this:

    080483b4 <main>:
     80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx
     80483b8:   83 e4 f0                and    $0xfffffff0,%esp
     80483bb:   ff 71 fc                pushl  -0x4(%ecx)
     80483be:   55                      push   %ebp
     80483bf:   89 e5                   mov    %esp,%ebp
     80483c1:   51                      push   %ecx
     80483c2:   b8 00 00 00 00          mov    $0x0,%eax
     80483c7:   59                      pop    %ecx
     80483c8:   5d                      pop    %ebp
     80483c9:   8d 61 fc                lea    -0x4(%ecx),%esp
     80483cc:   c3                      ret    
     80483cd:   90                      nop
     80483ce:   90                      nop
     80483cf:   90                      nop
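
    If you want to post-process a listing like this, a small Python sketch along these lines may help. It is not part of the original answer: parse_objdump_line is an illustrative name, and the regular expression assumes the line layout shown above (hex address, colon, byte pairs, instruction text). Other objdump versions or options may format lines differently.

```python
import re

# One instruction line of `objdump -d` output: address, raw bytes, then text.
# Assumption: bytes are printed as two-digit hex pairs separated by whitespace.
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s)+)\s*(.*)$")


def parse_objdump_line(line):
    """Return (address, raw_bytes, text) for an instruction line, else None."""
    m = INSN_RE.match(line)
    if m is None:
        # Symbol headers, blank lines, and section banners end up here.
        return None
    address = int(m.group(1), 16)
    raw = bytes.fromhex("".join(m.group(2).split()))
    return address, raw, m.group(3).strip()
```

    Each instruction line yields its address, its raw bytes, and the instruction text; symbol headers such as "080483b4 <main>:" return None.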
    
  • 2020-11-28 03:19

    You can come pretty damn close (but no cigar) to generating assembly that will reassemble, if that's what you are intending to do, using this rather crude and tediously long pipeline trick (replace /bin/bash with the file you intend to disassemble and bash.S with what you intend to send the output to):

    objdump --no-show-raw-insn -Matt,att-mnemonic -Dz /bin/bash | grep -v "file format" | grep -v "(bad)" | sed '1,4d' | cut -d' ' -f2- | cut -d '<' -f2 | tr -d '>' | cut -f2- | sed -e "s/of\ section/#Disassembly\ of\ section/" | grep -v "\.\.\." > bash.S
    

    Note how long this is, however. I really wish there was a better way (or, for that matter, a disassembler capable of outputting code that an assembler will recognize), but unfortunately there isn't.

  • 2020-11-28 03:20

    There's also ndisasm, which has some quirks but can be more useful if you use nasm. I agree with Michael Mrozek that objdump is probably best.

    [Later] You might also want to check out Albert van der Horst's ciasdis: http://home.hccnet.nl/a.w.m.van.der.horst/forthassembler.html. It can be hard to understand, but it has some interesting features you won't likely find anywhere else.

  • 2020-11-28 03:28

    An interesting alternative to objdump is gdb. You don't have to run the binary or have debuginfo.

    $ gdb -q ./a.out 
    Reading symbols from ./a.out...(no debugging symbols found)...done.
    (gdb) info functions 
    All defined functions:
    
    Non-debugging symbols:
    0x00000000004003a8  _init
    0x00000000004003e0  __libc_start_main@plt
    0x00000000004003f0  __gmon_start__@plt
    0x0000000000400400  _start
    0x0000000000400430  deregister_tm_clones
    0x0000000000400460  register_tm_clones
    0x00000000004004a0  __do_global_dtors_aux
    0x00000000004004c0  frame_dummy
    0x00000000004004f0  fce
    0x00000000004004fb  main
    0x0000000000400510  __libc_csu_init
    0x0000000000400580  __libc_csu_fini
    0x0000000000400584  _fini
    (gdb) disassemble main
    Dump of assembler code for function main:
       0x00000000004004fb <+0>:     push   %rbp
       0x00000000004004fc <+1>:     mov    %rsp,%rbp
       0x00000000004004ff <+4>:     sub    $0x10,%rsp
       0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
       0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
       0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
       0x000000000040050e <+19>:    leaveq 
       0x000000000040050f <+20>:    retq   
    End of assembler dump.
    (gdb) disassemble fce
    Dump of assembler code for function fce:
       0x00000000004004f0 <+0>:     push   %rbp
       0x00000000004004f1 <+1>:     mov    %rsp,%rbp
       0x00000000004004f4 <+4>:     mov    $0x2a,%eax
       0x00000000004004f9 <+9>:     pop    %rbp
       0x00000000004004fa <+10>:    retq   
    End of assembler dump.
    (gdb)
    

    With full debugging info it's even better.

    (gdb) disassemble /m main
    Dump of assembler code for function main:
    9       {
       0x00000000004004fb <+0>:     push   %rbp
       0x00000000004004fc <+1>:     mov    %rsp,%rbp
       0x00000000004004ff <+4>:     sub    $0x10,%rsp
    
    10        int x = fce ();
       0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
       0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
    
    11        return x;
       0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
    
    12      }
       0x000000000040050e <+19>:    leaveq 
       0x000000000040050f <+20>:    retq   
    
    End of assembler dump.
    (gdb)
    

    objdump has a similar option (-S), which intermixes source lines with the disassembly when debugging info is available.
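
    The interactive gdb session above can also be scripted. The following Python sketch is an illustration only: the function names are made up, and it assumes gdb is on your PATH. It relies on gdb's -batch and -ex options to run the same disassemble command and then exit.

```python
import subprocess


def gdb_disassemble_argv(binary, function):
    """Build a gdb command line that disassembles one function and exits."""
    # -q:     suppress the startup banner.
    # -batch: run the given commands, then exit without an interactive prompt.
    # -ex:    execute one gdb command, here the `disassemble` used above.
    return ["gdb", "-q", "-batch", "-ex", f"disassemble {function}", binary]


def gdb_disassemble(binary, function):
    """Run gdb and return its output as text (assumes gdb is installed)."""
    result = subprocess.run(
        gdb_disassemble_argv(binary, function),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

    For example, print(gdb_disassemble("./a.out", "main")) should print roughly the same dump as the "disassemble main" step in the session above.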

  • 2020-11-28 03:30

    This answer is specific to x86. Portable tools that can disassemble AArch64, MIPS, or whatever machine code include objdump and llvm-objdump.


    Agner Fog's disassembler, objconv, is quite nice. It adds comments to the disassembly output that flag performance problems (for example, the dreaded LCP stall from instructions with 16-bit immediate constants).

    objconv  -fyasm a.out /dev/stdout | less
    

    (It doesn't recognize - as shorthand for stdout, and defaults to outputting to a file of similar name to the input file, with .asm tacked on.)

    It also adds branch targets to the code. Other disassemblers usually disassemble jump instructions with just a numeric destination, and don't put any marker at a branch target to help you find the top of loops and so on.

    It also indicates NOPs more clearly than other disassemblers (making it clear when there's padding, rather than disassembling it as just another instruction.)
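
    If you are stuck with objdump output, the branch-target markers mentioned above can be approximated by post-processing. This Python sketch is only an illustration, not how objconv works: label_branch_targets and the .L_ label prefix are made-up names, and it assumes objdump -d style lines where a direct jump's operand starts with the hex destination.

```python
import re

# Address, raw bytes, mnemonic, operands of one `objdump -d` instruction line.
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+(?:[0-9a-f]{2}\s)+\s*(\S+)\s*(.*)$")
# A direct jump's operand begins with the destination address in hex.
TARGET_RE = re.compile(r"^([0-9a-f]+)\b")


def label_branch_targets(listing):
    """Insert a label line before every instruction that a direct jump targets."""
    lines = listing.splitlines()

    # Pass 1: collect destinations of jmp/jcc instructions.
    # Indirect jumps (operand starts with '*') and calls are ignored.
    targets = set()
    for line in lines:
        m = INSN_RE.match(line)
        if m and m.group(2).startswith("j"):
            t = TARGET_RE.match(m.group(3))
            if t:
                targets.add(int(t.group(1), 16))

    # Pass 2: emit the listing again, with labels at the collected addresses.
    out = []
    for line in lines:
        m = INSN_RE.match(line)
        if m and int(m.group(1), 16) in targets:
            out.append(f".L_{int(m.group(1), 16):x}:")
        out.append(line)
    return "\n".join(out)
```

    Run on a loop body, this puts a label line above the instruction at the top of the loop, which makes backward jumps much easier to spot.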

    It's open source, and easy to compile for Linux. It can disassemble into NASM, YASM, MASM, or GNU (AT&T) syntax.

    Sample output:

    ; Filling space: 0FH
    ; Filler type: Multi-byte NOP
    ;       db 0FH, 1FH, 44H, 00H, 00H, 66H, 2EH, 0FH
    ;       db 1FH, 84H, 00H, 00H, 00H, 00H, 00H
    
    ALIGN   16
    
    foo:    ; Function begin
            cmp     rdi, 1                                  ; 00400620 _ 48: 83. FF, 01
            jbe     ?_026                                   ; 00400624 _ 0F 86, 00000084
            mov     r11d, 1                                 ; 0040062A _ 41: BB, 00000001
    ?_020:  mov     r8, r11                                 ; 00400630 _ 4D: 89. D8
            imul    r8, r11                                 ; 00400633 _ 4D: 0F AF. C3
            add     r8, rdi                                 ; 00400637 _ 49: 01. F8
            cmp     r8, 3                                   ; 0040063A _ 49: 83. F8, 03
            jbe     ?_029                                   ; 0040063E _ 0F 86, 00000097
            mov     esi, 1                                  ; 00400644 _ BE, 00000001
    ; Filling space: 7H
    ; Filler type: Multi-byte NOP
    ;       db 0FH, 1FH, 80H, 00H, 00H, 00H, 00H
    
    ALIGN   8
    ?_021:  add     rsi, rsi                                ; 00400650 _ 48: 01. F6
            mov     rax, rsi                                ; 00400653 _ 48: 89. F0
            imul    rax, rsi                                ; 00400656 _ 48: 0F AF. C6
            shl     rax, 2                                  ; 0040065A _ 48: C1. E0, 02
            cmp     r8, rax                                 ; 0040065E _ 49: 39. C0
            jnc     ?_021                                   ; 00400661 _ 73, ED
            lea     rcx, [rsi+rsi]                          ; 00400663 _ 48: 8D. 0C 36
    ...
    

    Note that this output is ready to be assembled back into an object file, so you can tweak the code at the asm source level, rather than with a hex-editor on the machine code. (So you aren't limited to keeping things the same size.) With no changes, the result should be near-identical. It might not be, though, since disassembly of stuff like

      (from /lib/x86_64-linux-gnu/libc.so.6)
    
    SECTION .plt    align=16 execute                        ; section number 11, code
    
    ?_00001:; Local function
            push    qword [rel ?_37996]                     ; 0001F420 _ FF. 35, 003A4BE2(rel)
            jmp     near [rel ?_37997]                      ; 0001F426 _ FF. 25, 003A4BE4(rel)
    
    ...    
    ALIGN   8
    ?_00002:jmp     near [rel ?_37998]                      ; 0001F430 _ FF. 25, 003A4BE2(rel)
    
    ; Note: Immediate operand could be made smaller by sign extension
            push    11                                      ; 0001F436 _ 68, 0000000B
    ; Note: Immediate operand could be made smaller by sign extension
            jmp     ?_00001                                 ; 0001F43B _ E9, FFFFFFE0
    

    doesn't have anything in the source to make sure it assembles to the longer encoding that leaves room for relocations to rewrite it with a 32-bit offset.
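
    To make the size difference concrete, here is a small Python sketch, added for illustration, that builds both encodings of the push and jmp from the listing above. The function names are made up; the opcodes are the standard x86 ones (6A/68 for push with an 8-bit or 32-bit immediate, EB/E9 for jmp with an 8-bit or 32-bit displacement). An assembler given the plain source text is free to pick the short forms.

```python
import struct


def push_imm(value, force_imm32=False):
    """Encode x86 `push imm`: 6A ib (2 bytes) or 68 id (5 bytes)."""
    if not force_imm32 and -128 <= value <= 127:
        return b"\x6a" + struct.pack("<b", value)
    return b"\x68" + struct.pack("<i", value)


def jmp_direct(src, dst, force_rel32=False):
    """Encode a direct `jmp` located at src that goes to dst.

    The displacement is relative to the end of the jump instruction,
    so it differs between the 2-byte (EB cb) and 5-byte (E9 cd) forms.
    """
    rel8 = dst - (src + 2)
    if not force_rel32 and -128 <= rel8 <= 127:
        return b"\xeb" + struct.pack("<b", rel8)
    return b"\xe9" + struct.pack("<i", dst - (src + 5))
```

    With the addresses from the listing, jmp_direct(0x1F43B, 0x1F420, force_rel32=True) reproduces the original E9 form, while the default call produces a 2-byte jump, which would shift every instruction after it.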


    If you don't want to install objconv, GNU binutils objdump -Mintel -d is very usable, and will already be installed if you have a normal Linux gcc setup.

  • 2020-11-28 03:31

    The ht editor can disassemble binaries in many formats. It is similar to Hiew, but open source.

    To disassemble, open a binary, then press F6 and then select elf/image.
