Why did ld turn my 5 lines of library-less C into a 100MB binary?

Submitted by 两盒软妹~` on 2020-08-05 06:15:35

Question


I'm trying to develop some very low-level x86 code following this document. I wrote the following C program:

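void main()
{
    /* 0xb8000 is the start of VGA text-mode video memory. */
    char* video_memory = (char*) 0xb8000;
    /* Write 'X' into the first character cell of the screen. */
    *video_memory = 'X';
}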

I compile and link it like so:

gcc -m32 -fno-pie -c main.c -o main.o
ld -m elf_i386 -o main.bin -Ttext 513 --oformat binary main.o

This produces a binary called main.bin which is over a hundred megabytes. I disassembled that binary and it's basically my code (ten or so lines), then a hundred meg of zeros, and then some kind of footer.

The extra bytes are all unnecessary, because I used head to snip off the ones that weren't my code and it still ran fine.
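For illustration, the trimming step can be reproduced on a synthetic file. This is only a sketch: the file names (`fake.bin`, `trimmed.bin`), the 26-byte code length and the 4096-byte padding are made-up values, not taken from the real `main.bin`, and `head -c` is assumed to be available (it is in GNU and BSD `head`).

```shell
#!/bin/sh
# Build a stand-in for main.bin: some "code" bytes, a run of zero padding,
# then a short "footer". All sizes are hypothetical, chosen for illustration.
workdir=$(mktemp -d)
code_len=26
pad_len=4096

# code_len non-zero bytes standing in for the compiled function
# (octal 220 is 0x90, the x86 nop opcode).
i=0
while [ "$i" -lt "$code_len" ]; do
    printf '\220'
    i=$((i + 1))
done > "$workdir/fake.bin"

# Zero padding, like the gap seen in the flat binary.
head -c "$pad_len" /dev/zero >> "$workdir/fake.bin"

# A few trailing bytes standing in for the footer.
printf 'GNU' >> "$workdir/fake.bin"

# Keep only the first code_len bytes, as was done with head in the question.
head -c "$code_len" "$workdir/fake.bin" > "$workdir/trimmed.bin"

# $(( ... )) strips any padding that wc may print around the number.
full_size=$(($(wc -c < "$workdir/fake.bin")))
trimmed_size=$(($(wc -c < "$workdir/trimmed.bin")))
echo "full: $full_size bytes, trimmed: $trimmed_size bytes"

rm -rf "$workdir"
```

On the real file, the byte count passed to `head -c` would have to come from the disassembly, at the offset just past the final `ret`.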

I'm using 32-bit flags because my test machine is an old 32-bit laptop, but you can get similar (but less extreme) behavior in 64-bit. This script:

gcc -fno-pie -c main.c -o main.o
ld -o main.bin -Ttext 513 --oformat binary main.o

produces a main.bin of over 4 MB. Again the pattern is the same: my code, a little bit of noise, 4 meg of zeros, and then a footer. Here's the disassembled 4MB file:

       0:   f3 0f 1e fa             endbr64 
       4:   55                      push   %ebp
       5:   48                      dec    %eax
       6:   89 e5                   mov    %esp,%ebp
       8:   48                      dec    %eax
       9:   c7 45 f8 00 80 0b 00    movl   $0xb8000,-0x8(%ebp)
      10:   48                      dec    %eax
      11:   8b 45 f8                mov    -0x8(%ebp),%eax
      14:   c6 00 58                movb   $0x58,(%eax)
      17:   90                      nop
      18:   5d                      pop    %ebp
      19:   c3                      ret    
    ...
     aea:   00 00                   add    %al,(%eax)
     aec:   00 14 00                add    %dl,(%eax,%eax,1)
     aef:   00 00                   add    %al,(%eax)
     af1:   00 00                   add    %al,(%eax)
     af3:   00 00                   add    %al,(%eax)
     af5:   01 7a 52                add    %edi,0x52(%edx)
     af8:   00 01                   add    %al,(%ecx)
     afa:   78 10                   js     0xb0c
     afc:   01 1b                   add    %ebx,(%ebx)
     afe:   0c 07                   or     $0x7,%al
     b00:   08 90 01 00 00 1c       or     %dl,0x1c000001(%eax)
     b06:   00 00                   add    %al,(%eax)
     b08:   00 1c 00                add    %bl,(%eax,%eax,1)
     b0b:   00 00                   add    %al,(%eax)
     b0d:   f3 f4                   repz hlt 
     b0f:   ff                      (bad)  
     b10:   ff 1a                   lcall  *(%edx)
     b12:   00 00                   add    %al,(%eax)
     b14:   00 00                   add    %al,(%eax)
     b16:   45                      inc    %ebp
     b17:   0e                      push   %cs
     b18:   10 86 02 43 0d 06       adc    %al,0x60d4302(%esi)
     b1e:   51                      push   %ecx
     b1f:   0c 07                   or     $0x7,%al
     b21:   08 00                   or     %al,(%eax)
    ...
  3ffaeb:   00 00                   add    %al,(%eax)
  3ffaed:   04 00                   add    $0x0,%al
  3ffaef:   00 00                   add    %al,(%eax)
  3ffaf1:   10 00                   adc    %al,(%eax)
  3ffaf3:   00 00                   add    %al,(%eax)
  3ffaf5:   05 00 00 00 47          add    $0x47000000,%eax
  3ffafa:   4e                      dec    %esi
  3ffafb:   55                      push   %ebp
  3ffafc:   00 02                   add    %al,(%edx)
  3ffafe:   00 00                   add    %al,(%eax)
  3ffb00:   c0 04 00 00             rolb   $0x0,(%eax,%eax,1)
  3ffb04:   00 03                   add    %al,(%ebx)
  3ffb06:   00 00                   add    %al,(%eax)
  3ffb08:   00 00                   add    %al,(%eax)
  3ffb0a:   00 00                   add    %al,(%eax)
    ...

The giant binary file works, but it's ugly and I'd like to understand what's going on.
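One possible starting point for investigating is to check which sections the object file carries besides `.text`, and where the linker places each of them. The sketch below assumes gcc and GNU binutils (`ld`, `objdump`) are installed. It rebuilds the 64-bit variant in a scratch directory, and `main.map` is a made-up file name. `objdump -h` prints the section headers, and `-Map=FILE` asks GNU ld to write a link map. Depending on the toolchain, the listing may include sections such as `.eh_frame` or `.note.gnu.property` in addition to `.text`.

```shell
#!/bin/sh
# Sketch: list the sections in main.o and ask ld for a link map.
# Assumes gcc and GNU binutils are available; if they are not, or a step
# fails, the script records that in $status instead of aborting.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same program as in the question.
cat > main.c <<'EOF'
void main()
{
    char* video_memory = (char*) 0xb8000;
    *video_memory = 'X';
}
EOF

status=skipped
if command -v gcc >/dev/null 2>&1 \
   && command -v objdump >/dev/null 2>&1 \
   && command -v ld >/dev/null 2>&1; then
    if gcc -fno-pie -c main.c -o main.o; then
        # Section headers: name, size and address of every section in main.o.
        objdump -h main.o

        # Same link command as in the question, plus -Map to record where
        # each input section ends up in the output.
        if ld -o main.bin -Ttext 513 --oformat binary -Map=main.map main.o; then
            status=linked
            echo "main.bin size in bytes:"
            wc -c < main.bin
        else
            status=link_failed
        fi
    else
        status=compile_failed
    fi
fi
echo "status: $status"
```

Comparing the addresses in `main.map` with the offsets in the disassembly should show which section each non-zero region of `main.bin` comes from.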

I'm doing the compilation/linking on Ubuntu 20.04 on a 64-bit machine. Tool versions:

gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2) 
GNU ld (GNU Binutils for Ubuntu) 2.34

Source: https://stackoverflow.com/questions/63106614/why-did-ld-turn-my-5-lines-of-library-less-c-into-a-100mb-binary
