How to decrease the size of generated binaries?

谎友^ 2021-02-01 05:53

I know there is an option "-Os" to "optimize for size", but it has little effect, or even increases the size on some occasions :(

strip (or "-s" option) remov

7 Answers
  • 2021-02-01 06:20

    It also depends on the architecture you are using.

    On ARM, the Thumb instruction set exists precisely to reduce generated code size (with GCC, select it with -mthumb).

    You can also avoid dynamic linking and prefer static linking for libs only used by your program or very few programs on your system. This will not decrease the size of your generated binary per se, but overall, you will use less space on your system for this program.

  • 2021-02-01 06:21

    When using strip(1), make sure you use all the relevant options. For some reason, --strip-all doesn't always strip everything. Removing unnecessary sections explicitly (strip accepts -R / --remove-section for this) may help.

    Ultimately, though, the best way to reduce the size of the binary is to remove code and static data from the program. Make it do less, or select programming constructs that result in fewer instructions. For example, you might build data structures at runtime, or load them from a file, on-demand, rather than have a statically initialized array.
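
A minimal sketch of the "build it at runtime" idea, assuming a made-up table of squares (the names `squares`, `build_squares`, and `square_of` are hypothetical, not from the answer). Zero-initialized storage normally lands in .bss, which occupies no space in the file, whereas a fully initialized array is stored byte for byte in the binary image:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example: a table of squares for 0..1023.
// The statically initialized variant would be
//   static const std::uint32_t squares[1024] = {0, 1, 4, 9, /* ... */};
// and all 4096 bytes of it would be stored in the binary.

constexpr std::size_t kTableSize = 1024;

// No initializer: zero-initialized storage, normally placed in .bss.
static std::uint32_t squares[kTableSize];
static bool squares_ready = false;

// Fill the table on first use. Not thread-safe; this is only a sketch.
static void build_squares() {
    for (std::size_t i = 0; i < kTableSize; ++i) {
        squares[i] = static_cast<std::uint32_t>(i * i);
    }
    squares_ready = true;
}

// Returns n*n for n below kTableSize, and 0 for out-of-range input.
std::uint32_t square_of(std::size_t n) {
    if (!squares_ready) {
        build_squares();
    }
    return n < kTableSize ? squares[n] : 0;
}
```

Whether this pays off depends on the table: the code that fills it costs bytes too, so it only helps when that code is smaller than the data it replaces.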

  • 2021-02-01 06:25

    Apart from the obvious (-Os -s), aligning functions to the smallest possible value that will not crash (I don't know ARM alignment requirements) might squeeze out a few bytes per function.
    -Os should already disable aligning functions, but the alignment might still default to a value like 4 or 8. If aligning to, e.g., 1 is possible on ARM (GCC's -falign-functions=n sets the value), that might save some bytes.

    -ffast-math (or the less abrasive -fno-math-errno) will not set errno and avoid some checks, which reduces code size. If, like most people, you don't read errno anyway, that's an option.

    Properly using __restrict (or restrict) and const removes redundant loads, making code both faster and smaller (and more correct). Properly marking pure functions as such lets the compiler eliminate redundant function calls.
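
As an illustration of the restrict/pure point, here is a small sketch. The function names are hypothetical; `__restrict` and `__attribute__((pure))` are GCC/Clang extensions, so this assumes one of those compilers:

```cpp
#include <cstddef>

// __restrict promises that the three pointers never alias each other,
// so *scale does not have to be reloaded after every store to dst.
void scale_into(float* __restrict dst, const float* __restrict src,
                const float* __restrict scale, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] * *scale;
    }
}

// pure: the result depends only on the arguments and on memory that is
// read, and the function has no side effects.
__attribute__((pure)) int table_sum(const int* values, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}

// Two identical calls with no intervening writes: the attribute allows
// the compiler to evaluate table_sum once and reuse the result.
int twice_sum(const int* values, std::size_t n) {
    return table_sum(values, n) + table_sum(values, n);
}
```

The attribute matters most when only a declaration is visible (for example in a header); when the body is in the same translation unit, the optimizer can often work this out on its own.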

    Enabling LTO may help, and if that is not available, compiling all source files into a binary in one go (gcc foo.c bar.c baz.c -o program instead of compiling foo.c, bar.c, and baz.c to object files first and then linking) will have a similar effect. It makes everything visible to the optimizer at one time, possibly allowing it to work better.

    -fdelete-null-pointer-checks may be an option (note that this is normally enabled at any -O level, but not on embedded targets).

    Putting static globals (you hopefully don't have that many, but still) into a struct can eliminate a lot of overhead initializing them. I learned that when writing my first OpenGL loader. Having all the function pointers in a struct and initializing the struct with = {} generates one call to memset, whereas initializing the pointers the "normal way" generates a hundred kilobytes of code just to set each one to zero individually.
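
A rough sketch of the struct approach, with invented names (`GlFunctions`, `reset_loader`, and the three members are hypothetical; a real loader would have hundreds of entries). The point is that the whole set is cleared as one aggregate rather than pointer by pointer:

```cpp
// Generic function-pointer type standing in for the real GL prototypes.
using GlProc = void (*)();

struct GlFunctions {
    GlProc clear;
    GlProc draw_arrays;
    GlProc bind_buffer;
    // ... many more in a real loader
};

// One aggregate, value-initialized: every member starts out as nullptr.
static GlFunctions gl = {};

static void dummy_proc() {}

// Stand-in for the step that would resolve the real entry points.
void install_dummy() {
    gl.clear = &dummy_proc;
    gl.draw_arrays = &dummy_proc;
    gl.bind_buffer = &dummy_proc;
}

// Clearing the struct as a whole lets the compiler emit one block
// clear instead of a separate store per pointer.
void reset_loader() {
    gl = GlFunctions{};
}

bool loader_is_empty() {
    return gl.clear == nullptr && gl.draw_arrays == nullptr &&
           gl.bind_buffer == nullptr;
}
```

What the compiler actually emits for the clear depends on the target and optimization level, so it is worth checking the disassembly for your own build.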

    Avoid static local variables with non-trivial constructors like the devil (POD types are no problem). GCC initializes such static locals in a thread-safe manner, which links in a lot of extra code (even if you don't use threads at all), unless you compile with -fno-threadsafe-statics.
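
To make the difference concrete, a small sketch with hypothetical types (`Config` and `PlainConfig` are invented for illustration). The first function has a static local whose constructor must run at first use; the second uses a plain aggregate that can be initialized at compile time:

```cpp
#include <cstring>

// Non-trivial constructor: it has to run the first time the static
// local below is reached.
struct Config {
    char name[16];
    int level;
    Config() : level(3) { std::strcpy(name, "default"); }
};

// With GCC's default settings, the first-use initialization here is
// wrapped in guard calls (__cxa_guard_acquire / __cxa_guard_release)
// unless the file is compiled with -fno-threadsafe-statics.
const Config& guarded_config() {
    static Config cfg;
    return cfg;
}

// POD alternative: constant-initialized, so no constructor call and
// no guard are needed.
struct PlainConfig {
    const char* name;
    int level;
};

const PlainConfig& plain_config() {
    static const PlainConfig cfg = {"default", 3};
    return cfg;
}
```

Both accessors return the same values; the difference shows up only in the generated code and in what gets pulled in at link time.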

    Using something like libowfat instead of the normal crt can greatly reduce your binary size.

  • 2021-02-01 06:28

    You can also use -nostartfiles and/or -nodefaultlibs, or -nostdlib, which combines both. If you drop the standard start files, you must write your own _start function. See also this thread (archived) on oompf:

    (quoting Perrin)

    # man syscalls
    # cat phat.cc
    extern "C" void _start() {
            asm("int $0x80" :: "a"(1), "b"(42));
    }
    # g++ -fno-exceptions -Os -c phat.cc
    # objdump -d phat.o
    
    phat.o:     file format elf64-x86-64
    
    Disassembly of section .text:
    
    0000000000000000 <_start>:
       0:   53                      push   %rbx
       1:   b8 01 00 00 00          mov    $0x1,%eax
       6:   bb 2a 00 00 00          mov    $0x2a,%ebx
       b:   cd 80                   int    $0x80
       d:   5b                      pop    %rbx
       e:   c3                      retq
    # ld -nostdlib -nostartfiles phat.o -o phat
    # sstrip phat
    # ls -l phat
    -rwxr-xr-x 1 tbp src 294 2007-04-11 22:47 phat
    # ./phat; echo $?
    42
    

    Summary: Above snippet yielded a binary of 294 bytes, each byte 8 bits.

  • 2021-02-01 06:37

    Assuming that another tool is also allowed ;-)

    Then consider UPX, the Ultimate Packer for eXecutables, which compresses the binary on disk and decompresses it at runtime.

    Happy coding.

  • 2021-02-01 06:38

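    You can try playing with -fdata-sections, -ffunction-sections and -Wl,--gc-sections. This is not entirely safe, though: the linker discards every section it sees no reference to, which can remove something that is only reached indirectly, so be sure to understand how these options work before using them.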
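
A sketch of what these options act on, assuming GCC with GNU ld on an ELF target. The file name demo.cc and the function names are hypothetical, and the file is meant to be one translation unit of a larger program that supplies main:

```cpp
// Build sketch:
//   g++ -Os -ffunction-sections -fdata-sections -c demo.cc
//   g++ -Wl,--gc-sections demo.o <other objects> -o demo
// Adding -Wl,--print-gc-sections makes GNU ld list what it discarded.

// Referenced through entry_value(), so its section is kept as long as
// entry_value() itself is reachable.
int used_helper(int x) {
    return x * 2 + 1;
}

// Never called from this file. With -ffunction-sections it gets a
// section of its own, which --gc-sections can drop if nothing else in
// the program references it.
int unused_helper(int x) {
    return x * x * x;
}

int entry_value() {
    return used_helper(20);
}
```

Without -ffunction-sections both helpers usually share one .text section per object file, so the linker has to keep or drop them together.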
