Required alignment of .text versus .data

Submitted by 故事扮演 on 2021-02-08 06:50:21

Question


I've been toying around with the ELFIO library. One of the examples, in particular, allows one to create an ELF file from scratch – defining sections, segments, entry point, and providing binary content to the relevant sections.

I noticed that a program created this way segfaults when the code segment alignment is chosen less than the page size (0x1000):

// Create a loadable segment
segment* text_seg = writer.segments.add();
text_seg->set_type( PT_LOAD );
text_seg->set_virtual_address( 0x08048000 );
text_seg->set_physical_address( 0x08048000 );
text_seg->set_flags( PF_X | PF_R );
text_seg->set_align( 0x1000 ); // can't change this

Note that in the same example the .text section is aligned only to a multiple of 0x10:

section* text_sec = writer.sections.add( ".text" );
text_sec->set_type( SHT_PROGBITS );
text_sec->set_flags( SHF_ALLOC | SHF_EXECINSTR );
text_sec->set_addr_align( 0x10 );

However, the data segment, although loaded separately through the same mechanism, does not have this problem:

segment* data_seg = writer.segments.add();
data_seg->set_type( PT_LOAD );
data_seg->set_virtual_address( 0x08048020 );
data_seg->set_physical_address( 0x08048020 );
data_seg->set_flags( PF_W | PF_R );
data_seg->set_align( 0x10 ); // note here!

Now in this specific case the data fits by design within the page that's already allocated. Not sure if this makes any difference, but I changed its virtual address to 0x8148020 and the result still works fine.

Here's the output of readelf:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000008048000 0x0000000008048000
                 0x000000000000001d 0x000000000000001d  R E    1000
  LOAD           0x0000000000001020 0x0000000008148020 0x0000000008148020
                 0x000000000000000e 0x000000000000000e  RW     10

Why does the program fail to execute when the alignment of the executable segment is not a multiple of 0x1000 but for data 0x10 is no problem?


Update: Somehow, on a second try, text_seg->set_align( 0x100 ); works too, while text_seg->set_align( 0x10 ); fails. The page size is 0x1000 and, interestingly, the working program's VirtAddr is not page-aligned in either segment:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000100 0x08048100 0x08048100 0x0001d 0x0001d R E 0x100
  LOAD           0x000120 0x08148120 0x08148120 0x0000e 0x0000e RW  0x10

The SIGSEGV'ing one:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000080 0x08048100 0x08048100 0x0001d 0x0001d R E 0x10
  LOAD           0x0000a0 0x08148120 0x08148120 0x0000e 0x0000e RW  0x10

Resulting ELFs are here.


Answer 1:


The ELF ABI does not require that VirtAddr or PhysAddr be page-aligned. (I believe) it only requires that

({Virt,Phys}Addr - Offset) % PageSize == 0

That is true for both working binaries, and false for the non-working one.
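As a quick way to check this rule against the readelf output in the question, here is a minimal sketch. The names LoadSegment and is_mappable are made up for this illustration and are not part of ELFIO; the page size of 0x1000 is the one stated in the question.

```cpp
#include <cassert>
#include <cstdint>

// One PT_LOAD entry as printed by readelf (only the fields used here).
struct LoadSegment {
    std::uint64_t offset;  // "Offset" column
    std::uint64_t vaddr;   // "VirtAddr" column
};

// Hypothetical helper: true when Offset and VirtAddr leave the same remainder
// modulo the page size, which is equivalent to (VirtAddr - Offset) % PageSize == 0.
bool is_mappable(const LoadSegment& s, std::uint64_t page_size = 0x1000) {
    // The page size is assumed to be a non-zero power of two.
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (s.vaddr % page_size) == (s.offset % page_size);
}

// The three layouts shown in the question, text segment first, data second.
const LoadSegment first_working[]  = {{0x1000, 0x08048000}, {0x1020, 0x08148020}};
const LoadSegment second_working[] = {{0x100, 0x08048100}, {0x120, 0x08148120}};
const LoadSegment failing[]        = {{0x80, 0x08048100}, {0xa0, 0x08148120}};
```

Calling is_mappable on each entry gives true for all four segments of the two working binaries and false for both segments of the crashing one, where the remainders are 0x100 versus 0x80 and 0x120 versus 0xa0.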

Update:

Regarding the comment "I don't see how this fails for the latter":

We have: VirtAddr == 0x08048100 and Offset == 0x80 (and PageSize == 4096 == 0x1000).

(0x08048100 - 0x80) % 0x1000 == 0x80 != 0

Regarding "has to agree when align == 0x10, doesn't it?":

No: it has to agree with the page size (as I said earlier), or the kernel will not be able to mmap the segment.
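To illustrate the mmap constraint, here is a minimal sketch, assuming a Linux (or other POSIX) system with a writable temporary directory. The helper name try_mmap_at is made up for this illustration. It maps one page of a scratch file at a given file offset and reports the errno from mmap. My understanding (an assumption, not checked against any particular kernel version) is that the ELF loader rounds the segment address down to a page boundary and moves the file offset back by the same amount, so Offset 0x80 with VirtAddr 0x08048100 cannot produce a page-aligned file offset.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: map one page of a scratch file starting at file_offset.
// Returns 0 on success, the errno value reported by mmap on failure,
// or -1 if the scratch file could not be set up.
int try_mmap_at(long file_offset) {
    const long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    std::FILE* fp = std::tmpfile();  // anonymous scratch file, removed on close
    if (fp == nullptr) return -1;
    const int fd = fileno(fp);
    if (ftruncate(fd, 4 * page) != 0) {  // give the file a few pages of backing
        std::fclose(fp);
        return -1;
    }

    errno = 0;
    void* p = mmap(nullptr, static_cast<std::size_t>(page), PROT_READ,
                   MAP_PRIVATE, fd, static_cast<off_t>(file_offset));
    const int err = (p == MAP_FAILED) ? errno : 0;
    if (p != MAP_FAILED) munmap(p, static_cast<std::size_t>(page));
    std::fclose(fp);
    return err;
}
```

On Linux, try_mmap_at(0) succeeds, while try_mmap_at(0x80) fails with EINVAL because 0x80 is not a multiple of the page size.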




Answer 2:


(sorry, more an extended comment than an answer)

There are some specifications about what an ELF executable should be. Read in particular elf(5) and most importantly the relevant ABI specification (see also this question), e.g. on https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI

AFAIU, these specifications require both code and data segments to be page-aligned, but you need to check that, notably in chapter 5 (program loading and dynamic linking) of the ABI spec.

Current tools that generate ELF executables (notably binutils) work hard to respect these specifications. If you write an ELF generator, you should also try hard to respect them (so testing that the generated ELF apparently works is not enough).
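For a hand-written generator, one way to stay on the safe side is to pick each segment's file offset so that it is congruent to its virtual address modulo the page size. The following is only a sketch under that assumption: next_congruent_offset is a hypothetical helper, not an ELFIO call, and the default page size of 0x1000 is the one from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest file offset >= min_offset that leaves the same
// remainder as vaddr modulo page_size.
std::uint64_t next_congruent_offset(std::uint64_t min_offset,
                                    std::uint64_t vaddr,
                                    std::uint64_t page_size = 0x1000) {
    // The page size is assumed to be a non-zero power of two.
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    const std::uint64_t in_page = vaddr & (page_size - 1);  // vaddr % page_size
    // Start of the page containing min_offset, plus the required remainder.
    std::uint64_t candidate = (min_offset & ~(page_size - 1)) + in_page;
    // If that lands before min_offset, move one page forward.
    if (candidate < min_offset) candidate += page_size;
    return candidate;
}
```

With the addresses from the question, next_congruent_offset(0x80, 0x08048100) gives 0x100 and next_congruent_offset(0x11d, 0x08148120) gives 0x120, which are the offsets seen in the working layout, whereas the crashing layout uses 0x80 and 0xa0.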

The kernel implements execve(2), and dynamic loading is also done by ld-linux(8) using mmap(2). For some reason (probably performance), the kernel does not check that an executable obeys all the specifications.

(of course, kernel folks want commonly produced ELF executables to be successfully execve-d)

In some corner cases (like the one you observe), the kernel does not fail execve and still does something with ill-constructed ELF files.

But IMHO there is no guarantee about that. Future kernels, and future x86-64 processors, might fail on such ill-constructed ELF files.

My feeling is that you are in a grey area, a kind of "undefined behavior" of execve. If it happens to work, it is by bad luck.

You asked: "Why does the program fail to execute when the alignment of the executable segment is not a multiple of 0x1000 but for data 0x10 is no problem?"

To understand precisely why, you need to dive into the source code (related to execve) of your particular kernel. And I believe this behavior might change in future versions of the kernel.

The kernel community more or less promises backward compatibility, but only with respect to the specification. It could happen that some ill-formed ELF executable can be execve-d by Linux 3.10 but not by Linux 4.13 or some future Linux 5.

(I did read that some past kernels were able to execve some ill-formed ELF executables, but I forgot the details; perhaps it was something related to the 16-byte alignment of stack pointers.)



Source: https://stackoverflow.com/questions/46117065/required-alignment-of-text-versus-data
