GCC - How to realign stack?

广开言路 2021-02-09 05:45

I am trying to build an application that uses pthreads and the __m128 SSE type. According to the GCC manual, the default stack alignment is 16 bytes. In order to use __m128, the requirement is t

5 Answers
  • 2021-02-09 06:20

    This shouldn't be happening in the first place, but to work around the problem you can try:

    void *f(void *x)
    {
       __m128 y __attribute__ ((aligned (16)));
       ...
    }
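
A quick way to tell whether a workaround like this took effect is to check the variable's address at runtime. The sketch below is illustrative only: `is_aligned_16` and `local_is_aligned` are hypothetical helper names, and a `float[4]` stands in for `__m128` so that the snippet also compiles on non-x86 targets.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of 16, else 0. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical demo: declares a local with the aligned attribute
   (GCC/Clang syntax) and reports whether it really landed on a
   16-byte boundary. float[4] is a stand-in for __m128. */
static int local_is_aligned(void)
{
    float y[4] __attribute__((aligned(16))) = {0};
    return is_aligned_16(y);
}
```

Calling `local_is_aligned()` from inside the thread function, rather than from `main`, is what would exercise the misaligned-thread-stack case described in the question.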
    
  • 2021-02-09 06:21

    Allocate on the stack an array that is 15 bytes larger than sizeof(__m128), and use the first aligned address in that array. If you need several, allocate them in one array with a single 15-byte margin for alignment.

    I do not remember whether allocating an unsigned char array makes you safe from strict-aliasing optimizations by the compiler, or whether it only works the other way round.

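    #include <stdint.h>
    #include <xmmintrin.h>
    
    void *f(void *x)
    {
       /* 15 spare bytes guarantee that a 16-byte boundary falls inside y */
       unsigned char y[sizeof(__m128)+15];
       /* round the start address up to the next multiple of 16 */
       __m128 *py = (__m128*) (((uintptr_t)y + 15) & ~(uintptr_t)15);
       ...
    }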
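
The round-up arithmetic can be checked in isolation on plain integers. This is a minimal sketch; `align_up_16` and `first_aligned_offset` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of 16 (unchanged if already aligned). */
static uintptr_t align_up_16(uintptr_t addr)
{
    return (addr + 15u) & ~(uintptr_t)15;
}

/* Offset of the first 16-byte boundary at or after buf; always 0..15,
   which is why a 15-byte margin is enough. */
static size_t first_aligned_offset(const unsigned char *buf)
{
    uintptr_t start = (uintptr_t)buf;
    return (size_t)(align_up_16(start) - start);
}
```

For example, an array starting at 0x1009 would be used from offset 7, i.e. address 0x1010.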
    
  • 2021-02-09 06:26

    Sorry to resurrect an old thread...

    For those with a newer compiler than the OP's: the OP mentions the -mstackrealign option, which led me to __attribute__((force_align_arg_pointer)). If your function is being optimized to use SSE but the stack pointer is misaligned on entry, this transparently applies the runtime fix-up where required. I also found out that this is only an issue on i386; the x86_64 ABI guarantees that the end of the argument area, and hence the stack at each call, is aligned to 16 bytes.

    __attribute__((force_align_arg_pointer)) void i_crash_when_not_aligned_to_16_bytes() { ... }
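
Applied to a thread entry point, this might look like the sketch below. The names `REALIGN_STACK` and `worker` are hypothetical, the macro guard is an assumption added so the snippet still compiles on non-x86 targets, and a `float[4]` stands in for `__m128`.

```c
#include <stddef.h>
#include <stdint.h>

/* The attribute is x86-specific, so expand to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical entry point with the pthread start-routine signature.
   arg must point to an int, which receives 1 if the aligned local
   really sits on a 16-byte boundary and 0 otherwise. */
REALIGN_STACK static void *worker(void *arg)
{
    float v[4] __attribute__((aligned(16))) = {1.0f, 2.0f, 3.0f, 4.0f};
    int *aligned_out = arg;
    *aligned_out = (((uintptr_t)v & 15u) == 0);
    return NULL;
}
```

It would be started in the usual way, for example `pthread_create(&tid, NULL, worker, &flag)`, with `tid` and `flag` declared by the caller.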

    Cool article for those who might want to learn more: http://wiki.osdev.org/System_V_ABI

  • 2021-02-09 06:34

    Another solution would be to use a padding function, which first aligns the stack and then calls f. So instead of calling f directly, you call pad, which pads the stack first and then calls f with an aligned stack.

    The code would look like this:

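    #include <stddef.h>
    #include <stdint.h>
    #include <xmmintrin.h>
    #include <pthread.h>
    
    #define ALIGNMENT 16
    
    void *f(void *x) {
        __m128 y;
        // other stuff
        return NULL;
    }
    
    void * pad(void *val) {
        unsigned int x; // to get the current address from the stack
        // uintptr_t rather than unsigned int: a pointer may be wider than an int
        unsigned char padding[ALIGNMENT - ((uintptr_t) &x) % ALIGNMENT];
        (void) padding; // unused on purpose; the compiler may still round or drop it
        return f(val);
    }
    
    int main(void){
        pthread_t p;
        pthread_create(&p, NULL, pad, NULL);
        pthread_join(p, NULL); // wait for the thread before main returns
        return 0;
    }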
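
The size expression used for the padding array is worth checking by hand, since it never evaluates to zero: an address that is already aligned gets a full 16 bytes of padding, which also keeps the variable-length array legal. A minimal sketch on plain integers, with `pad_size` as a hypothetical helper name:

```c
#include <stdint.h>

#define ALIGNMENT 16

/* Number of padding bytes for a local at address addr.
   The result is in the range 1..16, never 0. */
static unsigned pad_size(uintptr_t addr)
{
    return (unsigned)(ALIGNMENT - addr % ALIGNMENT);
}
```

For example, an address ending in 0x4 yields 12 bytes of padding, and one ending in 0xf yields 1.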
    
  • 2021-02-09 06:44

    I have solved this problem. Here is my solution:

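    void another_function(){
       __m128 y;
       ...
    }
    void *f(void *x){
    /* i386 only. A single asm statement, so that the compiler cannot
       schedule its own code between the steps. */
    asm volatile(
        "movl    %%esp,%%esi\n\t"      /* save the stack pointer in a callee-saved register */
        "andl    $-0x10,%%esp\n\t"     /* clear the low nibble: round %esp down to 16 bytes */
        "call    another_function\n\t"
        "movl    %%esi,%%esp"          /* restore the original stack pointer */
        : : : "esi", "eax", "ecx", "edx", "memory", "cc");
    return NULL;
    }
    

    First, we save the stack pointer in %esi, a register that the called function must preserve. Second, we make the least-significant nibble of %esp equal to 0x0, which rounds it down to a 16-byte boundary. We then call another function, which has all of its own local variables 16-byte aligned, and restore the saved stack pointer once it returns. All nested functions will also have their local variables 16-byte aligned.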

    And it works!
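
The `andl` step relies on -16 (that is, -0x10) having its low four bits clear in two's complement, so the AND rounds an address down to a multiple of 16. Rounding down is the safe direction here because the x86 stack grows toward lower addresses. A minimal sketch of the same arithmetic on plain integers, with `align_down_16` as a hypothetical name:

```c
#include <stdint.h>

/* Round sp down to a multiple of 16, mirroring `andl $-0x10, %esp`. */
static uintptr_t align_down_16(uintptr_t sp)
{
    return sp & (uintptr_t)-16;
}
```

For example, 0x100c becomes 0x1000, while an already aligned 0x1000 is left unchanged.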
