gcc memory alignment pragma

柔情痞子 提交于 2019-11-30 03:41:26

问题


Does gcc have memory alignment pragma, akin #pragma vector aligned in Intel compiler? I would like to tell compiler to optimize particular loop using aligned loads/store instructions. to avoid possible confusion, this is not about struct packing.

e.g:

#if defined (__INTEL_COMPILER)
#pragma vector aligned
#endif
        for (int a = 0; a < int(N); ++a) {
            q10 += Ix(a,0,0)*Iy(a,1,1)*Iz(a,0,0);
            q11 += Ix(a,0,0)*Iy(a,0,1)*Iz(a,1,0);
            q12 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,0,1);
            q13 += Ix(a,1,0)*Iy(a,0,0)*Iz(a,0,1);
            q14 += Ix(a,0,0)*Iy(a,1,0)*Iz(a,0,1);
            q15 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,1,1);
        }

Thanks


回答1:


You can tell GCC that a pointer points to aligned memory by using a typedef to create an over-aligned type that you can declare pointers to.

This helps gcc but not clang7.0 or ICC19, see the x86-64 non-AVX asm they emit on Godbolt. (Only GCC folds a load into a memory operand for mulps, instead of using a separate movups). You have have to use __builtin_assume_aligned if you want to portably convey an alignment promise to GNU C compilers other than GCC itself.


From http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html

typedef double aligned_double __attribute__((aligned (16)));
// Note: sizeof(aligned_double) is 8, not 16
void some_function(aligned_double *x, aligned_double *y, int n)
{
    for (int i = 0; i < n; ++i) {
        // math!
    }
}

This won't make aligned_double 16 bytes wide. This will just make it aligned to a 16-byte boundary, or rather the first one in an array will be. Looking at the disassembly on my computer, as soon as I use the alignment directive, I start to see a LOT of vector ops. I am using a Power architecture computer at the moment so it's altivec code, but I think this does what you want.

(Note: I wasn't using double when I tested this, because there altivec doesn't support double floats.)

You can see some other examples of autovectorization using the type attributes here: http://gcc.gnu.org/projects/tree-ssa/vectorization.html




回答2:


I tried your solution with g++ version 4.5.2 (both Ubuntu and Windows) and it did not vectorize the loop.

If the alignment attribute is removed then it vectorizes the loop, using unaligned loads.

If the function is inlined so that the array can be accessed directly with the pointer eliminated, then it is vectorized with aligned loads.

In both cases, the alignment attribute prevents vectorization. This is ironic: The "aligned_double *x" was supposed to enable vectorization but it does the opposite.

Which compiler was it that reported vectorized loops for you? I suspect it was not a gcc compiler?




回答3:


Does gcc have memory alignment pragma, akin #pragma vector aligned

It looks like newer versions of GCC have __builtin_assume_aligned:

Built-in Function: void * __builtin_assume_aligned (const void *exp, size_t align, ...)

This function returns its first argument, and allows the compiler to assume that the returned pointer is at least align bytes aligned. This built-in can have either two or three arguments, if it has three, the third argument should have integer type, and if it is nonzero means misalignment offset. For example:

void *x = __builtin_assume_aligned (arg, 16);

means that the compiler can assume x, set to arg, is at least 16-byte aligned, while:

void *x = __builtin_assume_aligned (arg, 32, 8);

means that the compiler can assume for x, set to arg, that (char *) x - 8 is 32-byte aligned.

Based on some other questions and answers on Stack Overflow circa 2010, it appears the built-in was not available in GCC 3 and early GCC 4. But I do not know where the cut-off point is.



来源:https://stackoverflow.com/questions/2687210/gcc-memory-alignment-pragma

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!