Question
Hello, I am currently working on a program that needs to process a data blob containing a series of floats that may be unaligned (and sometimes are). I am compiling with gcc 4.6.2 for an ARM Cortex-A8. I have a question about the generated assembly code:
As an example, I wrote a minimal test case. For the following test code
float aligned[2];
float *unaligned = (float*)(((char*)aligned)+2);
int main(int argc, char **argv)
{
    float f = unaligned[0];
    return (int)f;
}
the compiler (gcc 4.6.2 - with optimization -O3) produces
00008634 <main>:
8634: e30038ec movw r3, #2284 ; 0x8ec
8638: e3403001 movt r3, #1
863c: e5933000 ldr r3, [r3]
8640: edd37a00 vldr s15, [r3]
8644: eefd7ae7 vcvt.s32.f32 s15, s15
8648: ee170a90 vmov r0, s15
864c: e12fff1e bx lr
The compiler cannot know here whether the data is aligned, but it nevertheless uses VLDR, which needs aligned data; otherwise the program crashes with a bus error.
Now here is my actual question: is this correct behavior on the compiler's part, meaning I need to take care of alignment in my C++ code, or is this a bug in the compiler?
I should also mention my current workaround, which works and makes gcc copy the value before accessing it. The trick is to define a struct that contains only a float, mark it with the gcc packed attribute, and access the data through a pointer to that struct. Code snippet:
struct FloatWrapper { float f; } __attribute__((packed));
const FloatWrapper *x = reinterpret_cast<const FloatWrapper *>(rawX.data());
const FloatWrapper *y = reinterpret_cast<const FloatWrapper *>(rawY.data());
for (size_t i = 0; i < vertexCount; ++i) {
    vertices[i].x = x[i].f;
    vertices[i].y = y[i].f;
}
Answer 1:
As you have pointed out, ARM ARM A3.2.1 states that VLDR generates an Alignment fault regardless of the SCTLR.A value.
I've tested your example on a Cortex-A9 and got
# float_align
[1] + Stopped (signal) float_align
However, I'm also confused by the ARM Cortex-A8 TRM 4.2.1, which states:
If an alignment qualifier is not specified, and A=1, the alignment fault is taken if it is not aligned to element size.
If an alignment qualifier is not specified, and A=0, it is treated as unaligned access.
This is probably a half-baked explanation, since the ARM ARM gives more information, including a detailed table of instructions.
So I think the answer is that you need to take care of alignment yourself, since the compiler cannot determine in every scenario which addresses you are loading from; for example, an address might only be known after linking.
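One common way to take care of alignment yourself is to copy the bytes into a properly aligned float with memcpy, which places no alignment requirement on its source. The following is only a minimal sketch, not code from the original post: the helper names load_float_unaligned and is_float_aligned are hypothetical, and which instructions the compiler actually emits for the copy depends on the compiler version and target options.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): read one float from an
// address that may not be aligned for float.
inline float load_float_unaligned(const void *src)
{
    float value;
    // memcpy works on bytes, so src may have any alignment.
    std::memcpy(&value, src, sizeof value);
    return value;
}

// Hypothetical helper: report whether an address is suitably aligned for a
// direct float access.
inline bool is_float_aligned(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(float) == 0;
}
```

With helpers like these, the loop from the question could presumably read each value as load_float_unaligned(rawX.data() + i * sizeof(float)) instead of going through a packed struct pointer, assuming rawX.data() yields a byte pointer.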
Source: https://stackoverflow.com/questions/17184731/gcc-generated-assembly-for-unaligned-float-access-on-arm