Speed of accessing local vs. global variables in gcc/g++ at different optimization levels

小蘑菇 2020-12-30 03:48

I found that different compiler optimization levels in gcc give quite different results when accessing a local or a global variable in a loop. The reason this surprised me i

2 answers
  • 2020-12-30 04:27

    A local variable tmp whose address is not taken cannot be pointed to by the pointer p, and the compiler can optimize accordingly. It is much more difficult to infer that a global variable global is not pointed to, unless it's static, because the address of that global variable could be taken in another compilation unit and passed around.

    If reading the assembly indicates that the compiler forces itself to load from memory more often than you would expect, and you know that the aliasing it worries about cannot exist in practice, you can help it by copying the global variable into a local variable at the top of the function and using only the local in the rest of the function.
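
    A minimal sketch of that workaround follows. The question's original code is not reproduced here, so the function names and the loop body are hypothetical; only the names global, tmp and p come from the discussion above. The second function is equivalent to the first only under the stated assumption that p never points into global.

    ```c
    #include <stddef.h>

    /* Hypothetical global with external linkage: another compilation unit
       could take its address, so the compiler must assume p may alias it. */
    int global = 0;

    /* Straightforward version: each store through p might modify `global`,
       so the compiler generally has to reload and store it on every iteration. */
    void fill_using_global(char *p, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            p[i] = (char)global;
            global++;
        }
    }

    /* Workaround: copy the global into a local whose address is never taken.
       `tmp` cannot be aliased by p, so it can stay in a register.
       Assumption: p does not point into `global`. */
    void fill_using_local_copy(char *p, size_t n)
    {
        int tmp = global;               /* read the global once */
        for (size_t i = 0; i < n; i++) {
            p[i] = (char)tmp;
            tmp++;
        }
        global = tmp;                   /* write the result back once */
    }
    ```

    Comparing the output of gcc -O2 -S for the two functions should show whether the loads and stores of global actually leave the loop on your compiler version.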

    Finally, note that if pointer p had been of another type, the compiler could have invoked "strict aliasing rules" to optimize regardless of its inability to infer that p does not point to global. But because lvalues of type char are often used to observe the representation of other types, there is an allowance for this kind of alias, and the compiler cannot take this shortcut in your example.
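
    To make the contrast concrete, here is a small sketch with hypothetical names (counter, bump_via_float_ptr and bump_via_char_ptr are not from the question). GCC enables -fstrict-aliasing at -O2 and above; whether it then keeps counter in a register in the first function depends on the compiler version, so treat this as something to check in the generated assembly rather than as a guarantee.

    ```c
    /* Hypothetical global of type int. */
    int counter = 0;

    /* A float lvalue may not be used to access an int object, so under strict
       aliasing the compiler may assume the store to *p leaves `counter` alone. */
    void bump_via_float_ptr(float *p, int n)
    {
        for (int i = 0; i < n; i++) {
            *p += 1.0f;
            counter++;
        }
    }

    /* A char lvalue may access the representation of any object, so the
       compiler must assume the store to *p could have changed `counter`. */
    void bump_via_char_ptr(char *p, int n)
    {
        for (int i = 0; i < n; i++) {
            *p += 1;
            counter++;
        }
    }
    ```

    Both functions compute the same thing when p does not alias counter; the difference is only in what the optimizer is allowed to assume.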

  • 2020-12-30 04:32

    Global variable = global memory, and therefore subject to aliasing (read: bad for the optimizer, which in the worst case must load, modify and store the value on every access).

    Local variable = register, unless the compiler really can't avoid spilling it. Sometimes a local must live on the stack instead, but the stack is practically guaranteed to be in L1 cache.

    Accessing a register is on the order of a single cycle; accessing memory is on the order of 15-1000 cycles, depending on whether the cache line is in cache and not invalidated by another core, and on whether the page is in the TLB.
