GCC -mthumb against -marm

孤独总比滥情好 2021-02-07 08:47

I am working on performance optimization of ARM C/C++ code compiled with GCC. The CPU is a Tegra 3. As far as I know, the -mthumb flag means generating old 16-bit Thumb instructio

2 answers
  • 2021-02-07 09:24

    Thumb is not the older instruction set; it is in fact the newer one. The current revision is Thumb-2, a mixed 16/32-bit instruction set. The original Thumb (Thumb-1) instruction set was a compressed 16-bit version of the original ARM instruction set: the CPU would fetch the instruction, decompress it into its ARM equivalent, and then process it. These days (ARMv7 and above), Thumb-2 is preferred for everything but performance-critical or system code. For example, many GCC toolchains targeting ARMv7 (like your Tegra 3) are configured to generate Thumb-2 by default, because the higher code density of the 16/32-bit ISA allows better icache utilization. This is very hard to measure in a normal benchmark, though, because most benchmarks fit into the L1 icache anyway.

    For more information check the Wikipedia site: http://en.wikipedia.org/wiki/ARM_architecture#Thumb
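
    Since the default mode depends on how a given toolchain was configured, it can help to check what a build is actually producing. The sketch below is a minimal example that relies on the `__thumb2__`, `__thumb__` and `__arm__` macros that GCC predefines on 32-bit ARM targets; the function names `isa_mode` and `print_isa_mode` are made up for illustration.

```c
#include <stdio.h>

/* Report which instruction set this translation unit was compiled for,
   based on macros predefined by the compiler on 32-bit ARM targets.
   isa_mode() is a hypothetical helper name, not part of any library. */
const char *isa_mode(void)
{
#if defined(__thumb2__)
    return "thumb-2";    /* Thumb state on a Thumb-2 capable core */
#elif defined(__thumb__)
    return "thumb-1";    /* Thumb state without Thumb-2 */
#elif defined(__arm__)
    return "arm";        /* 32-bit ARM state, e.g. built with -marm */
#else
    return "not-arm32";  /* compiled for some other architecture */
#endif
}

/* Print the detected mode, e.g. once at program start-up. */
void print_isa_mode(void)
{
    printf("compiled as: %s\n", isa_mode());
}
```

    Building the same file once with -marm and once with -mthumb and calling `print_isa_mode()` should show whether the flag took effect for that file.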

  • 2021-02-07 09:43

    ARM instructions are 32 bits wide, so each one has more bits to do more things in a single instruction, while THUMB, with only 16 bits, might have to split the same functionality across 2 instructions. Assuming that non-memory instructions take more or less the same time, fewer instructions mean faster code. There were also some things that just couldn't be done in THUMB code.

    The idea was then that ARM would be used for performance critical functionality while THUMB (which fits 2 instructions into a 32 bit word) would be used to minimize storage space of programs.

    As CPU memory caching became more critical, having more instructions in the icache became a bigger determinant of speed than functional density per instruction. This meant that THUMB code could end up faster than the equivalent ARM code. ARM (the company) therefore created THUMB32 (officially named Thumb-2), a variable-length instruction set that incorporates most ARM functionality. THUMB32 should in most cases give you denser as well as faster code due to better caching.
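
    If profiling shows that one hot routine really does run faster as ARM code, the two modes can be mixed within one file instead of switching the whole build. The following is a rough sketch, assuming GCC 6 or later on a core that supports both instruction sets; it uses GCC's `target("arm")` and `target("thumb")` function attributes. The macro names `HOT_ARM` and `COLD_THUMB` and both functions are hypothetical, and on other compilers or architectures the macros expand to nothing.

```c
#include <stddef.h>
#include <stdint.h>

/* Enable per-function mode selection only where it is assumed to be
   supported: GCC >= 6, 32-bit ARM, core with both ARM and Thumb ISAs. */
#if defined(__arm__) && defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ >= 6) && defined(__ARM_ARCH_ISA_ARM) && \
    defined(__ARM_ARCH_ISA_THUMB)
#define HOT_ARM    __attribute__((target("arm")))
#define COLD_THUMB __attribute__((target("thumb")))
#else
#define HOT_ARM
#define COLD_THUMB
#endif

/* Hot loop: compiled as ARM code regardless of -mthumb / -marm. */
HOT_ARM uint32_t sum_u32(const uint32_t *v, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += v[i];
    return acc;
}

/* Rarely executed check: compiled as Thumb code to save space. */
COLD_THUMB int check_args(const uint32_t *v, size_t n)
{
    return (v != NULL && n > 0) ? 0 : -1;
}
```

    Whether this helps at all has to be measured on the target; for most code the whole-program -mthumb build is likely the simpler choice.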
