GCC to emit ARM idiv instructions

天涯浪人 2021-01-02 05:41

How can I instruct gcc to emit idiv (integer division, udiv and sdiv) instructions for ARM application processors?

1 Answer
  • 2021-01-02 06:26

    If the instruction is not in the machine descriptions, then I doubt that gcc will emit it (see Note1).

    You can always use inline assembler to get the instruction if the compiler does not support it (see Note2). Since your op-code is fairly rare and machine-specific, there has probably not been much effort to get it into the gcc source. In particular, there are arch and tune/cpu flags. The tune/cpu flag is for a more specific machine, but the arch flag is supposed to allow all machines in that architecture. This op-code seems to break that rule, if I understand correctly.
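
    As a minimal sketch of the inline-assembler route, the wrappers below issue sdiv/udiv directly when the compiler reports hardware divide, and fall back to the ordinary C operator otherwise. The function names are hypothetical, and the use of the `__ARM_ARCH_EXT_IDIV__` predefined macro as the guard is an assumption about your toolchain; the assembler may still reject the mnemonics if gcc passes it a CPU without the divide extension.

    ```c
    #include <stdint.h>

    /* Hypothetical helper: signed divide through sdiv where available.
       The caller must keep b != 0 (and avoid INT32_MIN / -1), because the
       fallback path is plain C division. */
    static inline int32_t my_sdiv(int32_t a, int32_t b)
    {
    #if defined(__arm__) && defined(__ARM_ARCH_EXT_IDIV__)
        int32_t q;
        /* q = a / b, rounded toward zero like C's '/' */
        __asm__ ("sdiv %0, %1, %2" : "=r"(q) : "r"(a), "r"(b));
        return q;
    #else
        return a / b;   /* non-ARM host, or no hardware divide reported */
    #endif
    }

    /* Hypothetical helper: unsigned counterpart using udiv. */
    static inline uint32_t my_udiv(uint32_t a, uint32_t b)
    {
    #if defined(__arm__) && defined(__ARM_ARCH_EXT_IDIV__)
        uint32_t q;
        __asm__ ("udiv %0, %1, %2" : "=r"(q) : "r"(a), "r"(b));
        return q;
    #else
        return a / b;
    #endif
    }
    ```

    Because the asm statements are not marked volatile and have no side effects, the compiler remains free to schedule or drop them like any other arithmetic.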

    For gcc 4.6.2, it looks like thumb2 and cortex-r4 are the cues to use these instructions, and, as you have noted, with gcc 4.7.2 the cortex-a15 seems to have been added. With gcc 4.7.2, the thumb2.md file no longer has udiv/sdiv; the grep output in Note1 shows the patterns in arm.md instead, guarded by TARGET_IDIV. I am not 100% familiar with all of the machine description language. It also seems that cortex-a7, cortex-a15, and cortex-r5 may enable these instructions with 4.7.2. Note3

    This doesn't answer the question directly, but it does give some information and a path to the answer. You can compile the module with -mcpu=cortex-r4, although this may produce linker issues. There is also int my_idiv(int a, int b) __attribute__ ((__target__ ("arch=cortex-r4")));, which lets you specify on a per-function basis the machine description used by the code generator. I haven't used either of these myself; they are only possibilities to try. Generally you don't want to keep the wrong machine, as it could generate sub-optimal (and possibly illegal) op-codes. You will have to experiment and maybe then provide the real answer.
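
    One way to run that experiment is to put the division in a small module of its own, compile only that module with the different -mcpu value, and inspect the generated assembly. The sketch below assumes the `__ARM_ARCH_EXT_IDIV__` predefined macro reflects TARGET_IDIV in your compiler; the function names and the cross-compiler name in the comment are placeholders.

    ```c
    /* div_module.c: a hypothetical translation unit compiled on its own, e.g.
     *   arm-none-eabi-gcc -O2 -mcpu=cortex-a15 -S div_module.c
     * (the toolchain prefix is an assumption; use whatever your build uses).
     * If the selected CPU enables TARGET_IDIV, the '/' below should show up
     * as sdiv/udiv in div_module.s instead of a call to a library routine. */

    int div_signed(int a, int b)
    {
        return a / b;
    }

    unsigned div_unsigned(unsigned a, unsigned b)
    {
        return a / b;
    }

    /* Returns 1 when the compiler believes hardware divide is available for
       the flags this file was built with, 0 otherwise. */
    int hw_divide_expected(void)
    {
    #if defined(__ARM_ARCH_EXT_IDIV__)
        return 1;
    #else
        return 0;
    #endif
    }
    ```

    Keeping the division in a separate module limits how much code is built for the "wrong" machine, which matters given the warning above about sub-optimal or illegal op-codes.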

    Note1: This is for a stock gcc 4.6.2 and 4.7.2. I don't know if your Android compiler has patches.

    gcc-4.6.2/gcc/config/arm$ grep [ius]div *.md
    arm.md: "...,sdiv,udiv,other"
    cortex-r4.md:;; We guess that division of A/B using sdiv or udiv, on average, 
    cortex-r4.md:;; This gives a latency of nine for udiv and ten for sdiv.
    cortex-r4.md:(define_insn_reservation "cortex_r4_udiv" 9
    cortex-r4.md:       (eq_attr "insn" "udiv"))
    cortex-r4.md:(define_insn_reservation "cortex_r4_sdiv" 10
    cortex-r4.md:       (eq_attr "insn" "sdiv"))
    thumb2.md:  "sdiv%?\t%0, %1, %2"
    thumb2.md:   (set_attr "insn" "sdiv")]
    thumb2.md:(define_insn "udivsi3"
    thumb2.md:      (udiv:SI (match_operand:SI 1 "s_register_operand"  "r")
    thumb2.md:  "udiv%?\t%0, %1, %2"
    thumb2.md:   (set_attr "insn" "udiv")]
    
    gcc-4.7.2/gcc/config/arm$ grep -i [ius]div *.md
    arm.md:  "...,sdiv,udiv,other"
    arm.md:  "TARGET_IDIV"
    arm.md:  "sdiv%?\t%0, %1, %2"
    arm.md:   (set_attr "insn" "sdiv")]
    arm.md:(define_insn "udivsi3"
    arm.md: (udiv:SI (match_operand:SI 1 "s_register_operand"  "r")
    arm.md:  "TARGET_IDIV"
    arm.md:  "udiv%?\t%0, %1, %2"
    arm.md:   (set_attr "insn" "udiv")]
    cortex-a15.md:(define_insn_reservation "cortex_a15_udiv" 9
    cortex-a15.md:       (eq_attr "insn" "udiv"))
    cortex-a15.md:(define_insn_reservation "cortex_a15_sdiv" 10
    cortex-a15.md:       (eq_attr "insn" "sdiv"))
    cortex-r4.md:;; We guess that division of A/B using sdiv or udiv, on average, 
    cortex-r4.md:;; This gives a latency of nine for udiv and ten for sdiv.
    cortex-r4.md:(define_insn_reservation "cortex_r4_udiv" 9
    cortex-r4.md:       (eq_attr "insn" "udiv"))
    cortex-r4.md:(define_insn_reservation "cortex_r4_sdiv" 10
    cortex-r4.md:       (eq_attr "insn" "sdiv"))
    

    Note2: See "pre-processor as Assembler" if gcc is passing options to gas that prevent use of the udiv/sdiv instructions. For example, you can use asm(" .long <opcode>\n"); where opcode is the stringified output of a token-pasting macro that encodes the registers. You can also annotate your assembler source to change the target machine, so you can temporarily lie and say you have a cortex-r4, etc.
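
    A rough sketch of the .long approach follows. The base opcode values are my assumption of the ARM-mode (not Thumb) encodings with the "always" condition and should be checked against the architecture manual; all macro and function names are hypothetical. The encoder functions run anywhere, while the raw-opcode wrapper is only compiled for ARM mode.

    ```c
    #include <stdint.h>

    /* Assumed ARM-mode encodings, condition = always:
       Rd in bits 19:16, Rm in bits 11:8, Rn in bits 3:0; result Rd = Rn / Rm. */
    #define ARM_SDIV_BASE 0xE710F010u
    #define ARM_UDIV_BASE 0xE730F010u

    static inline uint32_t arm_sdiv_opcode(unsigned rd, unsigned rn, unsigned rm)
    {
        return ARM_SDIV_BASE | ((rd & 0xFu) << 16) | ((rm & 0xFu) << 8) | (rn & 0xFu);
    }

    static inline uint32_t arm_udiv_opcode(unsigned rd, unsigned rn, unsigned rm)
    {
        return ARM_UDIV_BASE | ((rd & 0xFu) << 16) | ((rm & 0xFu) << 8) | (rn & 0xFu);
    }

    /* Two-level stringify so a macro argument is expanded before quoting. */
    #define STR_(x) #x
    #define STR(x)  STR_(x)

    #if defined(__arm__) && !defined(__thumb__)
    /* Emits the word for "sdiv r0, r0, r1" without the assembler ever seeing
       the mnemonic. Pinning the operands to r0/r1 keeps the fixed encoding
       valid. Running this on a core without sdiv raises an undefined
       instruction exception. */
    static inline int sdiv_raw(int a, int b)
    {
        register int r0 __asm__("r0") = a;
        register int r1 __asm__("r1") = b;
        __asm__ volatile (" .long " STR(0xE710F110) "\n" : "+r"(r0) : "r"(r1));
        return r0;
    }
    #endif
    ```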

    Note3:

    gcc-4.7.2/gcc/config/arm$ grep -E 'TARGET_IDIV|arm_arch_arm_hwdiv|FL_ARM_DIV' *
    arm.c:#define FL_ARM_DIV    (1 << 23)         /* Hardware divide (ARM mode).  */
    arm.c:int arm_arch_arm_hwdiv;
    arm.c:  arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
    arm-cores.def:ARM_CORE("cortex-a7",  cortexa7,  7A, ... FL_ARM_DIV
    arm-cores.def:ARM_CORE("cortex-a15", cortexa15, 7A, ... FL_ARM_DIV
    arm-cores.def:ARM_CORE("cortex-r5",  cortexr5,  7R, ... FL_ARM_DIV
    arm.h:  if (TARGET_IDIV)                                \
    arm.h:#define TARGET_IDIV               ((TARGET_ARM && arm_arch_arm_hwdiv) \
    arm.h:extern int arm_arch_arm_hwdiv;
    arm.md:  "TARGET_IDIV"
    arm.md:  "TARGET_IDIV"
    