How can I instruct gcc to emit idiv (integer division, udiv and sdiv) instructions for ARM application processors?
If the instruction is not in the machine descriptions, then I doubt that gcc will emit it (see Note1). You can always use inline assembler to get the instruction if the compiler does not support it (see Note2). Since your op-code is fairly rare and machine specific, there has probably not been much effort to get it into the gcc source. In particular, there are arch and tune/cpu flags. The tune/cpu flag selects a more specific machine, while the arch flag is supposed to produce code that runs on all machines in that architecture. This op-code seems to break that rule, if I understand correctly.
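As a rough sketch of the inline-assembler route, the helper below issues sdiv directly when the compiler advertises hardware divide and falls back to the ordinary / operator otherwise. The name my_sdiv is hypothetical, and using the ACLE macro __ARM_FEATURE_IDIV as the guard is an assumption; older toolchains may not define it, in which case a different guard is needed.

```c
#include <assert.h>

/* Hypothetical helper, not part of any library. */
static inline int my_sdiv(int a, int b)
{
    /* Division by zero behaves differently on the two paths below
       (the hardware instruction may return 0 rather than trap),
       so reject it up front. */
    assert(b != 0);
#if defined(__arm__) && defined(__ARM_FEATURE_IDIV)
    int q;
    /* Assumption: the assembler accepts sdiv for the selected CPU. */
    __asm__ ("sdiv %0, %1, %2" : "=r" (q) : "r" (a), "r" (b));
    return q;
#else
    /* Let the compiler choose: sdiv if it knows the op-code,
       otherwise a call into its runtime library. */
    return a / b;
#endif
}
```

Calling my_sdiv(7, 2) should yield 3 on either path, since both truncate toward zero.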
For gcc 4.6.2, it looks like thumb2 and cortex-r4 are the cues to use these instructions and, as you have noted, with gcc 4.7.2 the cortex-a15 seems to have been added. With gcc 4.7.2, the thumb2.md file no longer has udiv/sdiv. However, the grep output under Note1 shows equivalent patterns in arm.md guarded by TARGET_IDIV; I am not 100% familiar with the machine description language. It also seems that cortex-a7, cortex-a15, and cortex-r5 may enable these instructions with 4.7.2 (see Note3).
This doesn't answer the question directly, but it does give some information and a path to the answer. You can compile the module with -mcpu=cortex-r4, although this may produce linker issues. There is also int my_idiv(int a, int b) __attribute__ ((__target__ ("arch=cortex-r4")));, which lets you specify on a per-function basis the machine description used by the code generator. I haven't used either of these myself; they are only possibilities to try. Generally you don't want to specify the wrong machine, as it could generate sub-optimal (and possibly illegal) op-codes. You will have to experiment and maybe then provide the real answer.
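One way to run that experiment is to compile a tiny probe file and read the generated assembly. This is only a sketch: the file name div_probe.c, the function names, and the cross-compiler name in the comment are all assumptions, and whether the divide instructions appear depends on the gcc version and the -mcpu value.

```c
/* div_probe.c (hypothetical file name).
 *
 * Assumed invocation; the cross-compiler name will vary:
 *   arm-linux-gnueabi-gcc -O2 -mcpu=cortex-a15 -S div_probe.c
 *
 * Then search div_probe.s for sdiv/udiv. If the compiler did not
 * select them, expect calls to runtime helpers such as
 * __aeabi_idiv and __aeabi_uidiv instead.
 */
int signed_quotient(int a, int b)
{
    return a / b;
}

unsigned unsigned_quotient(unsigned a, unsigned b)
{
    return a / b;
}
```

Repeating the compile with a different -mcpu value and comparing the two .s files shows directly which settings cue the instructions.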
Note1: This is for stock gcc 4.6.2 and 4.7.2. I don't know if your Android compiler has patches.
gcc-4.6.2/gcc/config/arm$ grep [ius]div *.md
arm.md: "...,sdiv,udiv,other"
cortex-r4.md:;; We guess that division of A/B using sdiv or udiv, on average,
cortex-r4.md:;; This gives a latency of nine for udiv and ten for sdiv.
cortex-r4.md:(define_insn_reservation "cortex_r4_udiv" 9
cortex-r4.md: (eq_attr "insn" "udiv"))
cortex-r4.md:(define_insn_reservation "cortex_r4_sdiv" 10
cortex-r4.md: (eq_attr "insn" "sdiv"))
thumb2.md: "sdiv%?\t%0, %1, %2"
thumb2.md: (set_attr "insn" "sdiv")]
thumb2.md:(define_insn "udivsi3"
thumb2.md: (udiv:SI (match_operand:SI 1 "s_register_operand" "r")
thumb2.md: "udiv%?\t%0, %1, %2"
thumb2.md: (set_attr "insn" "udiv")]
gcc-4.7.2/gcc/config/arm$ grep -i [ius]div *.md
arm.md: "...,sdiv,udiv,other"
arm.md: "TARGET_IDIV"
arm.md: "sdiv%?\t%0, %1, %2"
arm.md: (set_attr "insn" "sdiv")]
arm.md:(define_insn "udivsi3"
arm.md: (udiv:SI (match_operand:SI 1 "s_register_operand" "r")
arm.md: "TARGET_IDIV"
arm.md: "udiv%?\t%0, %1, %2"
arm.md: (set_attr "insn" "udiv")]
cortex-a15.md:(define_insn_reservation "cortex_a15_udiv" 9
cortex-a15.md: (eq_attr "insn" "udiv"))
cortex-a15.md:(define_insn_reservation "cortex_a15_sdiv" 10
cortex-a15.md: (eq_attr "insn" "sdiv"))
cortex-r4.md:;; We guess that division of A/B using sdiv or udiv, on average,
cortex-r4.md:;; This gives a latency of nine for udiv and ten for sdiv.
cortex-r4.md:(define_insn_reservation "cortex_r4_udiv" 9
cortex-r4.md: (eq_attr "insn" "udiv"))
cortex-r4.md:(define_insn_reservation "cortex_r4_sdiv" 10
cortex-r4.md: (eq_attr "insn" "sdiv"))
Note2: See pre-processor as Assembler if gcc is passing options to gas that prevent use of the udiv/sdiv instructions. For example, you can use asm(" .long <opcode>\n"); where opcode is the stringified output of a token-pasting macro that encodes the registers. Also, you can annotate your assembler to specify changes in the machine, so you can temporarily lie and say you have a cortex-r4, etc.
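To illustrate the register-encoding idea, the sketch below computes the 32-bit word for an ARM-mode sdiv or udiv with the "always" condition code. The helper and macro names are hypothetical, and the bit layout is written from memory of the ARMv7 A1 encoding, so check it against the ARM Architecture Reference Manual before relying on it; Thumb-2 uses a different encoding.

```c
#include <stdint.h>

/* Assumed ARM-mode (A1) base words, condition field = AL (0xE). */
#define ARM_SDIV_BASE 0xE710F010u
#define ARM_UDIV_BASE 0xE730F010u

/* Hypothetical helper: word for  sdiv rd, rn, rm  (rd = rn / rm).
   Rd sits in bits 19:16, Rm in bits 11:8, Rn in bits 3:0. */
static inline uint32_t arm_sdiv_word(unsigned rd, unsigned rn, unsigned rm)
{
    return ARM_SDIV_BASE | ((rd & 0xFu) << 16) | ((rm & 0xFu) << 8) | (rn & 0xFu);
}

/* Hypothetical helper: word for  udiv rd, rn, rm. */
static inline uint32_t arm_udiv_word(unsigned rd, unsigned rn, unsigned rm)
{
    return ARM_UDIV_BASE | ((rd & 0xFu) << 16) | ((rm & 0xFu) << 8) | (rn & 0xFu);
}
```

Under these assumptions, sdiv r0, r0, r1 comes out as 0xE710F110, which is the value that would go after .long. A macro version of the same arithmetic is needed if the word has to be pasted into an asm string at preprocessing time.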
Note3:
gcc-4.7.2/gcc/config/arm$ grep -E 'TARGET_IDIV|arm_arch_arm_hwdiv|FL_ARM_DIV' *
arm.c:#define FL_ARM_DIV (1 << 23) /* Hardware divide (ARM mode). */
arm.c:int arm_arch_arm_hwdiv;
arm.c: arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
arm-cores.def:ARM_CORE("cortex-a7", cortexa7, 7A, ... FL_ARM_DIV
arm-cores.def:ARM_CORE("cortex-a15", cortexa15, 7A, ... FL_ARM_DIV
arm-cores.def:ARM_CORE("cortex-r5", cortexr5, 7R, ... FL_ARM_DIV
arm.h: if (TARGET_IDIV) \
arm.h:#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
arm.h:extern int arm_arch_arm_hwdiv;
arm.md: "TARGET_IDIV"
arm.md: "TARGET_IDIV"