Question
I'm trying to write code that does two things. First, it should return 1 in register r2 if my value is representable as a constant in an ARM data-processing instruction. The code below does that (please suggest better methods if it's inefficient). However, I also want to modify it to tell me whether a MOV or MVN needs to be used.
         AREA ArmExample18b, CODE
         ENTRY
         MOV r2, #0 ;register return value. if =1, representable, otherwise, not representable
         LDR r1, TABLE1 ;input value we want to use
         LDR r3, TABLE1+4 ;upper bound register
         LDR r4, TABLE1+8 ;lower bound register
         MOV r5, #12
INVCHECK CLZ r6, r1 ;r6 contains number of leading zeros in r1
         RBIT r7, r1
         CLZ r8, r7 ;r8 contains number of trailing zeros in r1
         CMP r6, r8
         SUBCS r9, r6, r8
         RSBCC r9, r6, r8
         CMP r9, #8
         MVNHI r1, r1
         BHI INVCHECK
         BLS LOOP
LOOP
         CMP r3, r1 ;compare input value with upper bound
         BLO STOP ;if bigger than u.b, stop, r2 = 0
         CMP r4, r1 ;compare input value with lower bound
         MOVLS r2, #1 ;if larger than lower bound, it falls within range, set r2 = 1
         BLS STOP ;then stop
         CMP r4, #0 ;if r4 has reached 0, then we are at the end of comparisons and can stop
         BEQ STOP
         LDR r3, TABLE1 + r5 ;change upper bound
         ADD r5, r5, #4
         LDR r4, TABLE1 + r5 ;change lower bound
         ADD r5, r5, #4
         B LOOP
STOP     B STOP
TABLE1   DCD 0x500, 0x3fc0, 0x1000, 0xff0, 0x400, 0x3fc, 0x100, 0xff, 0
         END
Answer 1:
"However, I also want to modify it to tell me whether a MOV or MVN needs to be used."
Test for the MOV case. If that fails, test for the MVN case and set a flag (or whatever your API wants). People often use +1 (MOV), 0 (cannot fit) and -1 (MVN), as this is convenient to test in a caller written in pure ARM assembly.
Not knowing the answer myself, I started by investigating what gas (the GNU assembler) does. I found the answer in tc-arm.c, in a routine called encode_arm_immediate(). Here is the source:
/* If VAL can be encoded in the immediate field of an ARM instruction,
   return the encoded form. Otherwise, return FAIL. */
static unsigned int
encode_arm_immediate (unsigned int val)
{
  unsigned int a, i;
  for (i = 0; i < 32; i += 2)
    if ((a = rotate_left (val, i)) <= 0xff)
      return a | (i << 7); /* 12-bit pack: [shift-cnt,const]. */
  return FAIL;
}
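To make the 12-bit packing concrete, here is a minimal self-contained sketch of the same loop together with a decoder. The names rotl32, encode_imm12 and decode_imm12 and the ENCODE_FAIL sentinel are my own stand-ins (gas has its own rotate_left and FAIL definitions), so treat this as an illustration under those assumptions rather than the gas code itself.

```c
#include <stdint.h>

/* Hypothetical sentinel for "not encodable"; not the gas FAIL value. */
#define ENCODE_FAIL 0xffffffffu

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Try every even left rotation; the first one that leaves an 8-bit value wins.
   Because i is even, (i << 7) equals (i / 2) << 8, so bits 11:8 of the result
   hold the 4-bit rotate field and bits 7:0 hold the 8-bit constant. */
uint32_t encode_imm12(uint32_t val)
{
    for (unsigned i = 0; i < 32; i += 2) {
        uint32_t a = rotl32(val, i);
        if (a <= 0xffu)
            return a | (i << 7);
    }
    return ENCODE_FAIL;
}

/* Inverse of encode_imm12: rotate the 8-bit constant right by 2 * rotate field. */
uint32_t decode_imm12(uint32_t packed)
{
    uint32_t imm8 = packed & 0xffu;
    unsigned rot = ((packed >> 8) & 0xfu) * 2u;
    return rotl32(imm8, 32u - rot); /* rotate right by rot */
}
```

For example, 0xf000000f packs to 0x2ff (constant 0xff, rotate field 2, meaning a right rotation by 4), and decoding 0x2ff gives 0xf000000f back.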
Some interesting points: encode_arm_immediate() is not as efficient as your example, but it is more correct. I don't think you are handling constants like 0xf000000f, which can be represented. Also, move_or_literal_pool() in the same file contains logic along the lines of this pseudo-code:
if ((packed = encode_arm_immediate(val)) == FAIL)
    packed = encode_arm_immediate(~val);
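This fallback maps directly onto the +1 / 0 / -1 convention mentioned earlier. A small sketch, with hypothetical helper names (rotl32, fits_imm8_rot, classify_mov_mvn) that are mine and not taken from gas:

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Returns 1 if val is an 8-bit constant rotated by an even amount, else 0. */
static int fits_imm8_rot(uint32_t val)
{
    for (unsigned i = 0; i < 32; i += 2)
        if (rotl32(val, i) <= 0xffu)
            return 1;
    return 0;
}

/* +1: usable with MOV, -1: usable with MVN (complement fits), 0: neither. */
int classify_mov_mvn(uint32_t val)
{
    if (fits_imm8_rot(val))
        return 1;
    if (fits_imm8_rot(~val))
        return -1;
    return 0;
}
```

MOV is tried first, so a value such as 0 that could use either instruction is reported as +1.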
It is pretty clear that if you have a test for MOV, you can complement the value and test for MVN. In fact, I don't think you will gain any efficiency by trying to test both in parallel, as that complicates the logic too much. The current steps can be minimized with an instruction that finds the first set bit (clz), as you don't need to iterate over all of the bits [see pop_count()].
bits = pop_count(val);
if (bits <= 8) {
    /* Search 'MOV', using clz to normalize */
    shift = clz(val);
    val <<= shift;
    if (((val & (0xff << 24)) == val) && !(shift & 1)) goto it;
    if (((val & (0xfe << 24)) == val) && (shift & 1)) goto it;
    /* test for rotation */
}
if (bits >= 32 - 8) {
    /* Set 'MVN' flag */
    /* as above */
}
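One way to fill in the "test for rotation" step, as a hedged sketch: it normalizes with a trailing-zero count rather than the clz used in the pseudo-code, because rounding the trailing-zero count down to an even number gives the window position directly. The three wrap-around windows (an 8-bit constant rotated right by 2, 4 or 6) are then checked explicitly. All names here (ctz32, rotl32, fits_mov_ctz, classify_ctz) are hypothetical, and the portable ctz32 loop stands in for a hardware bit-scan instruction or compiler builtin.

```c
#include <stdint.h>

/* Count trailing zero bits; v must be non-zero. A bit-scan instruction
   or compiler builtin could replace this loop. */
static unsigned ctz32(uint32_t v)
{
    unsigned n = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Returns 1 if val fits the MOV immediate form, without a 16-step loop. */
int fits_mov_ctz(uint32_t val)
{
    if (val == 0)
        return 1;
    /* Non-wrapping case: the window starts at the lowest set bit,
       rounded down to an even bit position. */
    unsigned s = ctz32(val) & ~1u;
    if ((val >> s) <= 0xffu)
        return 1;
    /* Wrapping case: the window straddles bit 31 and bit 0. */
    return rotl32(val, 2) <= 0xffu
        || rotl32(val, 4) <= 0xffu
        || rotl32(val, 6) <= 0xffu;
}

/* +1: MOV, -1: MVN, 0: neither. */
int classify_ctz(uint32_t val)
{
    if (fits_mov_ctz(val))
        return 1;
    if (fits_mov_ctz(~val))
        return -1;
    return 0;
}
```

A pop_count() pre-filter as in the pseudo-code could be added in front of either test, but it is not needed for correctness.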
There are various ways to implement a population count and/or detect a run of bits. Really, if your algorithm is correct and handles the rotation, the simplicity of encode_arm_immediate() seems likely to make it very competitive with any solution that tries to use advanced instructions to detect runs of bits. encode_arm_immediate() will fit in the cache, and the loop will run quickly on an ARMv7 with caches and branch prediction.
Answer 2:
@artlessnoise has provided a thorough explanation of the way to go about it (that's the 'real' answer IMO), but since this piqued my interest I fancied solving it from scratch. On an ARM7, you don't have all the fancy bit-manipulation instructions of later architectures, but it turns out they're a bit of a red herring here. The straightforward "try every valid rotation until you find one which fits in 8 bits (i.e. <=255)" approach came out to some beautifully compact idiomatic assembly (GNU flavour as I couldn't convince the armcc toolchain to play nicely):
        .syntax unified
        .cpu arm7tdmi
        .globl testconst
testconst:
        mov     r2, #32
1:      mov     r1, r0, ror r2
        cmp     r1, #255
        movls   r0, #1          @ using EABI registers for the sake of this example
        movls   pc, lr
        cmn     r1, #256        @ no good? how about the inverted version then?
        movhs   r0, #-1         @ note that we'll still have the separated
        movhs   pc, lr          @ value and shift parts in r1 and r2 when we
        subs    r2, #2          @ return - those might come in handy later
        bne     1b
        mov     r0, #0
        mov     pc, lr
With this little test program:
#include <stdio.h>
int testconst(int);
void test(int c) {
    int r = testconst(c);
    printf("%i (%08x) %s\n", c, c,
           r > 0 ? "fits MOV" :
           r < 0 ? "fits MVN" :
           "doesn't work");
}
int main(void) {
    test(0);
    test(42);
    test(-42);
    test(0xff);
    test(0x1ff);
    test(0x81);
    test(0x10001);
    test(0xff << 12);
    test(0xff << 11);
    test(~(0xff << 12));
    test(~(0x101 << 12));
    test(0xf000000f);
    test(0xf000001f);
    test(~0xf000000f);
    test(~0xf800000f);
}
To give the expected results:
/ # ./bittest
0 (00000000) fits MOV
42 (0000002a) fits MOV
-42 (ffffffd6) fits MVN
255 (000000ff) fits MOV
511 (000001ff) doesn't work
129 (00000081) fits MOV
65537 (00010001) doesn't work
1044480 (000ff000) fits MOV
522240 (0007f800) doesn't work
-1044481 (fff00fff) fits MVN
-1052673 (ffefefff) doesn't work
-268435441 (f000000f) fits MOV
-268435425 (f000001f) doesn't work
268435440 (0ffffff0) fits MVN
134217712 (07fffff0) doesn't work
Hurrah!
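As for the comment in the assembly about r1 and r2 coming in handy later: a rough C model of the same loop, under my own naming (struct imm_result, rotr32 and testconst_model are hypothetical), shows how the leftover values could be turned into the 12-bit immediate field. Since r1 is the input rotated right by r2, the input equals r1 rotated right by 32 - r2, so the 4-bit rotate field would be (32 - r2) / 2.

```c
#include <stdint.h>

struct imm_result {
    int kind;        /* +1: MOV, -1: MVN, 0: neither */
    uint32_t imm12;  /* rotate field in bits 11:8, constant in bits 7:0;
                        only meaningful when kind != 0 */
};

/* Rotate a 32-bit value right by n bits (n is taken modulo 32). */
static uint32_t rotr32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Mirrors the assembly loop: r2 counts down 32, 30, ..., 2 and each
   rotation is tested for MOV first, then for MVN. */
struct imm_result testconst_model(uint32_t val)
{
    struct imm_result res = {0, 0};
    for (unsigned r2 = 32; r2 != 0; r2 -= 2) {
        uint32_t r1 = rotr32(val, r2);
        unsigned rot_field = ((32u - r2) / 2u) & 0xfu;
        if (r1 <= 0xffu) {                /* cmp r1, #255 / movls */
            res.kind = 1;
            res.imm12 = (rot_field << 8) | r1;
            return res;
        }
        if (r1 >= 0xffffff00u) {          /* cmn r1, #256 / movhs */
            res.kind = -1;
            res.imm12 = (rot_field << 8) | (~r1 & 0xffu);
            return res;
        }
    }
    return res;
}
```

For 0x000ff000 this model reports MOV with the field 0xaff (constant 0xff, rotated right by 20), and for 0xffffffd6 it reports MVN with the field 0x029.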
Source: https://stackoverflow.com/questions/27963312/arm-mov-and-mvn-operand