Just curious as to how to get started understanding ARM under iOS. Any help would be super nice.
In my opinion, the best way to get started is to write small snippets of C code and look at the assembly the compiler generates for them.
To do this you can use Xcode:
Add the following function to scratchpad.c:
void do_nothing(void)
{
    return;
}
If you now refresh the Assembly in the assistant editor, you should see lots of lines starting with dots (directives), followed by
_do_nothing:
@ BB#0:
bx lr
Let's ignore the directives for now and look at these three lines. With a bit of searching on the internet, you'll find out that they are a label (the function name prefixed with an underscore), a comment emitted by the compiler, and the return instruction. In bx lr, b means branch, the x can be ignored for now (it has something to do with switching between instruction sets), and lr is the link register, where callers store the return address.
Let's beef it up a bit and change the code to:
extern void do_nothing(void);
void do_nothing_twice(void)
{
    do_nothing();
    do_nothing();
}
After saving and refreshing the assembly, you get the following code:
_do_nothing_twice:
@ BB#0:
push {r7, lr}
mov r7, sp
blx _do_nothing
pop.w {r7, lr}
b.w _do_nothing
Again, with a bit of searching on the internet, you'll find out the meaning of each line. Some more work needs to be done because we make two calls: the first call needs to return to us, so we need to change lr. That is done by the blx instruction, which not only branches to _do_nothing, but also stores the address of the next instruction (the return address) in lr.
Because we change lr, we first have to save our own return address somewhere, so it is pushed onto the stack. The second jump has a .w suffixed to it, but let's ignore that for now. Why doesn't the function look like this?
_do_nothing_twice:
@ BB#0:
push {lr}
blx _do_nothing
pop.w {lr}
b.w _do_nothing
That would work as well, but in iOS, the convention is to store the frame pointer in r7. The frame pointer points to the place in the stack where we store the previous frame pointer and the previous return address.
So what the code does is: first, it pushes r7 and lr onto the stack; then it sets r7 to point to the new stack frame (which is at the top of the stack, and sp points to the top of the stack); then it branches for the first time; then it restores r7 and lr; finally, it branches for the second time. A bx lr at the end is not needed, because the called function will return to lr, which points to our caller.
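To make the frame pointer convention a bit more concrete, here is a minimal sketch in C of the two-word record that push {r7, lr} leaves on the stack, and of how following the saved frame pointers walks back through the callers. The names frame_record and count_frames are made up for illustration, and ending the chain with NULL is an assumption of this model, not something taken from the assembly above.

```c
#include <stddef.h>

/* Hypothetical model of the two words that `push {r7, lr}` leaves on the
 * stack. After `mov r7, sp`, r7 points at the first field. */
struct frame_record {
    const struct frame_record *previous_frame; /* saved r7 of the caller */
    const void *return_address;                /* saved lr */
};

/* Follow the saved frame pointers until the chain ends (NULL here, by
 * assumption) and report how many frames were visited. */
size_t count_frames(const struct frame_record *frame)
{
    size_t depth = 0;
    while (frame != NULL) {
        depth++;
        frame = frame->previous_frame;
    }
    return depth;
}
```

A frame-pointer-based backtrace works along these lines: each record visited also holds the return address for that frame, which is why keeping r7 set up is useful even though the shorter version of the function would run fine.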
Let's have a look at a last example:
void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}
The assembly code is:
_swap:
@ BB#0:
ldr r2, [r0]
ldr r3, [r1]
str r3, [r0]
str r2, [r1]
bx lr
With a bit of searching, you will learn that arguments and return values are stored in registers r0-r3, and that we may use those freely for our calculations. What the code does is straightforward: it loads the values that r0 and r1 point to into r2 and r3, then it stores them back in exchanged order, then it branches back.
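A possible next snippet for the scratchpad, in the same spirit: add is a hypothetical function (it is not one of the examples above) that takes two arguments and returns a value. Given the register convention just described, the arguments should arrive in r0 and r1 and the result should leave in r0, so you would expect to see some form of add instruction followed by bx lr. The exact output depends on the compiler version and optimization settings, so treat that as a prediction to check rather than a guarantee.

```c
/* Hypothetical scratchpad function: two arguments in, one value out. */
int add(int a, int b)
{
    /* By the calling convention, a arrives in r0 and b in r1;
     * the sum is handed back to the caller in r0. */
    return a + b;
}
```

Comparing this with swap is instructive: swap receives addresses and needs ldr and str to reach memory, whereas add works on the register values directly.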
That's it: Write small snippets, get enough info to roughly understand what's going on in each line, repeat. Hope that helps!