Question
I'm starting to use the Intel reference page to look up and learn about the opcodes (instead of asking everything on SO). I'd like to make sure that my understanding is OK and ask a few questions about how the disassembly of a basic asm program lines up with the Intel instruction encodings.
Here is the program I have to compare various mov instructions into the rax-ish register (is there a better way to say "rax" and its 32-, 16-, and 8-bit components?):
.globl _start
_start:
movq $1, %rax # move immediate into 8-byte rax (rax)
movl $1, %eax # move immediate into 4-byte rax (eax)
movw $1, %ax # move immediate into 2-byte rax (ax)
movb $1, %al # move immediate into 1-byte rax (al)
mov $60, %eax
syscall
And it disassembles as follows:
$ objdump -D file
file: file format elf64-x86-64
Disassembly of section .text:
0000000000400078 <_start>:
400078: 48 c7 c0 01 00 00 00 mov $0x1,%rax
40007f: b8 01 00 00 00 mov $0x1,%eax
400084: 66 b8 01 00 mov $0x1,%ax
400088: b0 01 mov $0x1,%al
40008a: b8 3c 00 00 00 mov $0x3c,%eax
40008f: 0f 05 syscall
Now, matching these up against the Intel opcodes listed for MOV:
I am able to reconcile the following of the four instructions:
- mov $0x1,%al --> b0 01
  YES, Intel states the code is b0 [+ 1 byte for value] for a 1-byte move immediate.
- mov $0x1,%eax --> b8 01 00 00 00
  YES, Intel states the code is b8 [+ 4 bytes for value] for a 4-byte move immediate.
- mov $0x1,%ax --> 66 b8 01 00
  NO, Intel states the code is b8, not 66 b8.
- mov $0x1,%rax --> 48 c7 c0 01 00 00 00
  N/A, 32-bit instructions only. Not listed.
From this, my questions are:
- Why doesn't the mov $0x1,%ax match up?
- Is there the same table for 64-bit codes, or what's the suggested way to look that up?
- Finally, how do the codes adjust when the register changes? For example, if I want to move a value to %ebx or %r11 instead. How do you calculate the 'code-adjustment', as it looks like this lookup table only gives (I think?) the eax register for the 'register example codes'.
Answer 1:
You're missing the concept of prefix "opcodes" that change the meaning of the following instruction. Volume 2, sections 2.1.1 and 2.2.1 of the IA32 manual cover this. From 2.1.1 we get:
Operand-size override prefix is encoded using 66H (66H is also used as a mandatory prefix for some instructions).
So the 66 prefix changes the operand size from the default 32-bit to 16-bit. Thus, mov $1,%ax (16-bit) has the same opcode as mov $1,%eax (32-bit), just with the 66 prefix in front and a 2-byte immediate instead of a 4-byte one.
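To make the byte layout concrete, here is a minimal Python sketch that rebuilds three of the encodings from the disassembly. The helper name encode_mov_imm and the REGS table are invented for this illustration, and it assumes the B0+rb / B8+rd forms from the MOV entry, where the register number is added to the base opcode. It covers only a few legacy registers in 64-bit mode, with no REX prefix.

```python
import struct

# Hypothetical lookup table for this sketch: name -> (register number, size in bytes).
# The register number is what gets added to the base opcode (B0 or B8).
REGS = {
    "al": (0, 1), "cl": (1, 1), "dl": (2, 1), "bl": (3, 1),
    "ax": (0, 2), "cx": (1, 2), "dx": (2, 2), "bx": (3, 2),
    "eax": (0, 4), "ecx": (1, 4), "edx": (2, 4), "ebx": (3, 4),
}

def encode_mov_imm(reg, imm):
    """Encode 'mov $imm, %reg' for the registers listed in REGS."""
    num, size = REGS[reg]
    if size == 1:
        # B0+rb ib: 8-bit form, 1-byte immediate
        return bytes([0xB0 + num]) + struct.pack("<B", imm)
    if size == 2:
        # 66 prefix + B8+rw iw: same opcode as the 32-bit form, 2-byte immediate
        return bytes([0x66, 0xB8 + num]) + struct.pack("<H", imm)
    # B8+rd id: default 32-bit operand size, 4-byte immediate
    return bytes([0xB8 + num]) + struct.pack("<I", imm)

def hexdump(code):
    """Format bytes the way objdump prints them."""
    return " ".join(f"{b:02x}" for b in code)

for reg in ("al", "eax", "ax", "ebx"):
    print(f"mov $0x1,%{reg}", "-->", hexdump(encode_mov_imm(reg, 1)))
```

Under these assumptions the %ax case comes out as 66 b8 01 00, matching the objdump output, and switching the destination to %ebx only changes the opcode byte from b8 to bb (b8 + 3).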
The last case (mov $1, %rax) is actually using a different instruction:
REX.W + C7 /0 id MOV r/m64, imm32 Move imm32 sign extended to 64-bits to r/m64.
Here the destination is a general r/m64 operand selected by the ModRM byte (c0 selects rax) rather than by the low bits of the opcode, as it is in the b8 form. The opcode part is one byte larger, but this form moves a 32-bit immediate into a 64-bit register, so it only needs a 4-byte constant instead of an 8-byte one (and ends up 3 bytes smaller than the equivalent 48 b8 01 00 00 00 00 00 00 00).
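The size comparison, and the effect of choosing a different 64-bit register such as %r11, can be sketched in Python as follows. The function names are hypothetical and chosen only for this example. The bit layouts assumed are the standard ones: a REX prefix is 0100WRXB, a ModRM byte is mod(2) reg(3) rm(3), and registers rax..rdi are numbered 0..7 with r8..r15 as 8..15.

```python
import struct

def rex(w, r, x, b):
    """Build a REX prefix byte: 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

def modrm(mod, reg, rm):
    """Build a ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

def mov_imm32_to_r64(regnum, imm):
    """REX.W + C7 /0 id: sign-extended 32-bit immediate into a 64-bit register."""
    prefix = rex(1, 0, 0, regnum >> 3)     # REX.B carries bit 3 of the register number
    operand = modrm(0b11, 0, regnum & 7)   # mod=11 (register direct), reg=/0, rm=low 3 bits
    return bytes([prefix, 0xC7, operand]) + struct.pack("<i", imm)

def mov_imm64_to_r64(regnum, imm):
    """REX.W + B8+rd io: full 64-bit immediate into a 64-bit register."""
    prefix = rex(1, 0, 0, regnum >> 3)
    return bytes([prefix, 0xB8 + (regnum & 7)]) + struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF)

def hexdump(code):
    """Format bytes the way objdump prints them."""
    return " ".join(f"{b:02x}" for b in code)

RAX, R11 = 0, 11
print(hexdump(mov_imm32_to_r64(RAX, 1)))   # the short form the assembler chose
print(hexdump(mov_imm64_to_r64(RAX, 1)))   # the 10-byte alternative
print(hexdump(mov_imm32_to_r64(R11, 1)))   # same instruction, different register
```

With these assumptions the first call reproduces 48 c7 c0 01 00 00 00 from the disassembly (7 bytes versus 10 for the imm64 form). For %r11 the prefix becomes 49 (REX.W plus REX.B) and the ModRM byte becomes c3, so the register change shows up in two places rather than in the opcode itself.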
Source: https://stackoverflow.com/questions/63875061/matching-the-intel-codes-to-disassembly-output