tl;dr : I'm trying to execute dynamically some code from another snippet. But I am stuck with handling memory reference (e.g. mov 40200b, %rdi
): can I patch my code or the snippet running code so that 0x40200b
is resolved correctly (as the offset 200b
from the code)?
To generate the code to be executed dynamically I start from a (kernel) object and I resolve the references using ld.
#!/usr/bin/python
import os, subprocess
if os.geteuid() != 0:
print('Run this as root')
exit(-1)
with open("/proc/kallsyms","r") as f:
out=f.read()
sym= subprocess.Popen( ['nm', 'ebbchar.ko', '-u' ,'--demangle', '-fposix'],stdout=subprocess.PIPE)
v=''
for sym in sym.stdout:
s = " "+ sym.split()[0]+ "\n"
off = out.find(s)
v += "--defsym "+s.strip() + "=0x" +out[off-18:off -2]+" "
print(v)
os.system("ld ebbchar.ko "+ v +"-o ebbchar.bin");
I then transmit the code to be executed with through a mmaped file
int fd = open(argv[1], O_RDWR | O_SYNC);
address1 = mmap(NULL, page_size, PROT_WRITE|PROT_READ , MAP_SHARED, fd, 0);
int in=open(argv[2],O_RDONLY);
sz= read(in, buf+8,BUFFER_SIZE-8);
uint64_t entrypoint=atol(argv[3]);
*((uint64_t*)buf)=entrypoint;
write(fd, buf, min(sz+8, (size_t) BUFFER_SIZE));
I execute code dynamycally with this code
struct mmap_info *info;
copy_from_user((void*)(&info->offset),buf,8);
copy_from_user(info->data, buf+8, sz-8);
unsigned long (*func)(void) func= (void*) (info->data + info->offset);
int ret= func();
This approch work for code that don't access memory such as "\x55\x48\x89\xe5\xc7\x45\xf8\x02\x00\x00\x00\xc7\x45\xfc\x03\x00\x00\x00\x8b\x55\xf8\x8b\x45\xfc\x01\xd0\x5d\xc3"
but I have problems when memory is involved.
See example below.
Let's assume i wan't execute dynamically the function vm_close. Objdump -d -S
returns:
0000000000401017 <vm_close>:
{
401017: e8 e4 07 40 81 callq ffffffff81801800 <__fentry__>
printk(KERN_INFO "vm_close");
40101c: 48 c7 c7 0b 20 40 00 mov $0x40200b,%rdi
401023: e9 b6 63 ce 80 jmpq ffffffff810e73de <printk>
At execution, my function pointer points to the right code:
(gdb) x/12x $rip
0xffffc90000c0601c: 0x48 0xc7 0xc7 0x0b 0x20 0x40 0x00 0xe9
0xffffc90000c06024: 0xb6 0x63 0xce 0x80
(gdb) x/2i $rip
=> 0xffffc90000c0601c: mov $0x40200b,%rdi
0xffffc90000c06023: jmpq 0xffffc8ff818ec3de
BUT, this code will fail since:
1) In my context $0x40200b points at the physical address $0x40200b
, and not offset 200b
from the beginning of the code.
2) I don't understand why but the address displayed there is actually different from the correct one (0xffffc8ff818ec3de != ffffffff810e73de) so it won't point on my symbol and will crash.
Is there a way to solve my 2 issues?
Also, I had trouble to find good documentation related to my issue (low-level memory resolution), if you could give me some, that would really help me.
Edit: Since I run the code in the kernel I cannot simply compile the code with -fPIC
or -fpie
which is not allowed by gcc (cc1: error: code model kernel does not support PIC mode
)
Edit 24/09:
According to @Peter Cordes comment, I recompiled it adding mcmodel=small -fpie -mno-red-zone -mnosse
to the Makefile (/lib/modules/$(uname -r)fixed/build/Makefile
)
This is better than in the original version since the generated code before linking is now:
0000000000000018 <vm_close>:
{
18: ff 15 00 00 00 00 callq *0x0(%rip) # 1e <vm_close+0x6>
printk(KERN_INFO "vm_close");
1e: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 25 <vm_close+0xd>
25: e8 00 00 00 00 callq 2a <vm_close+0x12>
}
2a: c3 retq So thanks to rip-relative addressing
Thus I’m now able to access the other variables on my script…
Thus, after linking I can successfully access my variable embedded within the buffer.
40101e: 48 8d 3d e6 0f 00 00 lea 0xfe6(%rip),%rdi # 40200b
Still, one problem remains:
The symbol I want to access (printk
) and my executable buffer are in different address spaces, for exemple:
printk=0xffffffff810e73de:
Executable_buffer=0xffffc9000099d000
But in my callq
to printk
, I have only 32 bits to write the address to call as an offset from $rip
since there is no .got
section in the kernel. This means that printk has to be located within [$rip-2GO, $rip+2GO]
. But this is not the case there.
Do I have a way to access the printk address although they are located more than 2GO away from my buffer (I tried to used mcmodel=medium
but I haven't seen any difference in the generated code), for instance by modifying gcc options so that the binary actually have a .got
section?
Or is there a reliable way to force my executable and potentially-too large-for-kmalloc buffer to be allocated in the [0xffffffff00000000 ; 0xffffffffffffffff] range?
(I currently use __vmalloc(BUFFER_SIZE, GFP_KERNEL, PAGE_KERNEL_EXEC);
)
Edit 27/09:
I succedded in allocationg my buffer in the [0xffffffff00000000 ; 0xffffffffffffffff]
range using the non exported __vmalloc_node_range
function as a (dirty) hack.
IMPORTED(__vmalloc_node_range)(BUFFER_SIZE, MODULE_ALIGN,
MODULES_VADDR + get_module_load_offset(),
MODULES_END, GFP_KERNEL,
PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
__builtin_return_address(0));
Then, when I know the address of my executable buffer and the address of the kernel symbols (by parsing /proc/kallsyms
), I can patch my binary using ld
’s option --defsym symbol=relative_address
where relative_address = symbol_address - buffer_offset
.
Despite being extremely dirt, this approach actually works.
But I need to relink my binary each time I execute it since the buffer may (and will) be allocated at a different address. To solve this issue, I think the best way would be to build my executable as a real position independent executable so that I can just patch the global offset table and not fully relink the module.
But with the options provided there I got a rip-relative address but no got/plt. So I'd like to find a way to build my module as a proper PIE.
This post is getting huge, messy and we are deviating from the original question. Thus, I opened a new simplified post there. If I get interesting answers, I'll edit this post to explain them.
Note: For the sake of simplicity, safety tests are not displayed there
Note 2: I am perfectly aware that my PoC is very unusual and can be a bad practice but I'd like to do it anyway.
来源:https://stackoverflow.com/questions/57971842/dynamic-c-code-execution-memory-references