Kernel sys_call_table address does not match address specified in system.map

Submitted by 一笑奈何 on 2020-01-13 17:11:45

Question


I am trying to brush up on C, so I have been playing around with the Linux kernel's system call table (on 3.13.0-32-generic). I found a resource online that searches for the system call table with the following function, which I load into the kernel as an LKM:

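/* Scan kernel virtual memory upward from PAGE_OFFSET for the system call table. */
static uint64_t **aquire_sys_call_table(void)
{
    uint64_t offset = PAGE_OFFSET;
    uint64_t **sct;

    while (offset < ULLONG_MAX) {
        sct = (uint64_t **)offset;

        /* The table is found when its __NR_close slot points at sys_close. */
        if (sct[__NR_close] == (uint64_t *) sys_close) {
            /* Report the candidate address that matched. */
            printk("\nsys_call_table found at address: 0x%p\n", sct);
            return sct;
        }

        offset += sizeof(void *);
    }

    return NULL;
}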

The function works. I am able to use the address it returns to manipulate the system call table. What I don't understand is why the address returned by this function doesn't match the address in /boot/System.map-(KERNEL).

Here is what the function prints:

sys_call_table found at address: 0xffff880001801400

Here is what I get when I search System.map:

$ sudo cat /boot/System.map-3.13.0-32-generic | grep sys_call_table 
  ffffffff81801400 R sys_call_table
  ffffffff81809cc0 R ia32_sys_call_table

Why don't the two addresses match? It's my understanding that the module runs in the kernel's address space, so the address of the system call table should be the same.


Answer 1:


The two virtual addresses map to the same physical address.

From Documentation/x86/x86_64/mm.txt:

<previous description obsolete, deleted>

Virtual memory map with 4 level page tables:

0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff87ffffffffff (=43 bits) guard hole, reserved for hypervisor
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffec0000000000 - fffffc0000000000 (=44 bits) kasan shadow memory (16TB)
... unused hole ...
ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
... unused hole ...
ffffffff80000000 - ffffffffa0000000 (=512 MB)  kernel text mapping, from phys 0
ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space
ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole

The direct mapping covers all memory in the system up to the highest
memory address (this means in some cases it can also include PCI memory
holes).

vmalloc space is lazily synchronized into the different PML4 pages of
the processes using the page fault handler, with init_level4_pgt as
reference.

Current X86-64 implementations only support 40 bits of address space,
but we support up to 46 bits. This expands into MBZ space in the page tables.

->trampoline_pgd:

We map EFI runtime services in the aforementioned PGD in the virtual
range of 64Gb (arbitrarily set, can be raised if needed)

0xffffffef00000000 - 0xffffffff00000000

-Andi Kleen, Jul 2004

From this table we know that the virtual address range ffff880000000000-ffffc7ffffffffff is a direct mapping of all physical memory. When the kernel needs to access arbitrary physical memory, it goes through the direct mapping. It is also the range your function searches, because it starts at PAGE_OFFSET.

The range ffffffff80000000-ffffffffa0000000 is the kernel text mapping. When kernel code executes, the rip register holds addresses in the kernel text mapping, and the symbols in System.map are linked at addresses in that range.

In arch/x86/include/asm/page_64.h, we can see how a virtual address is converted to a physical address:

static inline unsigned long __phys_addr_nodebug(unsigned long x)
{
    unsigned long y = x - __START_KERNEL_map;

    /* use the carry flag to determine if x was < __START_KERNEL_map */
    x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));

    return x;
}

and

// arch/x86/include/asm/page_types.h
#define PAGE_OFFSET     ((unsigned long)__PAGE_OFFSET)
// arch/x86/include/asm/page_64_types.h
#define __START_KERNEL_map  _AC(0xffffffff80000000, UL)
#define __PAGE_OFFSET           _AC(0xffff880000000000, UL)


As for the addresses mentioned in the question above:

What the function prints:

sys_call_table found at address: 0xffff880001801400

What System.map gives:

$ sudo cat /boot/System.map-3.13.0-32-generic | grep sys_call_table 
  ffffffff81801400 R sys_call_table
  ffffffff81809cc0 R ia32_sys_call_table

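Both of them resolve to the same physical address. Subtracting each region's base gives the same offset: 0xffff880001801400 - 0xffff880000000000 = 0x1801400, and 0xffffffff81801400 - 0xffffffff80000000 = 0x1801400 (plus phys_base, which is 0 when the kernel has not been relocated).

The virt->phys conversion is defined so that corresponding addresses in the direct mapping region and the kernel text mapping region resolve to the same physical address.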




Answer 2:


Through the magic of virtual memory mapping, the address you use depends on where you are. The symbol table file System.map is there to help attach gdb or the crash utility to the running system. Inside the kernel, well, you are inside the kernel.

You may also have a /proc/kallsyms file for even more values :)



Source: https://stackoverflow.com/questions/31396090/kernel-sys-call-table-address-does-not-match-address-specified-in-system-map
