Can the sys_execve() system call in the Linux kernel receive either absolute or relative paths?

盖世英雄少女心 2021-01-15 13:39

Should sys_execve() in kernel-level code receive an absolute or a relative path for the filename parameter?

1 answer
  • 2021-01-15 14:24

    sys_execve can take either absolute or relative paths

    Let's verify it in the following ways:

    • experiment with a raw system call
    • read the kernel source
    • run GDB on kernel + QEMU to verify our source analysis

    Experiment

    main.c

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>
    
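    int main(void) {
        /* Raw system call, bypassing the libc execve() wrapper.
         * Linux accepts NULL for argv and envp, but that is not portable. */
        syscall(__NR_execve, "../main2.out", NULL, NULL);
        return 1; /* reached only if execve failed */
    }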
    

    main2.c

    #include <stdio.h>
    
    int main(void) {
        puts("hello main2");
    }
    

    Compile and run:

    gcc -o main.out main.c
    gcc -o ../main2.out main2.c
    ./main.out
    

    Output:

    hello main2
    

    Tested in Ubuntu 16.10.
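
To compare relative and absolute paths in a single run, here is a minimal sketch that is not part of the original experiment. The helper name exec_shell_via is hypothetical, and the sketch assumes a POSIX shell is installed at /bin/sh. It forks, changes directory in the child, makes the raw execve system call on the given path with arguments equivalent to sh -c "exit 7", and reports the child's exit status.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, chdir to `cwd` in the child, then make the raw
 * execve system call on `path` with arguments equivalent to: sh -c "exit 7".
 * Returns the child's exit status, or -1 if fork/waitpid failed or the child
 * did not exit normally. Assumes `path` names a POSIX shell. */
int exec_shell_via(const char *cwd, const char *path) {
    assert(cwd != NULL && path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *const argv[] = { "sh", "-c", "exit 7", NULL };
        char *const envp[] = { NULL };
        if (chdir(cwd) == 0)
            syscall(__NR_execve, path, argv, envp);
        _exit(127); /* reached only if chdir or execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under those assumptions, exec_shell_via("/bin", "./sh"), exec_shell_via("/", "../bin/sh") and exec_shell_via("/", "/bin/sh") should all return 7, while a path that does not exist should return 127.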

    Kernel source

    First, go into the kernel tree and grep for "..":

    git grep '"\.\."' fs
    

    We focus on fs because we know that execve is defined there.

    This immediately gives results like https://github.com/torvalds/linux/blob/v4.9/fs/namei.c#L1759 which clearly indicate that the kernel knows about ..:

    /*
     * "." and ".." are special - ".." especially so because it has
     * to be able to know about the current root directory and
     * parent relationships.
     */
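
To make that comment concrete, here is a toy user-space model (an illustration, not kernel code) of why ".." needs to know about the root: it resolves a path against a working directory purely lexically and refuses to climb above "/". The names toy_resolve, walk, push and pop are hypothetical. The real kernel walks dentries and mounts and also follows symlinks, none of which this sketch attempts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one component to `out`, which always holds an absolute path. */
static int push(char *out, size_t outsz, const char *comp, size_t len) {
    size_t cur = strlen(out);
    size_t need_sep = cur > 1;          /* "/" already ends in a separator */
    if (cur + need_sep + len + 1 > outsz)
        return -1;                      /* result would not fit */
    if (need_sep)
        out[cur++] = '/';
    memcpy(out + cur, comp, len);
    out[cur + len] = '\0';
    return 0;
}

/* Drop the last component; the parent of "/" is "/" itself. */
static void pop(char *out) {
    char *slash = strrchr(out, '/');
    if (slash == out)
        out[1] = '\0';
    else
        *slash = '\0';
}

/* Apply every component of `p` to `out`, treating "." and ".." specially. */
static int walk(char *out, size_t outsz, const char *p) {
    while (*p) {
        while (*p == '/')
            p++;                        /* skip separators */
        size_t len = strcspn(p, "/");
        if (len == 0)
            break;
        if (len == 1 && p[0] == '.') {
            /* "." : stay where we are */
        } else if (len == 2 && p[0] == '.' && p[1] == '.') {
            pop(out);                   /* ".." : go to the parent */
        } else if (push(out, outsz, p, len) < 0) {
            return -1;
        }
        p += len;
    }
    return 0;
}

/* Resolve `path` against the absolute directory `cwd`, lexically only.
 * Returns 0 on success, -1 if `out` is too small. */
int toy_resolve(const char *cwd, const char *path, char *out, size_t outsz) {
    assert(cwd && path && out && outsz >= 2);
    assert(cwd[0] == '/');
    strcpy(out, "/");
    if (path[0] != '/' && walk(out, outsz, cwd) < 0)
        return -1;
    return walk(out, outsz, path);
}
```

For example, toy_resolve("/d", "../b.out", buf, sizeof buf) leaves "/b.out" in buf, and so does toy_resolve("/", "../../b.out", buf, sizeof buf), because the toy never goes above the root.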
    

    We then look at the definition of execve https://github.com/torvalds/linux/blob/v4.9/fs/exec.c#L1869 and the first thing it does is to call getname() on the input path:

    SYSCALL_DEFINE3(execve,
            const char __user *, filename,
            const char __user *const __user *, argv,
            const char __user *const __user *, envp)
    {
        return do_execve(getname(filename), argv, envp);
    }
    

    getname is defined in fs/namei.c, which is the file where the above ".." quote came from.

    I haven't followed the full call path, but I bet that the name returned by getname ends up going through .. resolution.

    follow_dotdot in the same file looks especially promising.

    GDB + QEMU

    Reading the source is great, but on its own it cannot tell us whether those code paths are actually taken.

    There are two ways to check:

    • printk, recompile, printk, recompile
    • GDB + QEMU. Setup is a bit rougher, but once done it is pure bliss

    First get the setup working as explained at: How to debug the Linux kernel with GDB and QEMU?

    Now, we will use two programs:

    init.c

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>
    
    int main(void) {
        chdir("d");
        syscall(__NR_execve, "../b.out", NULL, NULL);
    }
    

    b.c

    #include <unistd.h>
    #include <stdio.h>
    
    int main(void) {
        puts("hello");
        sleep(0xFFFFFFFF);
    }
    

    And the rootfs file structure should be like:

    init
    b.out
    d/
    

    Once GDB is running, we will do:

    b sys_execve
    c
    x/s filename
    

    This outputs ../b.out, so we know it is the right syscall.

    Now the interesting ".." comment we had seen before was in a function called walk_component, so let's see if that is called:

    b walk_component
    c
    

    And yes, we hit it.

    If we read a bit into it, we see a call:

    error = handle_dots(nd, nd->last_type);
    

    which sounds promising and does:

    static inline int handle_dots(struct nameidata *nd, int type)
    {
        if (type == LAST_DOTDOT) {
            if (!nd->root.mnt)
                set_root(nd);
            if (nd->flags & LOOKUP_RCU) {
                return follow_dotdot_rcu(nd);
            } else
                return follow_dotdot(nd);
        }
        return 0;
    }
    

    So what is it that sets this type (nd->last_type) to LAST_DOTDOT?

    Well, search the source for = LAST_DOTDOT, and we find that link_path_walk is doing it.

    And even better: bt says that link_path_walk is a caller, so it will be easy to understand what is going on now.

    In link_path_walk, we see:

    if (name[0] == '.') switch (hashlen_len(hash_len)) {
        case 2:
            if (name[1] == '.') {
                type = LAST_DOTDOT;
    

    and thus the mystery is solved: the code never compares against the string "..", which is what foiled our earlier grep!

    Instead, the two dots were being checked separately (because . is a subcase).
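
The same check can be tried out in user space. The sketch below mirrors the structure of that snippet under a few assumptions: classify_component is a hypothetical name, the enum is a trimmed-down stand-in for the kernel's constants, and a plain length argument replaces hashlen_len(hash_len).

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed-down stand-in for the kernel's LAST_* constants. */
enum last_type { LAST_NORM, LAST_DOT, LAST_DOTDOT };

/* Classify one path component of length `len` (not NUL-terminated).
 * As in the kernel snippet, the dots are tested one character at a time,
 * so the string literal ".." never appears in the code. */
enum last_type classify_component(const char *name, size_t len) {
    assert(name != NULL);
    enum last_type type = LAST_NORM;

    if (len > 0 && name[0] == '.') switch (len) {
        case 2:
            if (name[1] == '.')
                type = LAST_DOTDOT;
            break;
        case 1:
            type = LAST_DOT;
            break;
    }
    return type;
}
```

Passing the first two bytes of "../b.out" yields LAST_DOTDOT, a lone "." yields LAST_DOT, and names such as ".a" or "..." fall through to LAST_NORM.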
