Shall sys_execve() in kernel level code receive absolute or relative path for the filename parameter?
sys_execve can take either absolute or relative paths.
Let's verify it in the following ways:
Experiment
main.c
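#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
    /* Raw execve system call with a path relative to the current directory. */
    syscall(__NR_execve, "../main2.out", NULL, NULL);
}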
main2.c
#include <stdio.h>

int main(void) {
    puts("hello main2");
}
Compile and run:
gcc -o main.out main.c
gcc -o ../main2.out main2.c
./main.out
Output:
hello main2
Tested in Ubuntu 16.10.
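To see which file a relative path such as ../main2.out names from the current working directory without exec'ing anything, a small user-space sketch can help. It relies on realpath(3) from libc, which is not what the kernel does internally (realpath also resolves symlinks), and the helper name resolve_from_cwd is made up for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a kernel or libc API): store the canonical
 * absolute form of `path` in `out`, interpreting a relative `path`
 * against the current working directory. Returns 0 on success and -1
 * on failure (realpath sets errno, e.g. ENOENT if the file is missing).
 */
int resolve_from_cwd(const char *path, char out[PATH_MAX])
{
    assert(path != NULL && out != NULL);
    return realpath(path, out) != NULL ? 0 : -1;
}
```

Calling resolve_from_cwd("../main2.out", buf) from the directory that holds main.out should fill buf with the absolute path of the file that the experiment above executes.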
Kernel source
First, go into the kernel tree and run:
git grep '"\.\."' fs
We focus on fs because we know that execve is defined there.
This immediately gives results like: https://github.com/torvalds/linux/blob/v4.9/fs/namei.c#L1759 which clearly indicate that the kernel knows about ..:
/*
* "." and ".." are special - ".." especially so because it has
* to be able to know about the current root directory and
* parent relationships.
*/
We then look at the definition of execve https://github.com/torvalds/linux/blob/v4.9/fs/exec.c#L1869 and the first thing it does is to call getname() on the input path:
SYSCALL_DEFINE3(execve,
        const char __user *, filename,
        const char __user *const __user *, argv,
        const char __user *const __user *, envp)
{
    return do_execve(getname(filename), argv, envp);
}
getname is defined in fs/namei.c, which is the file where the above ".." quote came from.
I haven't bothered to follow the full call path, but I bet that the name it returns ends up going through .. resolution further down the call chain.
follow_dotdot in the same file looks especially promising.
GDB + QEMU
Reading the source is great, but we can never be sure that the code paths are actually used.
There are two ways to check that:
printk, recompile, printk, recompile
GDB + QEMU
First get the setup working as explained at: How to debug the Linux kernel with GDB and QEMU?
Now, we will use two programs:
init.c
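#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
    /* Enter d/ first so that "../b.out" has to walk back up to the root. */
    chdir("d");
    syscall(__NR_execve, "../b.out", NULL, NULL);
}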
b.c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    puts("hello");
    /* Sleep for a very long time so that the system keeps running. */
    sleep(0xFFFFFFFF);
}
And the rootfs file structure should be like:
init
b.out
d/
Once GDB is running, we will do:
b sys_execve
c
x/s filename
This outputs ../b.out, so we know it is the right syscall.
Now the interesting ".." comment we had seen before was in a function called walk_component, so let's see if that is called:
b walk_component
c
And yes, we hit it.
If we read a bit into it, we see a call:
error = handle_dots(nd, nd->last_type);
which sounds promising and does:
static inline int handle_dots(struct nameidata *nd, int type)
{
    if (type == LAST_DOTDOT) {
        if (!nd->root.mnt)
            set_root(nd);
        if (nd->flags & LOOKUP_RCU) {
            return follow_dotdot_rcu(nd);
        } else
            return follow_dotdot(nd);
    }
    return 0;
}
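As a rough intuition for what following ".." amounts to, here is a purely lexical toy model. It is only a sketch: the real follow_dotdot works on kernel lookup state rather than strings, and the name toy_follow_dotdot is invented here. It does capture the point from the comment quoted earlier, that ".." has to know about the root directory.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model, not kernel code: `path` is a normalized absolute path
 * ("/", "/a", "/a/b", ...) standing in for the current lookup position.
 * Moving to the parent drops the last component, except at the root,
 * where ".." stays at the root.
 */
void toy_follow_dotdot(char *path)
{
    assert(path != NULL && path[0] == '/');
    char *last_slash = strrchr(path, '/');
    if (last_slash == path)
        path[1] = '\0';      /* parent of "/a" (or of "/") is "/" */
    else
        *last_slash = '\0';  /* "/a/b" becomes "/a" */
}
```

In the experiment above, init does chdir("d"), so the walk for "../b.out" starts at /d; in this toy model the ".." step turns "/d" into "/", after which b.out is looked up in the root.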
So what is it that sets this type (nd->last_type) to LAST_DOTDOT?
Well, search the source for = LAST_DOTDOT, and we find that link_path_walk is doing it.
And even better: bt says that link_path_walk is a caller, so it will be easy to understand what is going on now.
In link_path_walk, we see:
if (name[0] == '.') switch (hashlen_len(hash_len)) {
    case 2:
        if (name[1] == '.') {
            type = LAST_DOTDOT;
and thus the mystery is solved: ".." was not the check that was being done, which foiled our previous greps!
Instead, the two dots were being checked separately (because . is a subcase).
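The check can be mirrored in user space to make the logic easier to play with. The following is a minimal sketch, assuming that hashlen_len(hash_len) in the snippet above yields the length of the current path component; the TOY_* names and toy_classify are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel's LAST_* types. */
enum toy_last_type { TOY_LAST_NORM, TOY_LAST_DOT, TOY_LAST_DOTDOT };

/*
 * Classify one path component of length `len` starting at `name`.
 * The first character is tested on its own, then the length decides
 * between "." and "..", so the string ".." never appears as a literal.
 */
enum toy_last_type toy_classify(const char *name, size_t len)
{
    assert(name != NULL);
    enum toy_last_type type = TOY_LAST_NORM;

    if (len > 0 && name[0] == '.') {
        switch (len) {
        case 2:
            if (name[1] == '.')
                type = TOY_LAST_DOTDOT;
            break;
        case 1:
            type = TOY_LAST_DOT;
            break;
        }
    }
    return type;
}
```

For the path "../b.out" from the experiment, the first component has length 2 and classifies as TOY_LAST_DOTDOT, while "b.out" is an ordinary component.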