Find argc and argv from a library

Submitted by 感情迁移 on 2019-12-07 10:40:53

Question


How do I find a program's argc and argv from a shared object? I am writing a library in C that will be loaded via LD_PRELOAD. I've been able to find the stack two different ways:

  1. Read rsp with an inline __asm__ statement.
  2. Read /proc/<pid>/maps and parse the entry for stack.
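
As a rough illustration of the two approaches listed above, here is a minimal C sketch. The helper names read_rsp and find_stack_range are hypothetical. The first assumes x86-64 with GCC or Clang; the second assumes a Linux /proc filesystem and only reports the main thread's [stack] mapping.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#if defined(__x86_64__)
/* Hypothetical helper: return the current stack pointer (x86-64 only). */
static inline uintptr_t read_rsp(void)
{
    uintptr_t sp;
    __asm__ volatile("mov %%rsp, %0" : "=r"(sp));
    return sp;
}
#endif

/* Hypothetical helper: find the "[stack]" line in /proc/self/maps.
 * Returns 0 and fills in the bounds on success, -1 otherwise. */
int find_stack_range(uintptr_t *start, uintptr_t *end)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[512];
    int found = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, "[stack]") == NULL)
            continue;
        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR, start, end) == 2)
            found = 0;
        break;
    }
    fclose(maps);
    return found;
}
```

Either result only bounds the search area; it says nothing about where argc and argv sit inside it.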

I can then create a pointer, point it at the stack segment, and iterate through it looking for data. The problem is that I can't figure out an efficient way to determine which bytes hold argc and which hold the pointer to the array of argv string pointers.

I know that /proc/<pid>/cmdline also contains the arguments, each separated by 0x00, but I'm interested in finding everything in memory.
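
For comparison, reading the kernel-provided copy is short. The sketch below assumes Linux; the function name count_cmdline_args is hypothetical, and it reads /proc/self/cmdline (the calling process) rather than an arbitrary pid.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: read /proc/self/cmdline into buf (at most cap bytes)
 * and return how many NUL-terminated arguments it contains, or -1 on error.
 * On success with a count of at least 1, buf itself is argv[0] as a C
 * string and each following argument starts right after the previous NUL.
 * If the file is longer than cap, the trailing argument is cut off and
 * not counted. */
int count_cmdline_args(char *buf, size_t cap)
{
    FILE *f = fopen("/proc/self/cmdline", "r");
    if (f == NULL)
        return -1;
    size_t len = fread(buf, 1, cap, f);
    fclose(f);

    int count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            count++;
    }
    return count;
}
```

This gives copies of the strings, not the addresses of the originals on the stack, which is why it does not answer the in-memory part of the question.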

In gdb I see a DWORD for argc followed by a QWORD which is the first pointer. 20 bytes before the address of argc is a pointer that points back into the main program's code segment. But that's not a deterministic way to identify argc and argv.

I've seen a few posts but no working code:

  • http://linux.derkeiler.com/Newsgroups/comp.os.linux.development.system/2005-07/0296.html
  • https://sourceware.org/ml/libc-help/2009-11/msg00010.html

Answer 1:


The response in your second link contains working source code, which worked fine for me (GNU/Linux, ELF-based system), including under LD_PRELOAD.

The code is very short; it consists of a function:

static void foo(int argc, char **argv, char **env) {
   // Do something with argc, argv (and env, if desired)
}

and a pointer to that function in the .init_array section:

// "used" keeps the otherwise unreferenced pointer from being discarded
__attribute__((section(".init_array"), used))
static void (*foo_constructor)(int, char **, char **) = &foo;

Putting that into a shared library and then LD_PRELOADing the shared library certainly triggered the call to foo when I tried it, and it was clearly called with the argc and argv which would later be passed to main (and also the value of environ).
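
A slightly fuller sketch of the same idea, written with the constructor attribute (which makes the compiler emit the .init_array entry itself). It relies on glibc passing argc, argv and envp to ELF constructors, which other C libraries may not do. The names saved_argc, saved_argv, save_args and get_saved_args are hypothetical.

```c
#include <stddef.h>

/* Hypothetical storage for the values seen at load time. */
static int saved_argc;
static char **saved_argv;

/* Assumption: glibc on Linux, which calls constructors as
 * f(argc, argv, envp). The C standard does not guarantee this. */
__attribute__((constructor))
static void save_args(int argc, char **argv, char **envp)
{
    (void)envp;            /* unused in this sketch */
    saved_argc = argc;
    saved_argv = argv;
}

/* Hypothetical accessor: returns argc and stores argv through argv_out. */
int get_saved_args(char ***argv_out)
{
    if (argv_out != NULL)
        *argv_out = saved_argv;
    return saved_argc;
}
```

Built as a shared object (for example `gcc -shared -fPIC -o libargs.so args.c`, file names made up here) and loaded with LD_PRELOAD, the constructor runs before the program's main.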




Answer 2:


The most reliable approach is probably to use /proc/<pid>/cmdline, because it is provided by the kernel and does not change with the C implementation (which in turn can depend on, for example, the processor you're using).

The problem is that on some platforms the arguments to a function (e.g. main) are passed on the stack, while on others they are passed in registers (e.g. on x86-64). If they arrive in registers and optimizations are enabled, main will not store them in memory unless it needs to; that is, they are unlikely to remain in memory unless you explicitly save them yourself.

Even if the arguments are passed on the stack, the exact location of main's arguments may differ between versions of the compiler or implementation. This means there is hardly any reliable method of retrieving them from main's stack frame (and, as someone pointed out, they may be modified during execution of main as part of command-line parsing).

Even the way the kernel passes the arguments to the program doesn't help much: on Linux they are placed at the top of the initial stack for the process entry point, but where they end up after that is entirely up to the CRT init code (which in turn may change from version to version).

In short, retrieving argc and argv later on requires explicit support from the CRT you're using (Microsoft's CRT provides this, but GNU's doesn't, AFAIK).

What you could do, of course, is grab the source of your C library (glibc on GNU systems) and patch the CRT init to store argc and argv somewhere you can retrieve them later. That would not work, of course, if you need to access them before the program's CRT init has run (e.g. during dynamic linking).




Answer 3:


This is a bad idea, but I'm not naive enough to say you don't have a valid reason for it.

There's no good way to find argc/argv if all you know is the location of the stack. Luckily, envp is directly after argv on the stack, and every libc that I know of puts envp in the __environ global. So by going backwards from __environ, you can find argc and argv. Here's some example code written in Rust, which should be pretty easy to port to C++:

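use std::os::raw::{c_char, c_int};

extern "C" {
    // Set by libc at startup; initially points at envp on the process stack.
    pub static __environ: *const *const c_char;
}

fn raw_args() -> (c_int, *const *const c_char) {
    // Step back from envp[0] onto the NULL that terminates argv.
    let mut walk_environ = unsafe { __environ as *const usize };
    walk_environ = walk_environ.wrapping_offset(-1);
    // Number of argv entries walked past so far.
    let mut i = 0;

    loop {
        // If the slot just below holds a value equal to the number of
        // entries walked so far, treat it as argc; walk_environ is then argv.
        let argc_ptr = walk_environ.wrapping_offset(-1) as *const c_int;
        let argc = unsafe { *argc_ptr };
        if argc == i {
            break (argc, walk_environ as *const *const c_char);
        }
        walk_environ = walk_environ.wrapping_offset(-1);
        i += 1;
    }
}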
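
Since the question is about a C library, here is a rough C port of the same walk. It is a sketch under several assumptions: environ still points at the original envp on the initial stack (nothing like setenv has reallocated it), and argc occupies a pointer-sized slot directly below argv[0], as on x86-64 Linux. Unlike the Rust version it compares the whole slot rather than its low 32 bits, and it gives up after max_args steps. The name find_args_from_environ is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

extern char **environ;

/* Hypothetical helper: walk backwards from environ to locate argv.
 * Returns argc and stores argv through argv_out, or returns -1 if no
 * matching slot is found within max_args steps. */
int find_args_from_environ(char ***argv_out, size_t max_args)
{
    if (environ == NULL || argv_out == NULL)
        return -1;

    /* One slot below envp[0] is the NULL that terminates argv. */
    char **slot = environ - 1;

    for (size_t n = 0; n <= max_args; n++) {
        /* After stepping back over n argv entries, the slot just below
         * should hold argc == n if we have reached argv[0]. */
        if ((uintptr_t)slot[-1] == n) {
            *argv_out = slot;
            return (int)n;
        }
        slot--;
    }
    return -1;
}
```

The match is a heuristic: it works because a valid argv pointer is very unlikely to equal a small integer, not because the layout is self-describing.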


Source: https://stackoverflow.com/questions/34915875/find-argc-and-argv-from-a-library
