Question
How do I find a program's argc and argv from a shared object? I am writing a library in C that will be loaded via LD_PRELOAD. I've been able to find the stack two different ways:
- Read rsp via an inline __asm__ call.
- Read /proc/<pid>/maps and parse the entry for the stack.
I can then create a pointer, point it at the stack segment, and iterate through it looking for data. The problem is that I can't figure out an efficient way to determine which bytes are argc and which are the pointer to the pointer to the argv strings.
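A minimal sketch in C of the second approach in the list above, assuming Linux with procfs mounted. The function name find_stack_range is hypothetical, and /proc/self/maps is used in place of /proc/<pid>/maps so that the process inspects itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look up the main thread's stack mapping.
 * Returns 0 and fills *start / *end on success, -1 on failure. */
int find_stack_range(uintptr_t *start, uintptr_t *end)
{
    assert(start != NULL && end != NULL);

    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[512];
    int result = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long lo, hi;
        /* Each line starts with "lo-hi" in hex; the stack is tagged "[stack]". */
        if (strstr(line, "[stack]") != NULL &&
            sscanf(line, "%lx-%lx", &lo, &hi) == 2) {
            *start = (uintptr_t)lo;
            *end = (uintptr_t)hi;
            result = 0;
            break;
        }
    }
    fclose(maps);
    return result;
}
```

Calling find_stack_range(&lo, &hi) from the main thread should yield a range that contains the address of any local variable in the caller. It only bounds the search; it does not say where argc sits inside that range.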
I know that /proc/<pid>/cmdline also contains the arguments, each separated by 0x00, but I'm interested in finding everything in memory.
In gdb I see a DWORD for argc followed by a QWORD which is the first pointer. 20 bytes before the address of argc is a pointer that points back into the main program's code segment. But that's not a deterministic way to identify argc and argv.
I've seen a few posts but no working code:
- http://linux.derkeiler.com/Newsgroups/comp.os.linux.development.system/2005-07/0296.html
- https://sourceware.org/ml/libc-help/2009-11/msg00010.html
Answer 1:
The response in your second link contains working source code, which worked fine for me (on a GNU/Linux, ELF-based system), including under LD_PRELOAD.
The code is very short; it consists of a function:
void foo(int argc, char **argv, char **env) {
    // Do something with argc, argv (and env, if desired)
}
and a pointer to that function in the .init_array section:
__attribute__((section(".init_array"), used)) static void (*foo_constructor)(int, char **, char **) = &foo;
Putting that into a shared library and then LD_PRELOADing the shared library certainly triggered the call to foo when I tried it, and it was clearly called with the argc and argv which would later be passed to main (and also the value of environ).
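As a sketch of how a preloaded library might keep these values around for later calls, the initializer can stash them in globals. This assumes glibc, which passes argc, argv and envp to .init_array entries; other C libraries may not. The names capture_args, preload_argc and preload_arg are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Filled in by capture_args before main() runs (assumes glibc). */
static int saved_argc;
static char **saved_argv;

/* Initializer: glibc calls .init_array entries with (argc, argv, envp). */
static void capture_args(int argc, char **argv, char **envp)
{
    (void)envp; /* unused in this sketch */
    saved_argc = argc;
    saved_argv = argv;
}

/* Register the initializer; "used" keeps the otherwise unreferenced
 * pointer from being discarded by the optimizer. */
__attribute__((section(".init_array"), used))
static void (*capture_args_entry)(int, char **, char **) = capture_args;

/* Accessors for the rest of the library. */
int preload_argc(void)
{
    return saved_argc;
}

const char *preload_arg(int index)
{
    assert(index >= 0 && index < saved_argc);
    return saved_argv[index];
}

/* Possible build and use, with hypothetical file names:
 *   gcc -shared -fPIC -o libcapture.so capture.c
 *   LD_PRELOAD=./libcapture.so ./program arg1 arg2
 */
```

The saved argv pointer refers to the same array that main receives, so any changes the program makes to it during argument parsing will be visible through preload_arg.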
Answer 2:
The most reliable approach is probably to use /proc/<pid>/cmdline, because it is provided by the kernel and doesn't change with the C implementation (whose behavior would depend, for example, on the processor you're using).
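A minimal sketch of reading that file in C, assuming Linux with procfs mounted. The function name read_cmdline is hypothetical, and /proc/self/cmdline is used so that the process reads its own arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read /proc/self/cmdline into a malloc'd buffer.
 * Returns the number of NUL-terminated arguments, or -1 on failure.
 * On success the caller owns *buf_out and must free() it. */
int read_cmdline(char **buf_out, size_t *len_out)
{
    assert(buf_out != NULL && len_out != NULL);

    FILE *f = fopen("/proc/self/cmdline", "rb");
    if (f == NULL)
        return -1;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }

    size_t n;
    while ((n = fread(buf + len, 1, cap - len, f)) > 0) {
        len += n;
        if (len == cap) { /* buffer full: double it before reading more */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                fclose(f);
                return -1;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    fclose(f);

    /* Arguments are separated by 0x00, so counting NULs counts arguments. */
    int argc = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\0')
            argc++;

    *buf_out = buf;
    *len_out = len;
    return argc;
}
```

Walking the buffer with strlen from offset 0, stepping past each terminator, yields the individual argument strings.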
The problem is that on some platforms the arguments to a function (e.g. main) are passed on the stack, while on others they may be passed in registers (e.g. on x86-64). If they are passed in registers and optimizations are enabled, main will not store them in memory unless it needs to; that is, they are unlikely to remain in memory unless you explicitly save them yourself.
Even if the arguments are passed on the stack, the exact location of main's arguments may differ from one version of the compiler/implementation to the next. That means there's hardly any reliable method of retrieving them from the stack (and, as someone pointed out, they may be modified during the execution of main as part of command-line parsing).
Even the way the kernel passes the arguments to the program doesn't help much as they are passed via registers - which means that where they're going to be stored is entirely up to the CRT init (which in turn may change from version to version).
In short, retrieving argv and argc later on requires explicit support from the CRT you're using (Microsoft's CRT provides that, but GNU's doesn't, AFAIK).
What you could do, of course, is grab the source of GCC and patch the CRT init to store argv and argc somewhere you can retrieve them later. That would of course not work if you need to access them before the program's CRT init has run (e.g. during dynamic linking).
Answer 3:
This is a bad idea, but I'm not naive enough to say you don't have a valid reason for it.
There's no good way to find argc/argv if all you know is the location of the stack. Luckily, envp is directly after argv on the stack, and every libc that I know of puts envp in the __environ global. So by going backwards from __environ, you can find argc and argv. Here's some example code written in Rust, which should be pretty easy to port to C++:
use std::os::raw::{c_char, c_int};

extern "C" {
    pub static __environ: *const *const c_char;
}

fn raw_args() -> (c_int, *const *const c_char) {
    // Start at envp[0] and step back one word to argv's NULL terminator.
    let mut walk_environ = unsafe { __environ as *const usize };
    walk_environ = walk_environ.wrapping_offset(-1);
    let mut i = 0;
    loop {
        // The word just below the candidate start of argv should hold argc.
        let argc_ptr = walk_environ.wrapping_offset(-1) as *const c_int;
        let argc = unsafe { *argc_ptr };
        if argc == i {
            break (argc, walk_environ as *const *const c_char);
        }
        walk_environ = walk_environ.wrapping_offset(-1);
        i += 1;
    }
}
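A rough C port of the same backward walk, as a sketch rather than a definitive implementation. It assumes environ still points at the original envp array on the initial stack (setenv or putenv may relocate it), and that argc occupies a full pointer-sized word directly below argv[0], as on x86-64 Linux. The name find_args_from_environ and the MAX_ARGS bound are hypothetical additions, as is the early NULL check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

extern char **environ;

/* Hypothetical upper bound so a failed search cannot run forever. */
#define MAX_ARGS 65536

/* Walk backwards from environ looking for a word equal to the number of
 * argv slots skipped so far. Returns argc and stores argv in *argv_out,
 * or returns -1 if nothing plausible is found.
 * Note: indexing below environ is outside what the C standard defines;
 * it relies on the initial stack layout described above. */
int find_args_from_environ(char ***argv_out)
{
    assert(argv_out != NULL);

    if (environ == NULL)
        return -1;

    char **slot = environ - 1; /* should be argv[argc], i.e. NULL */
    if (*slot != NULL)
        return -1; /* environ was probably moved off the initial stack */

    for (uintptr_t i = 0; i < MAX_ARGS; i++) {
        uintptr_t word;
        memcpy(&word, slot - 1, sizeof word); /* word just below the candidate argv */
        if (word == i) {
            *argv_out = slot;
            return (int)i;
        }
        slot -= 1;
    }
    return -1;
}
```

On success, argv[argc] is the NULL terminator the walk started from. Comparing the whole word rather than its low 32 bits makes a false match against part of a pointer value less likely.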
Source: https://stackoverflow.com/questions/34915875/find-argc-and-argv-from-a-library