Segmentation fault when run as root?

[亡魂溺海] submitted on 2019-12-09 07:36:21

You might want to run your program under valgrind. I wrote a tiny program that writes far outside the bounds of an array and ran it under valgrind:

$ valgrind ./segfault
==11830== Memcheck, a memory error detector
==11830== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==11830== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==11830== Command: ./segfault
==11830== 
==11830== Invalid write of size 1
==11830==    at 0x4004BF: main (in /tmp/segfault)
==11830==  Address 0x7feff65bf is not stack'd, malloc'd or (recently) free'd
==11830== 
==11830== 
==11830== Process terminating with default action of signal 11 (SIGSEGV)
==11830==  Access not within mapped region at address 0x7FEFF65BF
==11830==    at 0x4004BF: main (in /tmp/segfault)
==11830==  If you believe this happened as a result of a stack
==11830==  overflow in your program's main thread (unlikely but
==11830==  possible), you can try to increase the size of the
==11830==  main thread stack using the --main-stacksize= flag.
==11830==  The main thread stack size used in this run was 8388608.
==11830== 
==11830== HEAP SUMMARY:
==11830==     in use at exit: 0 bytes in 0 blocks
==11830==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==11830== 
==11830== All heap blocks were freed -- no leaks are possible
==11830== 
==11830== For counts of detected and suppressed errors, rerun with: -v
==11830== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
Segmentation fault

The most important part of this output is here:

==11830== Invalid write of size 1
==11830==    at 0x4004BF: main (in /tmp/segfault)

The "write of size 1" tells you that a single byte (here, a char) was being stored, which might help you figure out which line was involved. This is the program:

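int main(int argc, char *argv[]) {
    char f[1];
    /* Deliberate bug: index -40000 is far outside f, so this one-byte
       store is the "Invalid write of size 1" that valgrind reports. */
    f[-40000]='c';
    return 0;
}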

Another very useful tool to know is gdb. If you set your resource limits to allow core dumps, you can get a core file to load into gdb. See setrlimit(2) for details on the limits, and your shell's manual (probably bash(1)) for details on the ulimit built-in command:

$ ulimit -c 1000
$ ./segfault 
Segmentation fault (core dumped)
$ gdb --core=core ./segfault
GNU gdb (GDB) 7.2-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/segfault...(no debugging symbols found)...done.
[New Thread 11951]

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libc.so.6...Reading symbols from /usr/lib/debug/lib/libc-2.12.1.so...done.
done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.12.1.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./segfault'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004004bf in main ()
(gdb) bt
#0  0x00000000004004bf in main ()
(gdb) quit

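Depending on the size of your program, you might need to allow far more than 1000 blocks for the core file. This program is so small that the backtrace has a single frame, but if it were even remotely complicated, knowing the call chain that led to the segfault could be vital information. Note that gdb reported "no debugging symbols found" for the program; compiling with -g lets it show source file names and line numbers instead of bare addresses.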

The problem is that you've invoked undefined behavior somewhere. Undefined behavior can show up differently on different machines, on different runs on the same machine, or under different users. You've got to find where you let a wild pointer happen and deal with it.

Most likely you're just getting "lucky" when running as a limited user: either the page permissions on your process happen to allow whatever invalid memory access you're making, or you have some root-specific code that isn't reached when you run as an ordinary user.

It's hard to say anything specific without seeing any code, so I'll give you some general advice: learn to use your debugger (probably gdb), and try to reproduce the failure under the debugger. If you're lucky, the segfault will still occur under the debugger, you'll get a stack trace showing where it failed, and that will give you a starting point that will let you work your way back to the true source of the problem.

If you're unlucky, the problem might disappear if you compile with debugging support, or run it under gdb. In that case you'll have to resort to code inspection, and scrub your code for any undefined behavior (for example, wild or uninitialized pointers, as Billy ONeal suggests).

doron

Set ulimit -c unlimited.

Run your program and let it crash. It should now core dump.

Run gdb <program-name> core

If you use the bt (backtrace) command, it should give you a good idea where the crash is happening. This should then help you fix it.
