Question
I have to build my program on CentOS 7 and deploy it on another Linux machine. The program requires a newer version of glibc, and some libraries which were not (and will not be) installed on the target machine. So I decided to ship the executable with its dynamic libraries. I used patchelf to patch the interpreter and rpath.
I tested the executable on my machine and it works (I also checked with ldd to make sure the new rpath is used). But when I copy it to the other machine along with the libs, the program fails to run. Only this line was printed:
Illegal instruction
Here is the backtrace from gdb:
Update:
Binary
So the SIGILL was caused by the shlx instruction in the __tls_init() function. I don't know which library provides this function; I'm not sure it is from glibc.
I removed my glibc, which was copied from another computer, and used the glibc already installed on the target computer, but the problem was not fixed.
Answer 1:
I used patchelf to patch interpreter and rpath
Your question is very unclear: you changed the interpreter and the rpath to what?
I think what you did is:
- Built a new GLIBC version in a non-standard path
- Used patchelf to change your binary to point to the non-standard path
- Copied the binary and the non-standard GLIBC to the target machine
- Observed SIGILL.
Most likely cause: the non-standard GLIBC you built is not configured for your target processor, which is different from the processor used on the build machine.
GCC does not use -march=native unless asked to (its default on x86-64 is a generic baseline), but if your build passes -march=native and you build on e.g. a Haswell machine, then the binary may use AVX2 instructions, which are not supported by the target machine.
To fix this, you will need to add -march=x86-64 -mtune=generic or -march=$target_architecture to CFLAGS (and CXXFLAGS), and rebuild both GLIBC and the main program.
On the other hand, your GDB backtrace shows standard paths to GLIBC: /lib64/ld-linux-x86-64.so.2 and /lib64/libc.so.6, so maybe I didn't understand the steps you made at all.
Update:
I didn't build a new glibc but copy it from my machine to the target machine. My machine using E5-2690v4 but the target machine using E5-2470.
The E5-2690v4 is a Broadwell. The E5-2470 is a Sandy Bridge (the E5-2470 v2 is an Ivy Bridge).
The former supports AVX2, but the latter doesn't. Copying GLIBC built with AVX2 to the older machine is likely to fail with exactly the symptoms you described (and in fact should render that machine completely non-working; I am surprised anything works on it at all).
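One way to confirm such a mismatch is to compare the CPU feature flags that the kernel reports on each machine. The following is a minimal sketch, assuming Linux on x86, where /proc/cpuinfo contains a "flags" line. The function names and the REQUIRED set are illustrative and not from the original post. bmi2 is listed because shlx, the instruction named in the question, belongs to the BMI2 extension, which like AVX2 first appeared on Haswell.

```python
from pathlib import Path

# Illustrative set: extensions introduced with Haswell that older CPUs lack.
REQUIRED = {"avx2", "bmi2"}


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required=REQUIRED):
    """Return the required flags that this CPU does not report, sorted."""
    return sorted(set(required) - cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_features(cpuinfo.read_text())
        print("missing:", ", ".join(missing) if missing else "none")
```

Running this on both the build machine and the target machine shows whether the target lacks an extension that the build machine has.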
Using the GDB x/i $pc command, you can see which instruction generates SIGILL. If it's an AVX2 instruction (or another Haswell-era one, such as shlx from BMI2), that's likely the answer.
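To check a whole executable or shared library rather than a single faulting address, one option is to scan its disassembly for instructions the target CPU cannot execute. The sketch below assumes the text is in the layout that objdump -d prints (address, hex bytes and instruction separated by tabs). The function name and the mnemonic list are illustrative and not from the original post. Only BMI2 mnemonics are listed, because many AVX2 instructions share their mnemonics with AVX and cannot be told apart by name alone.

```python
# BMI2 mnemonics (illustrative list); BMI2 first shipped with Haswell.
BMI2_MNEMONICS = {"bzhi", "mulx", "pdep", "pext", "rorx", "sarx", "shlx", "shrx"}


def find_mnemonics(disassembly, wanted=BMI2_MNEMONICS):
    """Map each wanted mnemonic to the addresses where it appears."""
    hits = {}
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines look like: "  address:<TAB>hex bytes<TAB>mnemonic operands"
        if len(fields) < 3 or not fields[2].strip():
            continue
        address = fields[0].strip().rstrip(":")
        mnemonic = fields[2].split()[0]
        if mnemonic in wanted:
            hits.setdefault(mnemonic, []).append(address)
    return hits


if __name__ == "__main__":
    # Hand-written sample in the objdump -d layout, not real program output.
    sample = (
        "  401130:\t89 f8                \tmov    %edi,%eax\n"
        "  401132:\tc4 e2 71 f7 c0       \tshlx   %ecx,%eax,%eax\n"
        "  401137:\tc3                   \tret\n"
    )
    print(find_mnemonics(sample))
```

In practice the input would be the output of objdump -d for the executable and for each bundled library. Any hit points at the object that was compiled for a newer CPU than the target.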
Source: https://stackoverflow.com/questions/50945287/illegal-instruction-when-run-precompiled-program-on-other-machine