Compile Python code to statically linked executable with Cython

耶瑟儿~ 2020-11-30 14:08

I have a pure Python script that I would like to distribute to systems with an unknown Python configuration. Therefore, I would like to compile the Python code to a stand-alone

1 answer
  • 2020-11-30 14:38

    The problems you experienced obviously come from the linker (gcc starts a linker under the hood; to see this, run gcc with -v, i.e. in verbose mode). So let's start with a short reminder of how the linking process works:

    The linker keeps track of the names of all symbols it still needs to resolve. In the beginning this is only the symbol main. What happens when the linker inspects a library?

    1. If it is a static library, the linker looks at every object file in this library, and if an object file defines some of the looked-for symbols, the whole object file is included (which means some symbols become resolved, but new unresolved symbols can be added as well). The linker might need to pass over a static library multiple times.

    2. If it is a shared library, the linker views it as a library consisting of a single huge object file (after all, the whole library has to be loaded at run time anyway, so there is no point in passing over it multiple times to prune unused symbols). If it defines at least one needed symbol, the whole library is "linked" (not really, the actual linking happens at run time; this is a kind of dry run); if not, the whole library is discarded and never looked at again.

    For example if you link with:

    gcc -L/path -lpython3.x <other libs> foo.o 
    

    you will get a problem, no matter whether python3.x is a shared or a static lib: when the linker sees it, it is looking only for the symbol main, but this symbol is not defined in the python-lib, so the python-lib is discarded and never looked at again. Only when the linker sees the object file foo.o does it realize that all the Python symbols are needed, but by then it is already too late.

    There is a simple rule to handle this problem: put the object files first! That means:

    gcc -L/path  foo.o -lpython3.x <other libs> 
    

    Now the linker knows what it needs from the python-lib, when it first sees it.
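
    The ordering rule can be tried out on a toy static library. The following is only a sketch: libgreet.a, greet.c and main.c are hypothetical names made up for this illustration, and the script skips itself when gcc or ar is missing. With GNU ld the "library first" link is expected to fail with an undefined reference, while other linkers may be more forgiving, so the script merely reports both outcomes.

```shell
#!/bin/sh
# Sketch: link order with a static library (all file names are hypothetical).
workdir=$(mktemp -d)
lib_first=skipped
obj_first=skipped

if command -v gcc >/dev/null 2>&1 && command -v ar >/dev/null 2>&1; then
    # A one-function "library" and a main program that needs it.
    printf 'int greet(void) { return 42; }\n' > "$workdir/greet.c"
    printf 'int greet(void);\nint main(void) { return greet() == 42 ? 0 : 1; }\n' > "$workdir/main.c"
    gcc -c "$workdir/greet.c" -o "$workdir/greet.o"
    gcc -c "$workdir/main.c" -o "$workdir/main.o"
    ar rcs "$workdir/libgreet.a" "$workdir/greet.o"

    # Library before the object file: greet is not yet needed when the
    # archive is scanned.
    if gcc -L"$workdir" -lgreet "$workdir/main.o" -o "$workdir/lib_first" 2>/dev/null; then
        lib_first=linked
    else
        lib_first=failed
    fi

    # Object file before the library: greet is already on the list of
    # unresolved symbols when the archive is scanned.
    if gcc -L"$workdir" "$workdir/main.o" -lgreet -o "$workdir/obj_first" 2>/dev/null; then
        obj_first=linked
    else
        obj_first=failed
    fi
fi

echo "library first: $lib_first"
echo "object first:  $obj_first"
rm -rf "$workdir"
```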

    There are other ways to achieve a similar result.

    A) Let the linker iterate over a group of archives for as long as at least one new symbol definition is added per sweep:

    gcc -L/path -Wl,--start-group -lpython3.x <other libs> foo.o -Wl,--end-group
    

    The linker options -Wl,--start-group and -Wl,--end-group tell the linker to iterate more than once over this group of archives, so the linker gets a second chance (or more) to include symbols. This option can lead to longer link times.
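
    A case where a single pass is not enough even with the object file first is a pair of archives that depend on each other. The sketch below builds such a pair; libfirst.a, libsecond.a and the functions a, b and c are hypothetical names, and the script only runs on Linux with gcc and ar available, since the group options are GNU ld options. It reports whether the plain link and the grouped link succeed.

```shell
#!/bin/sh
# Sketch: two archives with a circular dependency (all names are hypothetical).
workdir=$(mktemp -d)
plain_link=skipped
group_link=skipped

if [ "$(uname -s)" = "Linux" ] \
   && command -v gcc >/dev/null 2>&1 && command -v ar >/dev/null 2>&1; then
    # main -> a (libfirst) -> b (libsecond) -> c (libfirst again)
    printf 'int a(void);\nint main(void) { return a(); }\n' > "$workdir/main.c"
    printf 'int b(void);\nint a(void) { return b(); }\n' > "$workdir/a.c"
    printf 'int c(void);\nint b(void) { return c(); }\n' > "$workdir/b.c"
    printf 'int c(void) { return 0; }\n' > "$workdir/c.c"
    for name in main a b c; do
        gcc -c "$workdir/$name.c" -o "$workdir/$name.o"
    done
    ar rcs "$workdir/libfirst.a" "$workdir/a.o" "$workdir/c.o"
    ar rcs "$workdir/libsecond.a" "$workdir/b.o"

    # One pass per archive: c becomes needed only after libfirst.a was scanned.
    if gcc "$workdir/main.o" -L"$workdir" -lfirst -lsecond \
           -o "$workdir/plain" 2>/dev/null; then
        plain_link=linked
    else
        plain_link=failed
    fi

    # Grouped: the archives are rescanned until no new symbols get resolved.
    if gcc "$workdir/main.o" -L"$workdir" \
           -Wl,--start-group -lfirst -lsecond -Wl,--end-group \
           -o "$workdir/grouped" 2>/dev/null; then
        group_link=linked
    else
        group_link=failed
    fi
fi

echo "without group: $plain_link"
echo "with group:    $group_link"
rm -rf "$workdir"
```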

    B) Switching on the option --no-as-needed leads to a shared library (and only a shared library) being linked in, no matter whether the symbols defined in this library are needed or not.

    gcc -L/path -Wl,--no-as-needed -lpython3.x -Wl,--as-needed <other libs> foo.o
    

    Actually, ld's own default is --no-as-needed, but some gcc frontends (for example the one shipped with Ubuntu) call ld with the option --as-needed, so we restore the default behavior by adding --no-as-needed before the python-library and then switch back to --as-needed after it.


    Now to your problem of static linking. I don't think it is advisable to use static versions of all standard libraries (above all glibc); what you should probably do is link only the python-library statically.

    The rules of linkage are simple: by default the linker tries to open the shared version of a library first and then the static version. For example, for the library libmylib and the search paths A and B, i.e.

     -L/A -L/B -lmylib
    

    it tries to open libraries in the following order:

    A/libmylib.so
    A/libmylib.a
    B/libmylib.so
    B/libmylib.a
    

    Thus, if folder A has only a static version, this static version is used (no matter whether there is a shared version in folder B).

    Because it is quite opaque which library is really used (it depends on the setup of your system), one usually switches on the linker's logging via -Wl,-verbose for troubleshooting.
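
    The search order described above can be mimicked with a few lines of shell. This is only an illustration of the rule, not how ld is implemented: find_lib is a hypothetical helper, and the demo directories A and B contain empty placeholder files instead of real libraries.

```shell
#!/bin/sh
# find_lib NAME DIR...: print the first file the search order described
# above would pick (per directory: shared version first, then static).
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        for ext in so a; do
            candidate="$dir/lib$name.$ext"
            if [ -f "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Demo: A holds only a static version, B only a shared one.
demo=$(mktemp -d)
mkdir "$demo/A" "$demo/B"
: > "$demo/A/libmylib.a"
: > "$demo/B/libmylib.so"

picked=$(find_lib mylib "$demo/A" "$demo/B")
relative=${picked#"$demo"/}
echo "$relative"   # prints A/libmylib.a
rm -rf "$demo"
```

    Because A comes first on the command line, its static version wins even though B offers a shared one.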

    By using the option -Bstatic one can enforce the usage of the static version of a library:

    gcc  foo.o -L/path -Wl,-Bstatic -lpython3.x -Wl,-Bdynamic <other libs>  -Wl,-verbose -o foo
    

    Things to note:

    1. foo.o is linked before the libraries.
    2. Static mode is switched off directly after the python-library, so the other libraries are linked dynamically.

    And now:

     gcc <cflags> -L/paths foo.c -Wl,-Bstatic -lpython3.X -Wl,-Bdynamic <other libs> -o foo -Wl,-verbose
    ...
    attempt to open path/libpython3.6m.a succeeded
    ...
    ldd foo shows no dependency on python-lib
    ./foo
    It works!
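
    The output of -Wl,-verbose is long, so it helps to filter it down to the files the linker actually opened. A minimal sketch: opened_libs is a hypothetical helper name, and the sample input below is made up for the demo, modelled on the "attempt to open ... succeeded" line from the log above.

```shell
#!/bin/sh
# opened_libs: read linker -verbose output on stdin and print only the
# files that were opened successfully.
opened_libs() {
    sed -n 's/^attempt to open \(.*\) succeeded$/\1/p'
}

# With a real build this could be used as:
#   gcc ... -Wl,-verbose 2>&1 | opened_libs
# Hypothetical sample input for the demo:
sample='attempt to open path/libpython3.6m.so failed
attempt to open path/libpython3.6m.a succeeded'

opened=$(printf '%s\n' "$sample" | opened_libs)
echo "$opened"   # prints path/libpython3.6m.a
```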
    

    And yes, if you link against static glibc (which I don't recommend), you will need to remove -Xlinker -export-dynamic from the command line.

    An executable built without -Xlinker -export-dynamic will not be able to load some C extensions, which depend on this property of the executable into which they are loaded with dlopen.
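
    What -export-dynamic changes can be observed on a toy program by counting how often a function shows up in the executable's dynamic symbol table. This is a sketch under assumptions: host.c and visible_hook are hypothetical names, and the script only runs on Linux with gcc and GNU nm available. With export-dynamic the function is expected to appear in the dynamic symbol table, which is what an extension loaded with dlopen relies on.

```shell
#!/bin/sh
# Sketch: effect of -export-dynamic on the dynamic symbol table
# (host.c and visible_hook are hypothetical names).
workdir=$(mktemp -d)
plain_count=skipped
exported_count=skipped

if [ "$(uname -s)" = "Linux" ] \
   && command -v gcc >/dev/null 2>&1 && command -v nm >/dev/null 2>&1; then
    printf 'int visible_hook(void) { return 0; }\nint main(void) { return visible_hook(); }\n' \
        > "$workdir/host.c"

    gcc "$workdir/host.c" -o "$workdir/plain"
    gcc "$workdir/host.c" -Xlinker -export-dynamic -o "$workdir/exported"

    # Count the defined dynamic symbols that match the function name.
    plain_count=$(nm -D --defined-only "$workdir/plain" | grep -c visible_hook || true)
    exported_count=$(nm -D --defined-only "$workdir/exported" | grep -c visible_hook || true)
fi

echo "without -export-dynamic: $plain_count"
echo "with -export-dynamic:    $exported_count"
rm -rf "$workdir"
```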
