When do .so files get loaded on Linux?

Submitted by 自古美人都是妖i on 2019-12-11 02:35:29

Question


I have a shared object (a.so) which is linked to my executable myexe. a.so exposes a function called get_val(), which myexe uses.

When will a.so be loaded into myexe's process address space: when myexe calls the get_val() API, or when myexe is launched?


Answer 1:


There are two (three) types of libraries:

  • static libraries (suffix: .a / .lib), which become part of the binary. Strictly speaking, not the whole library is copied in, only those object files from it that are needed to resolve otherwise undefined symbols.
  • shared (dynamic) libraries (suffix: .so / .dll), which come in two flavors, distinguished by the time the library is loaded:
    • dynamically linked libraries, which you name to the compiler and linker at build time and call just like static libraries, but which do not become part of your binary. They are loaded at program start by the dynamic linker/loader (on Linux, ld.so / ld-linux.so), before main() runs.
    • dynamically loaded libraries, for which you call dlopen() yourself.

(The terms seem a bit fuzzy; I've seen different literature using different terms. The terms above are how I memorized them in order to remember the concepts.)

So, if you use a.so without calling dlopen() yourself, a.so is a dynamically linked library and is loaded at program start. In that case, removing a.so from the system will prevent your program from starting: the executable itself is loaded, but startup fails before main() gets called.
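One way to observe the load time yourself is to give the library a constructor. The following is a minimal sketch of what the source of a.so might look like, not the asker's actual code: the return value 42 and the message text are made up for illustration, and `__attribute__((constructor))` is a GCC/Clang extension rather than standard C.

```c
#include <stdio.h>

/* GCC/Clang extension: this function runs when the object that contains it
   is loaded, i.e. before main() for a library linked at build time, or
   inside the dlopen() call for a library loaded explicitly. */
__attribute__((constructor))
static void on_load(void)
{
    fprintf(stderr, "a.so: loaded\n");
}

/* The function from the question; the value 42 is a placeholder. */
int get_val(void)
{
    return 42;
}
```

If this is built as a.so and linked into myexe at build time, the message should appear before anything main() prints, even if main() never calls get_val().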

If you use a.so by calling dlopen() yourself, loading is completely under your control.

On your questions

Q1: If you call dlopen() yourself, a.so is loaded by that call, i.e. before dlopen() returns, whichever flag you pass. The flag only controls symbol binding: with RTLD_NOW all undefined symbols are resolved before dlopen() returns, while with RTLD_LAZY function symbols are resolved when the code that references them first runs. If you do not call dlopen() yourself but let the dynamic linker do the work for you, a.so is loaded at program start.

Q2: Suppose you delete a.so. If you call dlopen() yourself, the call fails and returns NULL (dlerror() describes why). As long as the program checks for that, or never reaches the code that needs a.so, it will run happily; otherwise it will typically crash with a signal. If you do not call dlopen() but let the dynamic linker do the work for you, the program will not start successfully.

Q3: Effectively, a.so is loaded either by the dynamic linker at program start or by a call to dlopen() (or something equivalent that would replace it). The question is just whether you call dlopen() yourself or let the environment (i.e. the dynamic linker) do the work for you.

Disclaimer: I'm not an expert on this stuff, and some parts of my answer may be wrong. I'll verify those parts of my answer on which I have doubts myself, i.e. whether it is libc that calls dlopen() or something else, and whether or not it's possible to have lazy binding even if you're not using dlopen() yourself. I'll update the answer once I have the results.




Answer 2:


I guess you are on Linux/x86-64. It is OS specific.

Generally, an ELF shared library is loaded at program start by ld-linux.so(8). In practice, the shared library should be position-independent code (PIC).
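To see which objects are mapped into the process at a given moment, glibc offers dl_iterate_phdr(3). The sketch below assumes Linux with glibc; count_loaded_objects() and print_object() are hypothetical helper names, not part of any library.

```c
#define _GNU_SOURCE             /* dl_iterate_phdr() is a GNU extension */
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object: prints its path and counts it. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;                 /* unused */

    /* The main executable is reported with an empty name. */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(main executable)";
    printf("%2d: %s\n", *count, name);
    (*count)++;
    return 0;                   /* returning 0 continues the iteration */
}

/* Hypothetical helper: lists the objects mapped right now and returns
   how many there are. */
int count_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling count_loaded_objects() as the first statement of main() shows the libraries that the dynamic linker has already mapped before main() started; calling it again after a dlopen() shows the additional object.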

But it depends: there is also dlopen(3), with its flags RTLD_NOW and RTLD_LAZY, for loading a library explicitly at run time.
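As documented in dlopen(3), both flags make dlopen() map the library before it returns; RTLD_LAZY only defers the resolution of function symbols until they are first used. A minimal sketch of probing a library with either flag follows. can_load() is a hypothetical helper name, and on glibc older than 2.34 the program needs to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: try to load `path` with the given binding mode
   (RTLD_LAZY or RTLD_NOW). Returns 1 if the library could be loaded,
   0 if dlopen() failed. */
int can_load(const char *path, int mode)
{
    void *handle = dlopen(path, mode);
    if (!handle) {
        /* dlerror() returns a human-readable reason for the failure. */
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
        return 0;
    }
    /* With either mode the library is mapped at this point. */
    dlclose(handle);
    return 1;
}
```

For example, can_load("./no_such_library.so", RTLD_LAZY) returns 0 and prints the reason, which is also what happens when a library has been deleted; the program itself keeps running.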

Read Drepper's paper, How To Write Shared Libraries, and the x86-64 ABI specification.

You could use strace(1) to find out what is happening on your own Linux system.

You could in principle dynamically load a foo.so by using mmap(2) and processing the relocations yourself. I've done (nearly) that in the previous century (for SPARC), and believe me, it is a tedious task.

BTW, dlopen is implemented in GNU libc and in musl-libc. Both are free software, so you could study their source code.

Read also the Program Library HowTo, which explains some of the details. In brief:

  • compile the source files of the shared object as PIC (note the -c, which compiles without linking) using

     gcc -Wall -fPIC -O -c src1.c -o src1.pic.o
     gcc -Wall -fPIC -O -c src2.c -o src2.pic.o
    
  • link them in a shared library foo.so using

     gcc -shared src1.pic.o src2.pic.o -o foo.so
    
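  • pass dlopen a path that contains a slash (here ./foo.so), so that it opens that file directly instead of searching the library directories, e.g.

         /* needs <dlfcn.h>, <stdio.h> and <stdlib.h> */
         void* dlh = dlopen("./foo.so", RTLD_NOW);
         if (!dlh) {
             fprintf(stderr, "dlopen failed: %s\n", dlerror());
             exit(EXIT_FAILURE);
         }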
    

Then you could have a convention saying that your foo.so plugin should have a function of signature

  typedef int sayhello_sig_t(const char*);

that is named say_hello and you get its address using:

  sayhello_sig_t* funptr = dlsym(dlh, "say_hello");
  if (!funptr) { 
     fprintf(stderr, 
             "dlsym say_hello failure: %s\n", dlerror());
     exit(EXIT_FAILURE);
  }
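For completeness, the plugin side of that convention could look like the sketch below, which would be one of the source files compiled into foo.so. The body and the return-value convention (0 on success, -1 on a NULL argument) are assumptions made for illustration; only the name say_hello and its signature come from the convention above.

```c
#include <stdio.h>

/* Must match the convention: typedef int sayhello_sig_t(const char*);
   The function needs external linkage (no `static`) so that dlsym()
   can find it by name. */
int say_hello(const char *name)
{
    if (!name)
        return -1;              /* assumed error convention */
    printf("hello, %s\n", name);
    return 0;                   /* assumed success convention */
}
```

Once funptr has been obtained with dlsym as shown above, the host program calls the plugin with funptr("world").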


Source: https://stackoverflow.com/questions/29285546/when-do-so-files-get-loaded-linux
