If both of them contain compiled code, why can't we load the "static" files at runtime and why can't we link with the dynamic libraries at compile time? Why is there a need
There's no fundamental reason you couldn't use static library (.a) files as dynamic libraries, loading and linking them at program start time or even dynamically at runtime. However, they have not been specially compiled and prepared to run memory-mapped at an arbitrary address (not even the alignment is necessarily correct!), so the loader code would have to allocate memory, read the necessary objects from the .a files into this memory, and perform major modifications on it. And of course none of this memory would be shareable. Thus, it's probably a very bad idea...
A static library is just a tarball[1] of .o or .obj files. When an executable is linked, the referenced modules (and the ones that they reference, and then the ones that they reference, and then ... etc) are copied out of the static lib and tacked onto the end of the main program. This whole thing is paged into memory as a single OS "object".
A dynamic library just takes all the elements of the static library, links them together (resolving intramural relationships) and then maps the whole thing into memory. (Demand paging may make a partial memory presence attainable.) A certain amount of fiddling is necessary when launching dynamic programs in order to hook up the main program (which will be "statically" linked within its own modules) to the library that is shared system-wide. Sometimes this fiddling is delayed per-linkage-element until a given call is made. In a very broad, overly-conceptual sweep one could categorize static linking as eager loading and dynamic as lazy loading.
There are pluses and minuses with static libraries.
✚ No DLL hell (aka dependency hell)
✚ Much smaller memory footprint for small run-time process mixtures
− Much larger memory footprint for large run-time process mixtures of disparate programs
− Can't share any library code memory between processes except when they are running the same program
− A large set of programs (like Linux/Windows/Mac footprints) take up a lot of space as printf et al are duplicated over and over in each image
− It's difficult if not impossible to fix security bugs originating in libraries
− It's difficult if not impossible to update a library alone
✚ It's difficult if not impossible to update a library alone and break your program
[1] Actually, they aren't in tar(1) format; on Unix-like systems they are ar(1) archives, which is a related idea.
Static libraries are true libraries in the sense that they contain a library of object files. Each object file is often created from a single source file and contains machine code as well as information about data required for the code. During the link step the linker will pick the necessary object files and combine them into an executable.
One important part of machine code is that jumps, calls and data pointers will have to contain real memory addresses. However, if an object file needs to call another function in another object file it can only refer to that function using a symbol. When the linker combines the object files into executable code the symbol references are resolved and turned into real memory addresses.
A dynamic library is executable code that can be loaded into the memory and executed straight away. On some operating systems there may be an additional step where the code is rebased by moving the executable code to another location and this requires all absolute addresses within the code to be shifted by a fixed amount. This operation is still much faster than combining object files and resolving symbols done by the linker.
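The difference between resolving symbols and merely rebasing can be sketched with a toy model. This is only an illustration, not how any real linker or loader is implemented: the slot-based "code" format, the `("ref", name)` placeholder, and the function names `link` and `rebase` are all assumptions made up for this example.

```python
# Toy model: "code" is a list of slots. A slot is either a plain value or a
# ("ref", symbol) placeholder standing for an address that is not yet known.

def link(objects, base):
    """Lay objects out starting at `base`, then patch every symbol reference."""
    addresses = {}
    image = []
    for name, code in objects:           # pass 1: give each symbol an address
        addresses[name] = base + len(image)
        image.extend(code)
    patched = []
    relocs = []                          # positions that hold absolute addresses
    for i, slot in enumerate(image):     # pass 2: look up and patch references
        if isinstance(slot, tuple) and slot[0] == "ref":
            patched.append(addresses[slot[1]])
            relocs.append(i)
        else:
            patched.append(slot)
    return patched, relocs, addresses


def rebase(image, relocs, delta):
    """Shift every absolute address in an already linked image by `delta`."""
    moved = list(image)
    for i in relocs:
        moved[i] += delta
    return moved


objects = [("main", [1, ("ref", "helper"), 2]), ("helper", [3, 4])]
image, relocs, addresses = link(objects, base=0x1000)
print([hex(x) for x in image])
print([hex(x) for x in rebase(image, relocs, 0x2000)])
```

Linking needs a symbol table and a lookup for every reference, while rebasing only adds a constant at positions that were recorded in advance. That is the sense in which the second step is the cheaper one.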
To sum it up:
If you've ever tried to link a reasonably sized project you will have noticed that it takes a non-trivial amount of time, probably longer than you would like to wait to start an application. That sort of explains why you can't execute static libraries. And dynamic libraries have been optimized and stripped to not contain anything except executable code which makes them unsuitable for use as static libraries.
The code in an object file isn't linked. It contains implicit references to external functions that have not yet been resolved.
When object files are linked to create a DLL, the linker looks through all those external references and finds other libraries (static or dynamic) that can satisfy them. A reference to a name in a static library is resolved by including the body of that function (or whatever) into the DLL. If it refers to a dynamic library, the name of both the DLL and the referenced function are included in the DLL.
Ultimately, there's no reason this would have to be the case. In theory, you could write the loader to do all of this every time you loaded a file. It's basically just an optimization: the linker does the relatively slow parts of the job. References to DLLs are left, but they're resolved to the point that it's fairly fast for the loader to find and load the target file (if necessary) and resolve the referenced functions. When the linker is doing its job, it does a lot more by scanning through long lists of definitions to find the ones you care about, which is quite a bit slower.
Note: The following answer is not platform agnostic, but specific to ELF-based systems and some other similar ones. Someone else can fill in details for other systems.
What is a static library?
A static library is a collection of *.o files in an archive. Each file can contain references to undefined symbols which must be resolved by the linker; for example, your library might have a reference to printf. The library doesn't provide any indication about where printf will be found; it's expected that the linker will find it in one of the other libraries it's asked to link in.
Suppose your library contains the following code:
read_png.o
write_png.o
read_jpg.o
write_jpg.o
resize_image.o
handle_error.o
If an application only uses read_png and write_png, then the other pieces of code won't get loaded into the executable (except handle_error, which is called from read_png and write_png).
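The selection of archive members can be sketched as a small reachability search. This is a simplified model rather than a real linker algorithm: the `ARCHIVE` table and the `select_members` function are hypothetical, and the reference lists are assumptions, apart from read_png and write_png calling handle_error as stated above.

```python
# Toy model of how a linker picks members out of a static archive.
# Each member maps to (symbols it defines, symbols it references).
# Assumption: the jpg members also call handle_error; resize_image does not.
ARCHIVE = {
    "read_png.o":     ({"read_png"},     {"handle_error"}),
    "write_png.o":    ({"write_png"},    {"handle_error"}),
    "read_jpg.o":     ({"read_jpg"},     {"handle_error"}),
    "write_jpg.o":    ({"write_jpg"},    {"handle_error"}),
    "resize_image.o": ({"resize_image"}, set()),
    "handle_error.o": ({"handle_error"}, set()),
}


def select_members(archive, wanted):
    """Return the archive members needed to satisfy the symbols in `wanted`."""
    # Index: which member defines each symbol.
    defined_in = {sym: name
                  for name, (defs, _) in archive.items()
                  for sym in defs}
    pending = list(wanted)
    selected = set()
    while pending:
        sym = pending.pop()
        member = defined_in.get(sym)
        if member is None or member in selected:
            continue  # defined in some other library, or already pulled in
        selected.add(member)
        pending.extend(archive[member][1])  # follow this member's references
    return sorted(selected)


print(select_members(ARCHIVE, {"read_png", "write_png"}))
# prints ['handle_error.o', 'read_png.o', 'write_png.o']
```

The jpg and resize members never enter the result because nothing reachable from the application refers to them.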
We can't load a static library at runtime because:
The linker doesn't know where to find external objects, e.g., printf.
It would be slow. Dynamic libraries are optimized for fast loading.
Static libraries have no concept of namespaces. I can't define my own handle_error because it would conflict with the library's definition.
What is a dynamic library?
A dynamic library, on ELF systems, is the same type of object as an executable. It also exports more symbols; an executable only needs to export _start. Dynamic libraries are optimized so the whole thing can be mapped directly into memory.
If you have a call to printf in your dynamic library, there are some additional requirements beyond the requirements for static libraries:
You have to specify which library has printf.
You have to call the function in a special way that lets the linker insert the address for printf. In a static library, the linker can just modify your code and insert the address directly, but this is not possible with shared libraries.
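The "special way" of calling can be pictured as going through a table of slots instead of a hard-coded address. The sketch below is only a conceptual model of that indirection, with the lookup deferred until the first call. The `LazyTable` class and the stand-in `libc` dictionary are invented for illustration and do not correspond to any real loader API.

```python
class LazyTable:
    """Indirection table: calls go through a slot instead of a fixed address."""

    def __init__(self, libraries):
        self.libraries = libraries   # list of dicts: symbol name -> callable
        self.slots = {}              # resolved entries, filled on first use
        self.lookups = 0             # how many times a search was needed

    def call(self, name, *args):
        target = self.slots.get(name)
        if target is None:           # first call: search the libraries once
            self.lookups += 1
            for lib in self.libraries:
                if name in lib:
                    target = lib[name]
                    break
            else:
                raise LookupError(f"undefined symbol: {name}")
            self.slots[name] = target
        return target(*args)


libc = {"strlen": len}               # stand-in for a shared library
table = LazyTable([libc])
print(table.call("strlen", "hello"), table.call("strlen", "hi"), table.lookups)
# prints 5 2 1
```

The calling code never changes. Only the table entry is filled in, which is why the code itself can stay read-only and shared between processes, at the cost of one extra hop per call.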
We don't want to use dynamic libraries to link statically because:
We can't link in only part of a dynamic library. Even if our executable never calls read_jpg, it gets included because dynamic libraries are all-or-nothing.
The extra overhead for function calls is wasteful, even if it is small.
Summary
Compilation looks something like this:
Source ==compile==> Object ==link==> Executable / Shared Library
A static library is an archive full of objects that haven't been linked yet. There's a lot of work left to be done.
A shared library is a linked final product, ready to be loaded into memory.
Static libraries were invented first. If both were invented at the same time, it's possible they would be much more similar.
The difference is: