I never clearly understood what an ABI is. Please don't point me to a Wikipedia article. If I could understand it, I wouldn't be here posting such a lengthy post.
Linux shared library minimal runnable ABI example
In the context of shared libraries, the most important implication of "having a stable ABI" is that you don't need to recompile your programs after the library changes.
So for example:
if you are selling a shared library, you save your users the annoyance of recompiling everything that depends on your library for every new release
if you are selling a closed source program that depends on a shared library present in the user's distribution, you could release and test fewer prebuilts if you are certain that the ABI is stable across certain versions of the target OS.
This is especially important in the case of the C standard library, which a great many programs on your system link to.
Now I want to provide a minimal concrete runnable example of this.
main.c
#include <assert.h>
#include <stdlib.h>
#include "mylib.h"

int main(void) {
    mylib_mystruct *myobject = mylib_init(1);
    assert(myobject->old_field == 1);
    free(myobject);
    return EXIT_SUCCESS;
}
mylib.c
#include <stdlib.h>
#include "mylib.h"

mylib_mystruct* mylib_init(int old_field) {
    mylib_mystruct *myobject;
    myobject = malloc(sizeof(mylib_mystruct));
    myobject->old_field = old_field;
    return myobject;
}
mylib.h
#ifndef MYLIB_H
#define MYLIB_H
typedef struct {
    int old_field;
} mylib_mystruct;
mylib_mystruct* mylib_init(int old_field);
#endif
Compiles and runs fine with:
cc='gcc -pedantic-errors -std=c89 -Wall -Wextra'
$cc -fPIC -c -o mylib.o mylib.c
$cc -L . -shared -o libmylib.so mylib.o
$cc -L . -o main.out main.c -lmylib
LD_LIBRARY_PATH=. ./main.out
Now, suppose that for v2 of the library, we want to add a new field to mylib_mystruct called new_field.
If we added the field before old_field as in:
typedef struct {
    int new_field;
    int old_field;
} mylib_mystruct;
and rebuilt the library but not main.out, then the assert fails!
This is because the line:
myobject->old_field == 1
had generated assembly that tries to access the very first int of the struct, which is now new_field instead of the expected old_field.
Therefore this change broke the ABI.
If, however, we add new_field after old_field:
typedef struct {
    int old_field;
    int new_field;
} mylib_mystruct;
then the old generated assembly still accesses the first int of the struct, and the program still works, because we kept the ABI stable.
Here is a fully automated version of this example on GitHub.
Another way to keep this ABI stable would have been to treat mylib_mystruct as an opaque struct, and only access its fields through accessor functions. This makes it easier to keep the ABI stable, but incurs a performance overhead, since every field access becomes a function call.
API vs ABI
In the previous example, it is interesting to note that adding new_field before old_field only broke the ABI, but not the API.
What this means is that if we had recompiled our main.c program against the library, it would have worked regardless.
We would also have broken the API, however, if we had changed, for example, the function signature:
mylib_mystruct* mylib_init(int old_field, int new_field);
since in that case, main.c would stop compiling altogether.
Semantic API vs Programming API
We can also classify API changes in a third type: semantic changes.
The semantic API is a natural language description of what the API is supposed to do, usually included in the API documentation.
It is therefore possible to break the semantic API without breaking the program build itself.
For example, if we had modified
myobject->old_field = old_field;
to:
myobject->old_field = old_field + 1;
then this would have broken neither the programming API nor the ABI, but the semantic API would break, and the assert in main.c would fail.
There are two ways to programmatically check the contract API:
testing a set of corner cases. Easy to do, but you can always miss one.
formal verification. Harder to do, but produces a mathematical proof of correctness, essentially unifying documentation and tests into a "human" / machine verifiable form! As long as there isn't a bug in your formal description of course ;-)
This concept is closely related to the formalization of Mathematics itself: https://math.stackexchange.com/questions/53969/what-does-formal-mean/3297537#3297537
List of everything that breaks C / C++ shared library ABIs
TODO: find / create the ultimate list:
Java minimal runnable example
What is binary compatibility in Java?
Tested in Ubuntu 18.10, GCC 8.2.0.