Question
Hi, I'm a beginner in C and linking. I was reading a book that has a question about linking with static libraries:
Let a and b denote object modules or static libraries in the current directory, and let a→b denote that a depends on b, in the sense that b defines a symbol that is referenced by a. For each of the following scenarios, show the minimal command line (i.e., one with the least number of object file and library arguments) that will allow the static linker to resolve all symbol references:
p.o → libx.a → liby.a and liby.a → libx.a → p.o
and the answer given by the book is:
gcc p.o libx.a liby.a libx.a
I'm confused, shouldn't the answer be :
gcc p.o libx.a liby.a libx.a p.o
otherwise how would the undefined symbol in libx.a be resolved by p.o?
Answer 1:
In case your C textbook does not make it clear, the linkage behaviour that the author is attempting to illustrate with this exercise is not mandated by the C Standard and is in fact behaviour of the GNU binutils linker ld - the default system linker in Linux, usually invoked on your behalf by gcc|g++|gfortran, etc - and possibly but not necessarily the behaviour of other linkers you might encounter.
If you've given us the exercise accurately, the author may be someone who does not understand static linking quite as well as would be best for writing textbooks about it, or perhaps just doesn't express themselves with great care.
Unless we are linking a program, the linker by default will not even insist on resolving all symbol references. So presumably we're linking a program (not a shared library), and if the answer:
gcc p.o libx.a liby.a libx.a
is actually what the text-book says, then a program is what it has to be.
But a program must have a main function. Where is the main function and what are its linkage relationships to p.o, libx.a and liby.a? This matters and we're not told.
So let's assume that p stands for program, and that the main function is at least defined in p.o. Weird though it would be for liby.a to depend on p.o where p.o is the main object module of the program, it would be even weirder for the main function to be defined in a member of a static library.
Assuming that much, here are some source files:
p.c
#include <stdio.h>
extern void x(void);
void p(void)
{
    puts(__func__);
}
int main(void)
{
    x();
    return 0;
}

x.c
#include <stdio.h>
void x(void)
{
    puts(__func__);
}

y.c
#include <stdio.h>
void y(void)
{
    puts(__func__);
}

callx.c
extern void x(void);
void callx(void)
{
    x();
}

cally.c
extern void y(void);
void cally(void)
{
    y();
}

callp.c
extern void p(void);
void callp(void)
{
    p();
}
Compile them all to object files:
$ gcc -Wall -Wextra -c p.c x.c y.c callx.c cally.c callp.c
And make static libraries libx.a and liby.a:
$ ar rcs libx.a x.o cally.o callp.o
$ ar rcs liby.a y.o callx.o
Now, p.o, libx.a and liby.a fulfil the conditions of the exercise:
p.o → libx.a → liby.a and liby.a → libx.a → p.o
Because:
- p.o refers to but does not define x, which is defined in libx.a.
- libx.a defines cally, which refers to but does not define y, which is defined in liby.a.
- liby.a defines callx, which refers to but does not define x, which is defined in libx.a.
- libx.a defines callp, which refers to but does not define p, which is defined in p.o.
We can confirm with nm:
$ nm p.o
0000000000000000 r __func__.2252
U _GLOBAL_OFFSET_TABLE_
0000000000000013 T main
0000000000000000 T p
U puts
U x
p.o defines p (= T p) and references x (= U x).
$ nm libx.a
x.o:
0000000000000000 r __func__.2250
U _GLOBAL_OFFSET_TABLE_
U puts
0000000000000000 T x
cally.o:
0000000000000000 T cally
U _GLOBAL_OFFSET_TABLE_
U y
callp.o:
0000000000000000 T callp
U _GLOBAL_OFFSET_TABLE_
U p
libx.a defines x (= T x) and references y (= U y) and references p (= U p).
$ nm liby.a
y.o:
0000000000000000 r __func__.2250
U _GLOBAL_OFFSET_TABLE_
U puts
0000000000000000 T y
callx.o:
0000000000000000 T callx
U _GLOBAL_OFFSET_TABLE_
U x
liby.a defines y (= T y) and references x (= U x).
Now the textbook's linkage certainly succeeds:
$ gcc p.o libx.a liby.a libx.a
$ ./a.out
x
But is it the shortest possible linkage? No. This is:
$ gcc p.o libx.a
$ ./a.out
x
Why? Let's rerun the linkage with diagnostics to show which of our object files were actually linked:
$ gcc p.o libx.a -Wl,-trace
/usr/bin/ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
p.o
(libx.a)x.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
They were:
p.o
(libx.a)x.o
p.o was first linked into the program because an input .o file is always linked, unconditionally.
Then came libx.a. Read static-libraries to understand how the linker handled it. After linking p.o, it had only one unresolved reference - the reference to x. It inspected libx.a looking for an object file that defines x. It found (libx.a)x.o. It extracted x.o from libx.a and linked it, and then it was done.[1]
All of the dependency relationships involving liby.a:
- (libx.a)cally.o depends on (liby.a)y.o
- (liby.a)callx.o depends on (libx.a)x.o
are irrelevant to the linkage, because the linkage does not need any of the object files in liby.a.
Given what the author says is the right answer, we can reverse engineer the exercise that they were striving to state. This is it:
- An object module p.o that defines main refers to a symbol x that it does not define, and x is defined in member x.o of a static library libxz.a.
- (libxz.a)x.o refers to a symbol y that it does not define, and y is defined in member y.o of a static library liby.a.
- (liby.a)y.o refers to a symbol z that it does not define, and z is defined in member z.o of libxz.a.
- (liby.a)y.o refers to a symbol p that it does not define, and p is defined in p.o.
What is the minimal linkage command using p.o, libxz.a, liby.a that will succeed?
New source files:
p.c
Stays as before.
x.c
#include <stdio.h>
extern void y();
void cally(void)
{
    y();
}
void x(void)
{
    puts(__func__);
}

y.c
#include <stdio.h>
extern void z(void);
extern void p(void);
void callz(void)
{
    z();
}
void callp(void)
{
    p();
}
void y(void)
{
    puts(__func__);
}

z.c
#include <stdio.h>
void z(void)
{
    puts(__func__);
}
New static libraries (after recompiling x.c, y.c and z.c to object files as before):
$ ar rcs libxz.a x.o z.o
$ ar rcs liby.a y.o
Now the linkage:
$ gcc p.o libxz.a
libxz.a(x.o): In function `cally':
x.c:(.text+0xa): undefined reference to `y'
collect2: error: ld returned 1 exit status
fails, as does:
$ gcc p.o libxz.a liby.a
liby.a(y.o): In function `callz':
y.c:(.text+0x5): undefined reference to `z'
collect2: error: ld returned 1 exit status
and:
$ gcc p.o liby.a libxz.a
libxz.a(x.o): In function `cally':
x.c:(.text+0xa): undefined reference to `y'
collect2: error: ld returned 1 exit status
and (your own pick):
$ gcc p.o liby.a libxz.a p.o
p.o: In function `p':
p.c:(.text+0x0): multiple definition of `p'
p.o:p.c:(.text+0x0): first defined here
p.o: In function `main':
p.c:(.text+0x13): multiple definition of `main'
p.o:p.c:(.text+0x13): first defined here
libxz.a(x.o): In function `cally':
x.c:(.text+0xa): undefined reference to `y'
collect2: error: ld returned 1 exit status
fails with both undefined-reference errors and multiple-definition errors.
But the textbook answer:
$ gcc p.o libxz.a liby.a libxz.a
$ ./a.out
x
is now right.
The author was attempting to describe a mutual dependency between two static libraries in the linkage of a program, but fumbled the fact that such a mutual dependency can only exist when the linkage needs at least one object file from each library that refers to some symbol that is defined by an object file in the other library.
The lessons to be learned from the corrected exercise are:
- An object file foo.o that appears in the linker inputs never needs to appear more than once, because it will be linked unconditionally, and when it is linked the definition that it provides of any symbol s will serve to resolve all references to s that accrue for any other linker inputs. If foo.o is input twice you can only get errors for multiple-definition of s.
- But where there is a mutual dependency between static libraries in the linkage it can be resolved by inputting one of the libraries twice, because an object file is extracted from a static library and linked if and only if that object file is needed to define an unresolved symbol reference that the linker is seeking to define at the point when the library is input. So in the corrected example:
  - p.o is input and unconditionally linked.
  - x becomes an unresolved reference.
  - libxz.a is input.
  - A definition of x is found in (libxz.a)x.o.
  - (libxz.a)x.o is extracted and linked.
  - x is resolved.
  - But (libxz.a)x.o refers to y.
  - y becomes an unresolved reference.
  - liby.a is input.
  - A definition of y is found in (liby.a)y.o.
  - (liby.a)y.o is extracted and linked.
  - y is resolved.
  - But (liby.a)y.o refers to z.
  - z becomes an unresolved reference.
  - libxz.a is input again.
  - A definition of z is found in (libxz.a)z.o.
  - (libxz.a)z.o is extracted and linked.
  - z is resolved.
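The left-to-right resolution walked through above can be mimicked with a toy model in C. This is only a sketch, not how ld is implemented: the types and functions (struct object, struct input, link_inputs and so on) are hypothetical names invented for illustration, libc symbols such as puts are ignored, and the model assumes that one archive is rescanned until it yields no further members. The symbol tables at the bottom mirror p.o, libxz.a and liby.a from the corrected exercise.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS 16     /* capacity of each symbol set in this toy model */
#define MAX_PER_OBJ 4   /* max definitions / references per object */

/* A toy object file: the symbols it defines and the ones it references. */
struct object {
    const char *name;
    const char *defs[MAX_PER_OBJ];   /* NULL-terminated if shorter */
    const char *refs[MAX_PER_OBJ];   /* NULL-terminated if shorter */
};

/* One linker input: a lone object (count == 1) or an archive of members. */
struct input {
    int is_archive;
    const struct object *members;
    int count;
};

struct link_state {
    const char *defined[MAX_SYMS];
    int ndefined;
    const char *undefined[MAX_SYMS];
    int nundefined;
    int duplicates;                  /* "multiple definition" count */
};

static int find(const char *const *set, int n, const char *sym)
{
    for (int i = 0; i < n; i++)
        if (strcmp(set[i], sym) == 0)
            return i;
    return -1;
}

/* Link one object: record its definitions, then its unresolved references. */
static void link_object(struct link_state *st, const struct object *o)
{
    for (int i = 0; i < MAX_PER_OBJ && o->defs[i]; i++) {
        if (find(st->defined, st->ndefined, o->defs[i]) >= 0) {
            st->duplicates++;
            continue;
        }
        assert(st->ndefined < MAX_SYMS);
        st->defined[st->ndefined++] = o->defs[i];
        int u = find(st->undefined, st->nundefined, o->defs[i]);
        if (u >= 0)                  /* resolved: drop it from the undefined set */
            st->undefined[u] = st->undefined[--st->nundefined];
    }
    for (int i = 0; i < MAX_PER_OBJ && o->refs[i]; i++) {
        if (find(st->defined, st->ndefined, o->refs[i]) < 0 &&
            find(st->undefined, st->nundefined, o->refs[i]) < 0) {
            assert(st->nundefined < MAX_SYMS);
            st->undefined[st->nundefined++] = o->refs[i];
        }
    }
}

/* Does this archive member define a symbol that is currently undefined? */
static int is_needed(const struct link_state *st, const struct object *o)
{
    for (int i = 0; i < MAX_PER_OBJ && o->defs[i]; i++)
        if (find(st->undefined, st->nundefined, o->defs[i]) >= 0)
            return 1;
    return 0;
}

/* Process inputs left to right; return how many symbols stay unresolved. */
static int link_inputs(const struct input *in, int n, struct link_state *st)
{
    memset(st, 0, sizeof *st);
    for (int i = 0; i < n; i++) {
        if (!in[i].is_archive) {     /* a plain .o is linked unconditionally */
            link_object(st, &in[i].members[0]);
            continue;
        }
        int extracted;
        do {                         /* rescan until nothing more is pulled in */
            extracted = 0;
            for (int m = 0; m < in[i].count; m++)
                if (is_needed(st, &in[i].members[m])) {
                    link_object(st, &in[i].members[m]);
                    extracted = 1;
                }
        } while (extracted);
    }
    return st->nundefined;
}

/* Symbol tables mirroring the corrected exercise (puts is left out). */
static const struct object p_o = { "p.o", { "main", "p" }, { "x" } };
static const struct object libxz[] = {
    { "x.o", { "x", "cally" }, { "y" } },
    { "z.o", { "z" }, { 0 } },
};
static const struct object liby[] = {
    { "y.o", { "y", "callz", "callp" }, { "z", "p" } },
};
```

Calling link_inputs with the inputs in the order p.o, libxz.a, liby.a, libxz.a returns 0 in this model, while dropping the final libxz.a returns 1 because z is still unresolved, which matches the linker errors shown earlier. Putting p.o in the list twice leaves duplicates at 2 (main and p), the model's analogue of the multiple-definition errors.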
[1] As the -trace output shows, strictly speaking the linkage was not done until all the boilerplate following (libx.a)x.o was also linked, but it's the same boilerplate for every C program linkage.
Source: https://stackoverflow.com/questions/53149260/linking-with-static-library-in-c