Question
My goal is: library2.so uses library1.so, and mycode.o uses (is linked against) library2.so (and maybe library1.so).
The source code is (one-line header files are omitted):
library1.cu:
__device__ void func1_lib1(void){}

library2.cu:
#include "library1.h"
__global__ void func1_lib2(void)
{
    func1_lib1();
}
extern "C"
void func2_lib2(void)
{
    func1_lib2<<<1,1>>>();
}

mycode.c:
#include "library2.h"
int main(void)
{
    func2_lib2();
}
I'm building the shared libraries with the following Makefile:
broken:
	rm -f *.o *.so
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dc library1.cu
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dlink -o cuda-lib1.o library1.o
	gcc -shared -Wl,-soname,library1.so -o library1.so library1.o cuda-lib1.o
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dc library2.cu
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dlink -o cuda-lib2.o library2.o -lrary1
	gcc -shared -Wl,-soname,library2.so -o library2.so library2.o cuda-lib2.o
	gcc -c mycode.c
	gcc -o mycode -L. -lrary2 -lrary1 mycode.o

working:
	rm -f *.o *.so
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dc library1.cu
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dc library2.cu
	nvcc -arch=sm_30 --compiler-options '-fPIC' -dlink -o cuda-lib.o library1.o library2.o
	gcc -shared -Wl,-soname,library.so -o library.so library1.o library2.o cuda-lib.o
	gcc -c -fPIC mycode.c
	gcc -o mycode -L. -lrary -L/usr/local/cuda/lib64 -lcuda -lcudart mycode.o
make working works without any problems, but it doesn't produce a chain of libraries: library1.cu and library2.cu end up in the same .so file.
make broken fails with:
nvcc -arch=sm_30 --compiler-options '-fPIC' -dlink -o cuda-lib2.o library2.o -lrary1
nvlink error : Undefined reference to '_Z10func1_lib1v' in 'library2.o'
If I inspect library1.so with nm, it does contain a text-section symbol (T) _Z10func1_lib1v.
Answer 1:
In your "broken" approach, you are attempting to create library1.so (a shared library) that contains only a __device__ function:
__device__ void func1_lib1(void){}
Any other object that wishes to use that __device__ function must use relocatable device code (separate compilation and linking), which of course you are trying to do.
However, with respect to libraries, device linking only supports functions contained in static libraries. Note these statements from the nvcc manual:
The device linker has the ability to read the static host library formats (.a on Linux and Mac OS X, .lib on Windows). It ignores any dynamic (.so or .dll) libraries.
and:
Note that only static libraries are supported by the device linker.
So your general strategy won't work. A possible workaround would be to place the library1.cu code in a static library:
rm -f *.o *.so
nvcc -arch=sm_30 --compiler-options '-fPIC' -dc library1.cu
nvcc -arch=sm_30 --lib -o cuda-lib1.a library1.o
nvcc -arch=sm_30 --compiler-options '-fPIC' -dc library2.cu
nvcc -arch=sm_30 --compiler-options '-fPIC' -dlink -o cuda-lib2.o library2.o cuda-lib1.a
gcc -shared -Wl,-soname,library2.so -o library2.so -L/usr/local/cuda/lib64 -lcuda -lcudart library2.o cuda-lib2.o cuda-lib1.a
gcc -c mycode.c
gcc -o mycode -L. -lrary2 mycode.o
Or else create a sequence of .so libraries that don't require device linking across library boundaries, which is more or less demonstrated in your "working" case.
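As a rough illustration of that second option, the sketch below keeps every __device__ call inside the library that defines it and crosses the library boundary only through an ordinary host function. The names kernel_lib1 and launch_lib1 are hypothetical additions, not part of the original code, and moving the kernel into library1.cu is an assumed restructuring rather than something the question's design requires. Both "files" are shown in one listing for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- would live in library1.cu (built into library1.so) ----
// The device function and the kernel that calls it sit in the same
// library, so no device linking across a .so boundary is needed.
__device__ void func1_lib1(void) {}

// Hypothetical kernel added for this sketch.
__global__ void kernel_lib1(void)
{
    func1_lib1();
}

// Hypothetical host-callable entry point exported by library1.so.
// Returns the CUDA error code as an int (0 means cudaSuccess).
extern "C" int launch_lib1(void)
{
    kernel_lib1<<<1, 1>>>();
    cudaError_t err = cudaGetLastError();      // launch errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();         // execution errors
    return (int)err;
}

// ---- would live in library2.cu (built into library2.so) ----
// library2 only needs the host declaration (normally from library1.h).
extern "C" int launch_lib1(void);

extern "C" void func2_lib2(void)
{
    int err = launch_lib1();                   // plain host call into library1.so
    if (err != 0)
        fprintf(stderr, "launch_lib1 failed: %s\n",
                cudaGetErrorString((cudaError_t)err));
}
```

With this layout, mycode.c can stay as it is: it calls func2_lib2(), which calls into library1.so on the host side, so the dependency between the two shared libraries is resolved by the host linker alone. The trade-off is that a kernel in library2 can no longer call func1_lib1() directly. If that is required, the static-library workaround above is the one to use.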
Source: https://stackoverflow.com/questions/35897002/cuda-nvcc-building-chain-of-libraries