CUDA Dynamic Parallelism MakeFile

后端未结

关注

 1  880

This is my first program using Dynamic Parallelism and I am unable to compile the code. I need to be able to run this for my research project at college and any help will be

相关标签:

1条回答

天涯浪人

2020-12-10 19:47

When using the host linker (g++) for final linking of your executable, and when using relocatable device code (nvcc -dc), it's necessary to do an intermediate device code link step.

From the documentation:

If you want to invoke the device and host linker separately, you can do:

nvcc –arch=sm_20 –dc a.cu b.cu
nvcc –arch=sm_20 –dlink a.o b.o –o link.o
g++ a.o b.o link.o –L<path> -lcudart

Since you are specifying -dc on the compile line, you are getting a compile-only operation (just as if you had specified -c to g++).

Here's a modified/condensed Makefile that should show you what is involved:

GENCODE_SM35     := -gencode arch=compute_35,code=sm_35
GENCODE_FLAGS    := $(GENCODE_SM35)

LDFLAGS   := -L/usr/local/cuda/lib64 -lcudart -lcudadevrt
CCFLAGS   := -m64

NVCCFLAGS := -m64 -dc

NVCC := nvcc
GCC := g++

# Debug build flags
ifeq ($(dbg),1)
      CCFLAGS   += -g
      NVCCFLAGS += -g -G
      TARGET := debug
else
      TARGET := release
endif


# Common includes and paths for CUDA
INCLUDES      := -I/usr/local/cuda/include -I. -I..

# Additional parameters
MAXRREGCOUNT  :=  -po maxrregcount=16

# Target rules
all: build

build: BlackScholes

BlackScholes.o: BlackScholes.cu
        $(NVCC) $(NVCCFLAGS) $(EXTRA_NVCCFLAGS) $(GENCODE_FLAGS) $(MAXRREGCOUNT) $(INCLUDES) -o $@ $<
        $(NVCC) -dlink  $(GENCODE_FLAGS) $(MAXRREGCOUNT)  -o bs_link.o $@

BlackScholes_gold.o: BlackScholes_gold.cpp
        $(GCC) $(CCFLAGS) $(INCLUDES) -o $@ -c $<

BlackScholes: BlackScholes.o BlackScholes_gold.o bs_link.o
        $(GCC) $(CCFLAGS) -o $@ $+ $(LDFLAGS) $(EXTRA_LDFLAGS)

run: build
        ./BlackScholes

0 讨论(0)