How to separate CUDA code into multiple files


I am trying to separate a CUDA program into two .cu files, in an effort to edge closer to writing a real app in C++. I have a simple little program that:

Allocate

4 answers

太阳男子

2020-12-13 08:05

Getting the separation is actually quite simple; please check out this answer for how to set it up. Then you simply put your host code in .cpp files and your device code in .cu files, and the build rules tell Visual Studio how to compile each kind of file and link them together into the final executable.

The immediate problem in your code is that you are defining the __global__ TestDevice function twice: once when you #include MyKernel.cu and once when you compile MyKernel.cu independently.

You will also need to put a wrapper into a .cu file. At the moment you are calling TestDevice<<<>>> from your main function, but when you move main into a .cpp file it will be compiled with cl.exe, which doesn't understand the <<<>>> syntax. Instead, call TestDeviceWrapper(griddim, blockdim, params) from the .cpp file and define that function in your .cu file.
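A minimal sketch of such a wrapper follows. The kernel body is a placeholder, since the original TestDevice is not shown in full, and returning cudaError_t from the wrapper is an assumption made here so that the caller can check for failures.

```cuda
// MyKernel.cu (compiled by nvcc)
#include <cuda_runtime.h>

// Placeholder kernel body (assumption): each thread writes its global index.
// The launch must cover exactly the length of devicearray, because the
// kernel has no bounds check.
__global__ void TestDevice(int* devicearray) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    devicearray[i] = i;
}

// Host-callable wrapper. Its signature contains no <<<>>> syntax, so a .cpp
// file compiled by the host compiler can call it through a plain prototype.
cudaError_t TestDeviceWrapper(dim3 griddim, dim3 blockdim, int* devicearray) {
    TestDevice<<<griddim, blockdim>>>(devicearray);
    cudaError_t err = cudaGetLastError();  // catches invalid launch configurations
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();        // catches errors raised while the kernel ran
}
```

On the .cpp side, including <cuda_runtime.h> provides dim3 and cudaError_t, after which a prototype for TestDeviceWrapper and a call such as TestDeviceWrapper(dim3(1), dim3(4), devicearray) compile with the ordinary host compiler.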

If you want an example, the SobolQRNG sample in the SDK achieves nice separation, although it still uses cutil, which I would always recommend avoiding.


时光说笑

2020-12-13 08:13

The simple solution is to turn off building of your MyKernel.cu file.

Properties -> General -> Excluded from build

The better solution imo is to split your kernel into a .cu file and a .cuh file, and include only the .cuh file from other code, for example:

//kernel.cu
#include "kernel.cuh"
#include <cuda_runtime.h>

__global__ void increment_by_one_kernel(int* vals) {
  vals[threadIdx.x] += 1;
}

void increment_by_one(int* a) {
  int* a_d;

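  // Sizes are given in bytes, so allocate and copy sizeof(int), not 1.
  cudaMalloc(&a_d, sizeof(int));
  cudaMemcpy(a_d, a, sizeof(int), cudaMemcpyHostToDevice);
  increment_by_one_kernel<<<1, 1>>>(a_d);
  cudaMemcpy(a, a_d, sizeof(int), cudaMemcpyDeviceToHost);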

  cudaFree(a_d);
}

//kernel.cuh
#pragma once

void increment_by_one(int* a);

//main.cpp
#include "kernel.cuh"

int main() {
  int a[] = {1};

  increment_by_one(a);

  return 0;
}


难免孤独

2020-12-13 08:14
You are including mykernel.cu in kernelsupport.cu, so when you link, the linker sees the contents of mykernel.cu twice. You'll have to create a header declaring TestDevice and include that instead.

re comment:

Something like this should work:
```
// MyKernel.h
#ifndef mykernel_h
#define mykernel_h
__global__ void TestDevice(int* devicearray);
#endif
```
and then change the including file to:
```
//KernelSupport.cu
#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_

#include <iostream>
#include "MyKernel.h"
// ...
```
re your edit:

As long as the header you use in C++ code doesn't contain any CUDA-specific syntax (__global__, __device__, etc.), you should be fine linking C++ and CUDA code.
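One way to share a single header between both compilers is sketched below. It is an assumption rather than something from the question: it relies on the __CUDACC__ macro, which nvcc defines when compiling CUDA source, to hide the kernel declaration from the host-only compiler. RunTestDevice is a hypothetical host function, and the kernel body is a placeholder. The header and the .cu file are shown in one listing for brevity.

```cuda
// ---- MyKernel.h: included by both .cu and .cpp files ----
#ifndef MYKERNEL_H
#define MYKERNEL_H

#ifdef __CUDACC__
// Visible only while nvcc compiles a .cu file.
__global__ void TestDevice(int* devicearray);
#endif

// Plain C++ declaration, safe for any compiler (hypothetical name).
// Returns 0 on success and 1 on any CUDA error.
int RunTestDevice(int* hostarray, int n);

#endif  // MYKERNEL_H

// ---- MyKernel.cu: compiled by nvcc ----
#include <cuda_runtime.h>

// Placeholder kernel body (assumption): add one to each element.
__global__ void TestDevice(int* devicearray) {
    devicearray[threadIdx.x] += 1;
}

// Copies n ints to the device, runs one block of n threads, and copies the
// result back. n must be between 1 and the device's maximum block size.
int RunTestDevice(int* hostarray, int n) {
    int* d = nullptr;
    size_t bytes = n * sizeof(int);
    if (cudaMalloc(&d, bytes) != cudaSuccess) return 1;
    cudaError_t err = cudaMemcpy(d, hostarray, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        TestDevice<<<1, n>>>(d);
        // This blocking copy waits for the kernel and reports its errors.
        err = cudaMemcpy(hostarray, d, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d);
    return err == cudaSuccess ? 0 : 1;
}
```

A .cpp file that includes MyKernel.h sees only RunTestDevice, so it never encounters __global__.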

醉梦人生

2020-12-13 08:17

If you look at the CUDA SDK code examples, they use extern "C" declarations that reference functions compiled from .cu files. This way, the .cu files are compiled by nvcc and only linked into the main program, while the .cpp files are compiled normally.

For example, marchingCubes_kernel.cu has the function body:

extern "C" void
launch_classifyVoxel( dim3 grid, dim3 threads, uint* voxelVerts, uint *voxelOccupied, uchar *volume,
                      uint3 gridSize, uint3 gridSizeShift, uint3 gridSizeMask, uint numVoxels,
                      float3 voxelSize, float isoValue)
{
    // calculate number of vertices need per voxel
    classifyVoxel<<<grid, threads>>>(voxelVerts, voxelOccupied, volume, 
                                     gridSize, gridSizeShift, gridSizeMask, 
                                     numVoxels, voxelSize, isoValue);
    cutilCheckMsg("classifyVoxel failed");
}

while marchingCubes.cpp (where main() resides) just has a declaration:

extern "C" void
launch_classifyVoxel( dim3 grid, dim3 threads, uint* voxelVerts, uint *voxelOccupied, uchar *volume,
                      uint3 gridSize, uint3 gridSizeShift, uint3 gridSizeMask, uint numVoxels,
                      float3 voxelSize, float isoValue);

You can put these in a .h file too.
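A sketch of what such a header might look like is shown below, together with the matching .cu file in the same listing. The names scale_kernel.h, scale and launch_scale are hypothetical and not taken from the SDK. The #ifdef __cplusplus guard is an optional addition that also lets plain C files include the header.

```cuda
// ---- scale_kernel.h (hypothetical): shared by .cu, .cpp and .c files ----
#ifndef SCALE_KERNEL_H
#define SCALE_KERNEL_H

#ifdef __cplusplus
extern "C" {
#endif

/* Multiplies n floats in device memory by factor. Returns 0 on success. */
int launch_scale(float* d_data, int n, float factor);

#ifdef __cplusplus
}
#endif

#endif  /* SCALE_KERNEL_H */

// ---- scale_kernel.cu: compiled by nvcc ----
#include <cuda_runtime.h>

__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;  // guard threads past the end of the array
}

// The definition inherits C linkage from the declaration in the header.
int launch_scale(float* d_data, int n, float factor) {
    if (n <= 0) return 0;  // nothing to do
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up
    scale<<<blocks, threads>>>(d_data, n, factor);
    if (cudaGetLastError() != cudaSuccess) return 1;  // launch failed
    return cudaDeviceSynchronize() == cudaSuccess ? 0 : 1;
}
```

Because the declaration uses only built-in types, the .cpp file that includes this header does not need any CUDA headers just to call the launcher.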
