CUDA Zero Copy memory considerations

清歌不尽 2021-01-04 13:18

I am trying to figure out if using cudaHostAlloc (or cudaMallocHost?) is appropriate.

I am trying to run a kernel where my input data is more than the amount availab

5 answers
  •  执念已碎
    2021-01-04 14:10

    Using host memory would be far slower than on-device memory, by an order of magnitude or more. It has both very high latency and very limited throughput. For example, a PCIe x16 link delivers a mere 8 GB/s, whereas device memory bandwidth on a GTX 460 is 108 GB/s.
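    For reference, here is a minimal sketch of what zero-copy access looks like, assuming a device that supports mapped host memory. The kernel name `scale`, the buffer size, and the launch configuration are made up for illustration and do not come from the question. Note that `cudaMallocHost` also returns pinned host memory, while `cudaHostAlloc` with `cudaHostAllocMapped` is the call that explicitly requests a mapping into the device address space. Every kernel access to that buffer travels over PCIe, which is where the bandwidth gap above comes from.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each element in place.
// Each read and write below goes over PCIe to host memory.
__global__ void scale(float *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const size_t n = 1 << 20;  // illustrative size, not from the question

    // Request support for mapped pinned allocations before any other
    // CUDA call creates the context.
    if (cudaSetDeviceFlags(cudaDeviceMapHost) != cudaSuccess) {
        fprintf(stderr, "cudaSetDeviceFlags failed\n");
        return 1;
    }

    // Pinned host buffer that is also mapped into the device address space.
    float *h = nullptr;
    if (cudaHostAlloc((void **)&h, n * sizeof(float),
                      cudaHostAllocMapped) != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed\n");
        return 1;
    }
    for (size_t i = 0; i < n; ++i) h[i] = 1.0f;

    // Device-side pointer that aliases the same host buffer.
    float *d = nullptr;
    if (cudaHostGetDevicePointer((void **)&d, h, 0) != cudaSuccess) {
        fprintf(stderr, "cudaHostGetDevicePointer failed\n");
        cudaFreeHost(h);
        return 1;
    }

    const unsigned int threads = 256;
    const unsigned int blocks = (unsigned int)((n + threads - 1) / threads);
    scale<<<blocks, threads>>>(d, n);

    // The kernel writes straight into host memory, so no cudaMemcpy is
    // needed, but the host must wait for the kernel before reading.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFreeHost(h);
        return 1;
    }

    printf("h[0] = %f\n", h[0]);
    cudaFreeHost(h);
    return 0;
}
```

    This pattern tends to pay off only when each element is touched once with coalesced accesses. If the kernel reads the same data repeatedly, copying it to device memory in chunks that fit is usually the safer bet.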
