Why is my pcl cuda code running in CPU instead of GPU?

Submitted by 人走茶凉 on 2019-12-13 03:44:01

Question


I have code that uses the pcl::gpu namespace:

pcl::gpu::Octree::PointCloud clusterCloud;
clusterCloud.upload(cloud_filtered->points);

pcl::gpu::Octree::Ptr octree_device (new pcl::gpu::Octree);
octree_device->setCloud(clusterCloud);
octree_device->build();

/*tree->setCloud (clusterCloud);*/

// Create the cluster extractor object for the planar model and set all the parameters
std::vector<pcl::PointIndices> cluster_indices;
pcl::gpu::EuclideanClusterExtraction ec;
ec.setClusterTolerance (0.1);
ec.setMinClusterSize (2000);
ec.setMaxClusterSize (250000);
ec.setSearchMethod (octree_device);
ec.setHostCloud (cloud_filtered);

ec.extract (cluster_indices);

I have installed CUDA and included the pcl/gpu ".hpp" headers needed for this. The code compiles (I have a catkin workspace with ROS), but when I run it, it is really slow. I checked with nvidia-smi and my code is running only on the CPU. I don't know why, or how to fix it.

This code is an implementation of the gpu/segmentation example here: pcl/seg.cpp


Answer 1:


(Making this an answer since it's too long for a comment.)

I don't know pcl, but maybe it's because you pass a host-side std::vector rather than data that's on the device side.

... what is "host side" and "device side", you ask? And what's std?

Well, std is just a namespace used by the C++ standard library. std::vector is a (templated) class in the C++ standard library, which dynamically allocates memory for the elements you put in it.
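
To make the "dynamically allocates memory" point concrete, here is a minimal sketch in plain C++ (no PCL or CUDA involved). The helper names makeRamp and sumThroughRawPointer are made up for this illustration. It only shows that a std::vector grows on demand and that data() hands back an ordinary pointer into host RAM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a vector one element at a time.
// std::vector allocates (and reallocates) its buffer in host RAM as it grows.
std::vector<float> makeRamp(std::size_t count) {
    std::vector<float> values;
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(static_cast<float>(i));
    }
    return values;
}

// Hypothetical helper: reads the elements through the raw pointer that
// data() returns. This is a normal host address that CPU code can dereference.
float sumThroughRawPointer(const std::vector<float>& values) {
    const float* host_ptr = values.data();
    float sum = 0.0f;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sum += host_ptr[i];
    }
    return sum;
}
```

Calling makeRamp(4) yields the elements 0, 1, 2, 3, and sumThroughRawPointer walks them with plain CPU loads. Nothing here touches GPU memory.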

The thing is, the memory std::vector uses is your main system memory (RAM), which has nothing to do with the GPU. But it's likely that your pcl library requires you to pass data that's in GPU memory, which the data in a std::vector cannot be. You would need to allocate device-side memory and copy your data there from host-side memory.

See also:

Why we do not have access to device memory on host side?

and consult the CUDA programming guide regarding how to perform this allocation and copying (at least, how to perform it at the lowest possible level; your "pcl" may have its own facilities for this.)



Source: https://stackoverflow.com/questions/54705053/why-is-my-pcl-cuda-code-running-in-cpu-instead-of-gpu
