Detecting ptx kernel of Thrust transform

前端未结

关注

 2  1017

I have following thrust::transform call.

my_functor *f_1 = new my_functor();
thrust::transform(data.begin(), data.end(), data.begin(),*f_1);

相关标签:

2条回答

故里飘歌

2020-12-12 06:39

Use c++filt command. When I pass your example kernel names through c++filt, I get

void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, unsigned int, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array> >(thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, unsigned int, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array>)

void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, long, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array> >(thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, long, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array>)

thrust::detail::device_function<thrust::detail::device_unary_transform_functor<my_functor>, void>::device_function(thrust::detail::device_unary_transform_functor<my_functor> const&)

Command:

cuobjdump file.cu.o --dump-elf-symbols | grep STB_GLOBAL | tr -s " " | cut -d" " -f4,4 | c++filt

0 讨论(0)

借酒劲吻你

2020-12-12 06:42
If you are using Visual Studio, use Nvidia NSIGHT Visual Studio Edition which comes with the CUDA Toolkit.

Go to the "Nsight" menu, click on the "Start Performance Analysis..." entry.
- In "Activity type", select "Profile CUDA Application"
- In "Experiment settings", tick "Collect Information for CUDA Source View"
- Choose "All" in the "Experiments to Run" listbox
- In "Capture Control", tick "Open Report on Stop" and select "CUDA Source View" in the listbox
Then, click on "Launch" and wait for your application to be fully executed. You will see additional output in the console from Nsight.

After the execution, the "CUDA Source View" window will open. - Select "Source and PTX" in the "View" listbox You will be able to find the correspondance between source code and generated PTX. When you click on a line in the source code, one or more lines are highlighted in green in the PTX code.
0 讨论(0)
发布评论:

提交评论
- 加载中...