Detecting the PTX kernel of a Thrust transform

南笙 2020-12-12 05:54

I have the following thrust::transform call:

my_functor *f_1 = new my_functor();
thrust::transform(data.begin(), data.end(), data.begin(),*f_1);
2 answers
  • 2020-12-12 06:39

    Use the c++filt command. When I pass your example kernel names through c++filt, I get:

    void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, unsigned int, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array> >(thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, unsigned int, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array>)
    
    void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, long, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array> >(thrust::system::cuda::detail::for_each_n_detail::for_each_n_closure<thrust::zip_iterator<thrust::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::detail::normal_iterator<thrust::device_ptr<int> >, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> >, long, thrust::detail::device_unary_transform_functor<my_functor>, thrust::system::cuda::detail::detail::blocked_thread_array>)
    
    thrust::detail::device_function<thrust::detail::device_unary_transform_functor<my_functor>, void>::device_function(thrust::detail::device_unary_transform_functor<my_functor> const&)
    

    Command:

    cuobjdump file.cu.o --dump-elf-symbols | grep STB_GLOBAL | tr -s " " | cut -d" " -f4 | c++filt
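
If you would rather demangle names inside your own tooling than pipe them through c++filt, the sketch below shows one way to do it. It assumes a GCC or Clang toolchain, where `abi::__cxa_demangle` from `<cxxabi.h>` is available. The helper name `demangle` is hypothetical and chosen only for illustration.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: demangle one Itanium-ABI symbol name,
// roughly what c++filt does for each input line.
// Returns the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

For example, `demangle("_Z9my_kernelPii")` should give `my_kernel(int*, int)`. That symbol is made up for the illustration and does not come from the question. To find the Thrust kernel, apply the same call to each symbol that cuobjdump lists and look for the names containing `my_functor`.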
    
  • 2020-12-12 06:42

    If you are using Visual Studio, use NVIDIA Nsight Visual Studio Edition, which comes with the CUDA Toolkit.

    Go to the "Nsight" menu and click on the "Start Performance Analysis..." entry.

    • In "Activity type", select "Profile CUDA Application"
    • In "Experiment settings", tick "Collect Information for CUDA Source View"
    • Choose "All" in the "Experiments to Run" listbox
    • In "Capture Control", tick "Open Report on Stop" and select "CUDA Source View" in the listbox

    Then, click on "Launch" and wait for your application to be fully executed. You will see additional output in the console from Nsight.

    After execution finishes, the "CUDA Source View" window opens. Select "Source and PTX" in the "View" listbox to see the correspondence between the source code and the generated PTX. When you click on a line in the source code, the matching line or lines are highlighted in green in the PTX code.
