Mix custom memory management and Thrust in CUDA

后端 未结 1 937
暖寄归人
暖寄归人 2021-01-01 03:19

In my project, I have implemented a custom memory allocator to avoid unneccessary calls to cudaMalloc once the application has \"warmed up\". Moreover, I use cu

相关标签:
1条回答
  • 2021-01-01 04:19

    Like all standard c++ containers, you can customize how thrust::device_vector allocates storage by providing it with your own "allocator". By default, thrust::device_vector's allocator is thrust::device_malloc_allocator, which allocates (deallocates) storage with cudaMalloc (cudaFree) when Thrust's backend system is CUDA.

    Occasionally, it is desirable to customize the way device_vector allocates memory, such as in the OP's case, who would like to sub-allocate storage within a single large allocation performed at program initialization. This can avoid overhead which may be incurred by many individual calls to the underlying allocation scheme, in this case, cudaMalloc.

    A simple way to provide device_vector a custom allocator is to inherit from device_malloc_allocator. One could in principle author an entire allocator from scratch, but with an inheritance approach, only the allocate and deallocate member functions need to be provided. Once the custom allocator is defined, it can be provided to device_vector as its second template parameter.

    This example code demonstrates how to provide a custom allocator which prints a message upon allocation and deallocation:

    #include <thrust/device_malloc_allocator.h>
    #include <thrust/device_vector.h>
    #include <iostream>
    
    template<typename T>
      struct my_allocator : thrust::device_malloc_allocator<T>
    {
      // shorthand for the name of the base class
      typedef thrust::device_malloc_allocator<T> super_t;
    
      // get access to some of the base class's typedefs
    
      // note that because we inherited from device_malloc_allocator,
      // pointer is actually thrust::device_ptr<T>
      typedef typename super_t::pointer   pointer;
    
      typedef typename super_t::size_type size_type;
    
      // customize allocate
      pointer allocate(size_type n)
      {
        std::cout << "my_allocator::allocate(): Hello, world!" << std::endl;
    
        // defer to the base class to allocate storage for n elements of type T
        // in practice, you'd do something more interesting here
        return super_t::allocate(n);
      }
    
      // customize deallocate
      void deallocate(pointer p, size_type n)
      {
        std::cout << "my_allocator::deallocate(): Hello, world!" << std::endl;
    
        // defer to the base class to deallocate n elements of type T at address p
        // in practice, you'd do something more interesting here
        super_t::deallocate(p,n);
      }
    };
    
    int main()
    {
      // create a device_vector which uses my_allocator
      thrust::device_vector<int, my_allocator<int> > vec;
    
      // create 10 ints
      vec.resize(10, 13);
    
      return 0;
    }
    

    Here's the output:

    $ nvcc my_allocator_test.cu -arch=sm_20 -run
    my_allocator::allocate(): Hello, world!
    my_allocator::deallocate(): Hello, world!
    

    In this example, note that we hear from my_allocator::allocate() once upon vec.resize(10,13). my_allocator::deallocate() is invoked once when vec goes out of scope as it destroys its elements.

    0 讨论(0)
提交回复
热议问题