Will array_view.synchronize_async wait for parallel_for_each completion?

醉话见心 2021-01-25 17:42

If I have a concurrency::array_view being operated on in a concurrency::parallel_for_each loop, my understanding is that I can continue other tasks on

1 answer
  • I don't like the way this has been explained. A better way to think about it is that parallel_for_each queues work to the GPU, so it returns almost immediately. There are numerous ways that your CPU-side code can block until the queued work is complete, for example by explicitly calling synchronize or by accessing data from one of the array_view instances used within the parallel_for_each.

    #include <amp.h>
    using namespace concurrency;
    
    array_view<int> av;  // placeholder: bind av to real data in actual code
    parallel_for_each(extent<1>(number), [=](index<1> idx) restrict(amp)
    {
      // Queue (or schedule if you like) some intense computations on av
    });
    

    Host code can execute now. The AMP computations may or may not have started. If the code here accesses av, it will block until the work on the GPU is complete and the data in av has been written and can be synchronized with the host memory.

    The next call returns a future, so it is also a scheduled task. It is not guaranteed to execute at any particular point. Should it be scheduled, it will block the thread it is running on until av is correctly synchronized with the host memory (as above).

    completion_future waitOnThis = av.synchronize_async();
    

    More host code can execute here. If the host code accesses av, it will block until the parallel_for_each has completed (as above). At some point the runtime will execute the future and block until av has synchronized with the host memory. If av is writable and has been changed, it will be copied back to the host memory.

    waitOnThis.wait();
    

    The call to wait will block until the future has completed (prior to calling wait there is no guarantee that anything has actually executed). At this point you are guaranteed that the GPU calculations are complete and that av can be accessed on the CPU.

    Having said all that, adding the waitOnThis future seems to overcomplicate matters. A simpler version:

    array_view<int> av;  // placeholder: bind av to real data in actual code
    parallel_for_each(extent<1>(number), [=](index<1> idx) restrict(amp)
    {
      // do some intense computations on av on the GPU
    });
    
    // do some independent CPU computation here.
    
    av.synchronize();
    
    // do some computation on the CPU that relies on av here.
    

    The MSDN docs aren't very good on this topic. The following blog post is better. There are some other posts on the async APIs on the same blog.
