CUDA streams and context

天涯浪人 2021-02-10 00:19

I am currently using an application that spawns a bunch of pthreads (Linux), and each of those creates its own CUDA context (using CUDA 3.2 right now).

The problem I

1 answer
  • 2021-02-10 00:33

    Each CUDA context does cost quite a bit of device memory, and their resources are strictly partitioned from one another. For example, device memory allocated in context A cannot be accessed by context B. Streams also are valid only in the context in which they were created.

    The best practice would be to create one CUDA context per device. By default, that CUDA context can be accessed only from the CPU thread that created it. If you want to access the CUDA context from other threads, call cuCtxPopCurrent() to pop it from the thread that created it. The context then can be pushed onto any other CPU thread's current context stack, and subsequent CUDA calls would reference that context.

    Context push/pop are lightweight operations, and as of CUDA 3.2 they can be done in CUDA runtime apps. So my suggestion would be to initialize the CUDA context, then call cuCtxPopCurrent() to make the context "floating" until some thread wants to operate on it. Consider the "floating" state to be the natural one: whenever a thread wants to manipulate the context, bracket its usage with cuCtxPushCurrent()/cuCtxPopCurrent().
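
    The push/pop bracketing described above can be sketched roughly as follows. This is a minimal illustration, not code from the original application: the names g_ctx, g_ctx_mutex, worker and the CHECK macro are hypothetical, std::thread stands in for the pthreads mentioned in the question, and the mutex is an assumption added so that only one thread has the context pushed at a time. It also assumes the three-argument form of cuCtxCreate() found in toolkits before CUDA 13, and that the program is linked against the driver library (for example with -lcuda).

    ```cpp
    #include <cuda.h>

    #include <cstdio>
    #include <cstdlib>
    #include <mutex>
    #include <thread>
    #include <vector>

    // Hypothetical helper: abort on any driver API error.
    #define CHECK(call)                                                     \
        do {                                                                \
            CUresult r_ = (call);                                           \
            if (r_ != CUDA_SUCCESS) {                                       \
                std::fprintf(stderr, "%s failed with CUresult %d\n", #call, \
                             static_cast<int>(r_));                         \
                std::exit(EXIT_FAILURE);                                    \
            }                                                               \
        } while (0)

    static CUcontext g_ctx;         // the single context for device 0
    static std::mutex g_ctx_mutex;  // assumption: serialize use of the context

    void worker(int id) {
        // Hold the lock for the whole bracket so the context is pushed
        // on at most one thread at any moment.
        std::lock_guard<std::mutex> lock(g_ctx_mutex);

        CHECK(cuCtxPushCurrent(g_ctx));  // attach the floating context

        // Any driver API work goes here; a small alloc/memset as a stand-in.
        CUdeviceptr buf;
        const size_t bytes = 1 << 20;
        CHECK(cuMemAlloc(&buf, bytes));
        CHECK(cuMemsetD8(buf, 0, bytes));
        CHECK(cuCtxSynchronize());
        CHECK(cuMemFree(buf));

        CUcontext popped;
        CHECK(cuCtxPopCurrent(&popped));  // back to the floating state

        std::printf("worker %d finished\n", id);
    }

    int main() {
        CHECK(cuInit(0));

        CUdevice dev;
        CHECK(cuDeviceGet(&dev, 0));

        // Creating the context makes it current on this thread...
        CHECK(cuCtxCreate(&g_ctx, 0, dev));

        // ...so pop it immediately to leave it floating.
        CUcontext popped;
        CHECK(cuCtxPopCurrent(&popped));

        std::vector<std::thread> threads;
        for (int i = 0; i < 4; ++i) {
            threads.emplace_back(worker, i);
        }
        for (auto& t : threads) {
            t.join();
        }

        CHECK(cuCtxDestroy(g_ctx));
        return 0;
    }
    ```

    Holding the lock across the whole push/pop bracket keeps the sketch simple, but it also serializes all GPU work issued by the workers; whether that is acceptable depends on the application.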
