I've got a problem that I want to split across multiple CUDA devices, but I suspect my current system architecture is holding me back. What I've set up is a GPU class.
You need to get all your bananas lined up on the CUDA side of things first, then think about the best way to get this done in Python [shameless rep whoring, I know].
The CUDA multi-GPU model is pretty straightforward pre-4.0 - each GPU has its own context, and each context must be established by a different host thread. So the idea, in pseudocode, is roughly this (one worker thread per device):
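    launch one host thread per GPU
    each thread creates a context on its assigned device
    each thread does its share of the work (allocate, copy, launch, copy back) inside that context
    each thread detaches its context and exits
    the main thread joins the workers and collects the results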
In Python, this might look something like this:
import threading
from pycuda import driver

class gpuThread(threading.Thread):
    def __init__(self, gpuid):
        threading.Thread.__init__(self)
        # Create a context on the assigned device
        self.ctx = driver.Device(gpuid).make_context()
        self.device = self.ctx.get_device()

    def run(self):
        print "%s has device %s, api version %s" \
            % (self.getName(), self.device.name(), self.ctx.get_api_version())
        # Profit!

    def join(self):
        # Tear down the context before joining the thread
        self.ctx.detach()
        threading.Thread.join(self)

driver.init()
ngpus = driver.Device.count()
for i in range(ngpus):
    t = gpuThread(i)
    t.start()
    t.join()
This assumes it is safe to just establish a context without any checking of the device beforehand. Ideally you would check the compute mode to make sure it is safe to try, then use an exception handler in case a device is busy. But hopefully this gives the basic idea.
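As a rough sketch of what that guarded version might look like (the helper name try_make_context is just illustrative; the attribute, enum and exception it uses are PyCUDA's driver.device_attribute.COMPUTE_MODE, driver.compute_mode and driver.Error):

from pycuda import driver

def try_make_context(gpuid):
    # Return a context on device `gpuid`, or None if the device
    # should not (or cannot) be used from this process.
    device = driver.Device(gpuid)
    # A device in prohibited compute mode will never accept a new context
    mode = device.get_attribute(driver.device_attribute.COMPUTE_MODE)
    if mode == driver.compute_mode.PROHIBITED:
        return None
    try:
        # This can still fail, e.g. an exclusive-mode device already in use
        return device.make_context()
    except driver.Error:
        return None

gpuThread.__init__ could then call something like this instead of make_context() directly, and skip any device that comes back as None.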