问题
Assume on a single node, there are several devcies with different compute capabilities, how nvidia rank them (by rank I mean the number assigned by cudaSetDevice)?
Are there any general guideline about this? thanks.
回答1:
I believe the ordering of devices corresponding to cudaGetDevice and cudaSetDevice (i.e. the CUDA runtime enumeration order should be either based on a heuristic that determines the fastest device and makes it first or else based on PCI enumeration order. You can confirm this using the deviceQuery sample, which prints the properties of devices (including PCI ID) based on the order they get enumerated in for cudaSetDevice.
However I would recommend not to base any decisions on this. There's nothing magical about PCI enumeration order, and even things like a system BIOS upgrade can change the device enumeration order (as can swapping devices, moving to another system, etc.)
It's usually best to query devices (see the deviceQuery sample) and then make decisions based on the specific devices returned and/or their properties. You can also use cudaChooseDevice to select a device heuristically.
You can cause the CUDA runtime to choose either "Faster First" or "PCI Enumeration Order" based on the setting (or lack of) an environment variable in CUDA 8.
来源:https://stackoverflow.com/questions/15961878/how-do-the-nvidia-drivers-assign-device-indices-to-gpus