How does CUDA assign device IDs to GPUs?
Solution 1
By default, CUDA picks the fastest device as device 0, so when you swap GPUs in and out the ordering might change completely. It might be better to select GPUs by their PCI bus ID using:
cudaError_t cudaDeviceGetByPCIBusId ( int* device, const char* pciBusId )
Returns a handle to a compute device.
cudaError_t cudaDeviceGetPCIBusId ( char* pciBusId, int len, int device )
Returns a PCI Bus Id string for the device.
or, in the CUDA Driver API, cuDeviceGetByPCIBusId and cuDeviceGetPCIBusId.
But IMO the most reliable way to know which device is which is to use NVML or nvidia-smi to get each device's unique identifier (UUID) with nvmlDeviceGetUUID, and then match it to a CUDA device by PCI bus ID using nvmlDeviceGetPciInfo.
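The matching step can be sketched in Python by parsing nvidia-smi output instead of calling NVML directly. This is a minimal sketch, assuming nvidia-smi is on the PATH and supports the query fields pci.bus_id, uuid and name. The helper names and the sample rows are hypothetical. The bus ID strings reported by different tools may differ in letter case and in the number of domain digits, so they are normalized before comparison.

```python
import subprocess

def normalize_pci_bus_id(bus_id):
    """Return a canonical 'dddd:bb:dd.f' form: lowercase, 4-digit domain."""
    parts = bus_id.strip().lower().split(":")
    if len(parts) == 2:            # no domain given, assume domain 0
        parts = ["0"] + parts
    domain, bus, dev_fn = parts
    return f"{int(domain, 16):04x}:{bus}:{dev_fn}"

def parse_gpu_table(csv_text):
    """Map normalized PCI bus ID -> (uuid, name) from nvidia-smi CSV rows."""
    table = {}
    for line in csv_text.strip().splitlines():
        bus_id, uuid, name = [field.strip() for field in line.split(",", 2)]
        table[normalize_pci_bus_id(bus_id)] = (uuid, name)
    return table

def query_gpu_table():
    """Run nvidia-smi and build the table (needs an NVIDIA driver installed)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=pci.bus_id,uuid,name",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_table(out)

# Hypothetical sample rows, in the shape nvidia-smi prints them.
SAMPLE = """\
00000000:01:00.0, GPU-aaaaaaaa-0000-0000-0000-000000000001, Hypothetical GPU A
00000000:02:00.0, GPU-bbbbbbbb-0000-0000-0000-000000000002, Hypothetical GPU B
"""
```

The string returned by cudaDeviceGetPCIBusId for a given CUDA device can be passed through normalize_pci_bus_id and looked up in this table to find that device's UUID.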
Solution 2
Set the environment variable CUDA_DEVICE_ORDER as:
export CUDA_DEVICE_ORDER=PCI_BUS_ID
Then the GPU IDs will be ordered by PCI bus ID.
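If you would rather not export the variable for the whole shell session, it can be set for a single child process. The following is a minimal sketch. The helper names cuda_env and run_with_pci_order are hypothetical, and the optional CUDA_VISIBLE_DEVICES handling is an addition for illustration.

```python
import os
import subprocess

def cuda_env(visible_devices=None, base_env=None):
    """Return a copy of the environment with PCI bus ordering enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    if visible_devices is not None:
        # Comma-separated device indices, e.g. [0, 3] -> "0,3"
        env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in visible_devices)
    return env

def run_with_pci_order(cmd, visible_devices=None):
    """Launch cmd (a list of arguments) with the modified environment."""
    return subprocess.run(cmd, env=cuda_env(visible_devices), check=True)
```

For example, run_with_pci_order(["./my_cuda_app"], visible_devices=[0, 3]) would start a (hypothetical) binary with only the GPUs at PCI-ordered indices 0 and 3 visible, leaving the parent shell's environment untouched.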
Solution 3
The CUDA Support/Choosing a GPU page suggests that
when running a CUDA program on a machine with multiple GPUs, by default CUDA kernels will execute on whichever GPU is installed in the primary graphics card slot.
Also, the discussion at No GPU selected, code working properly, how's this possible? suggests that CUDA does not map the "best" card to device 0 in general.
EDIT
Today I set up a PC with a Tesla C2050 card for computation and an 8084 GS card for visualization, swapping their positions between the first two PCI-E slots. Running deviceQuery each time, I noticed that GPU 0 is always the card in the first PCI-E slot and GPU 1 is always the card in the second. I do not know whether this holds in general, but it shows that on my system GPUs are numbered according to their slot positions, not according to their "power".
Solution 4
The best solution I have found (tested in tensorflow==2.3.0) is to add the following before anything that may import tensorflow:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0,3" # specify which GPU(s) to be used
This way, the order in which TensorFlow enumerates the GPUs will be the same as that reported by tools such as nvidia-smi or nvtop.
Comments
-
solvingPuzzles about 3 years
When a computer has multiple CUDA-capable GPUs, each GPU is assigned a device ID. By default, CUDA kernels execute on device ID 0. You can use cudaSetDevice(int device) to select a different device.
Let's say I have two GPUs in my machine: a GTX 480 and a GTX 670. How does CUDA decide which GPU is device ID 0 and which GPU is device ID 1?
Ideas for how CUDA might assign device IDs (just brainstorming):
- descending order of compute capability
- PCI slot number
- date/time when the device was added to system (device that was just added to computer is higher ID number)
Motivation: I'm working on some HPC algorithms, and I'm benchmarking and autotuning them for several GPUs. My processor has enough PCIe lanes to drive cudaMemcpys to 3 GPUs at full bandwidth. So, instead of constantly swapping GPUs in and out of my machine, I'm planning to just keep 3 GPUs in my computer. I'd like to be able to predict what will happen when I add or replace some GPUs in the computer.