GPU usage monitoring (CUDA)
Solution 1
For NVIDIA GPUs there is the tool nvidia-smi
that can show memory usage, GPU utilization, and GPU temperature. It also lists compute processes and a few more options, but my graphics card (GeForce 9600 GT) is not fully supported.
Sun May 13 20:02:49 2012
+------------------------------------------------------+
| NVIDIA-SMI 3.295.40 Driver Version: 295.40 |
|-------------------------------+----------------------+----------------------+
| Nb. Name | Bus Id Disp. | Volatile ECC SB / DB |
| Fan Temp Power Usage /Cap | Memory Usage | GPU Util. Compute M. |
|===============================+======================+======================|
| 0. GeForce 9600 GT | 0000:01:00.0 N/A | N/A N/A |
| 0% 51 C N/A N/A / N/A | 90% 459MB / 511MB | N/A Default |
|-------------------------------+----------------------+----------------------|
| Compute processes: GPU Memory |
| GPU PID Process name Usage |
|=============================================================================|
| 0. Not Supported |
+-----------------------------------------------------------------------------+
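On cards that the tool fully supports, nvidia-smi can also emit machine-readable output via its --query-gpu interface, which is handy for scripting. A minimal sketch, assuming a supported card; the sample line below is hypothetical, standing in for real output:

```shell
# Real invocation (requires a supported NVIDIA card and driver):
#   nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
# Hypothetical sample line standing in for that output:
sample="90, 459, 511"                        # util %, used MiB, total MiB
util=$(echo "$sample"  | cut -d',' -f1 | tr -d ' ')
used=$(echo "$sample"  | cut -d',' -f2 | tr -d ' ')
total=$(echo "$sample" | cut -d',' -f3 | tr -d ' ')
echo "GPU ${util}% busy, ${used}/${total} MiB used"
```

The same query fields can be combined with the -l loop option for continuous logging.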
Solution 2
On Linux, nvidia-smi -l 1
will continually print the GPU usage info, with a refresh interval of 1 second.
Solution 3
Recently I wrote a simple command-line utility called gpustat
(a wrapper around nvidia-smi
): please take a look at https://github.com/wookayin/gpustat.
Solution 4
For Intel GPUs there is the intel-gpu-tools
package from the http://intellinuxgraphics.org/ project, which provides the command intel_gpu_top
(among other things). It is similar to top
and htop
, but specific to the Intel GPU.
render busy: 18%: ███▋ render space: 39/131072
bitstream busy: 0%: bitstream space: 0/131072
blitter busy: 28%: █████▋ blitter space: 28/131072
task percent busy
GAM: 33%: ██████▋ vert fetch: 0 (0/sec)
GAFS: 3%: ▋ prim fetch: 0 (0/sec)
VS: 0%: VS invocations: 559188 (150/sec)
SF: 0%: GS invocations: 0 (0/sec)
VF: 0%: GS prims: 0 (0/sec)
DS: 0%: CL invocations: 186396 (50/sec)
CL: 0%: CL prims: 186396 (50/sec)
SOL: 0%: PS invocations: 8191776208 (38576436/sec)
GS: 0%: PS depth pass: 8158502721 (38487525/sec)
HS: 0%:
TE: 0%:
GAFM: 0%:
SVG: 0%:
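intel_gpu_top reads GPU performance counters, so it usually has to run as root. A minimal guarded invocation, written as a sketch so it degrades gracefully on machines where the intel-gpu-tools package is absent:

```shell
# Check for intel_gpu_top before trying to run it; on machines without
# intel-gpu-tools this falls through to a message instead of failing.
if command -v intel_gpu_top >/dev/null 2>&1; then
    status="installed"
    # sudo intel_gpu_top      # interactive, top-like display as shown above
else
    status="not installed"
fi
echo "intel_gpu_top is ${status}"
```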
Solution 5
nvidia-smi
does not work on some Linux machines (it returns N/A for many properties). You can use nvidia-settings
instead (this is also what mat kelcey used in his Python script).
nvidia-settings -q GPUUtilization -q useddedicatedgpumemory
You can also use:
watch -n0.1 "nvidia-settings -q GPUUtilization -q useddedicatedgpumemory"
for continuous monitoring.
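With the -t (terse) flag, nvidia-settings prints only the attribute value, which makes the output easy to parse in scripts. A sketch, assuming the GPUUtilization attribute returns a comma-separated string of the form shown in the hypothetical sample below:

```shell
# Real invocation (requires the NVIDIA driver and a running X server):
#   nvidia-settings -q GPUUtilization -t
# Hypothetical sample standing in for that output:
sample="graphics=12, memory=34, video=0, PCIe=1"
gfx=$(echo "$sample" | sed 's/.*graphics=\([0-9]*\).*/\1/')
echo "graphics utilization: ${gfx}%"
```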
pbm
Updated on September 18, 2022

Comments
- pbm over 1 year: I installed the CUDA toolkit on my computer and started a BOINC project on the GPU. In BOINC I can see that it is running on the GPU, but is there a tool that can show me more details about what is running on the GPU: GPU usage and memory usage?
- Raphael almost 12 years: My ION chip does not show usage, either. :/
- Graham Perrin over 11 years: For Linux and for some versions of Windows, but not for Unix: nvidia-smi ships with NVIDIA GPU display drivers on Linux, and with 64-bit Windows Server 2008 R2 and Windows 7.
- Score_Under almost 9 years: Glad this wasn't a comment. It's exactly what I was searching for when I came across this question.
- alexg over 8 years: Thanks, this is what worked for me, since I have a GeForce card which is not supported by nvidia-smi.
- ali_m over 8 years: I prefer to use watch -n 1 nvidia-smi to obtain continuous updates without filling the terminal with output.
- ruoho ruotsi over 8 years: Thanks man, good idea to query all, since each card may have different strings to monitor!
- Bar almost 8 years: watch -n 0.5 nvidia-smi will keep the output updated without filling your terminal with output.
- Victor Sergienko almost 7 years: nvidia-settings requires a running X11, which is not always the case.
- zeekvfu over 6 years: @Bar Good tip. watch -d -n 0.5 nvidia-smi will be even better.
- Mick T about 6 years: Using watch means you're starting a new process every second to poll the cards. Better to use -l, and not every second; I'd suggest every minute or every 5 minutes.
- SebMa about 6 years: Careful, I don't think the pmem given by ps takes into account the total memory of the GPU, but rather that of the CPU, because ps is not "Nvidia GPU" aware.
- Zoltan over 5 years: It would be worth adding memory.used (or memory.free) as well.
- donlucacorleone over 5 years: @zeekvfu I think it'd be better to explain what the -d flag does.
- Jay Mayer over 5 years: @donlucacorleone man watch tells us the -d flag highlights differences between the outputs, so it can aid in highlighting which metrics are changing over time.
- Douglas Daseeco over 5 years: If the above format is too black-box (un-configurable) for you, or you find the screen too busy, try unix.stackexchange.com/questions/38560/….
- hLk almost 5 years: If you just want the number and nothing else (e.g. for conky), use this: nvidia-settings -q [gpu:0]/UsedDedicatedGPUMemory -t
- Jaleks about 4 years: instead of watch -n 0.5 nvidia-smi you can also use nvidia-smi --loop-ms=500
- Ricky Robinson about 4 years: Is starting a new process at that rate so detrimental?
- Kokizzu almost 4 years: I'm using the official Radeon proprietary driver, but the aticonfig command does not exist '__')
- Kevin almost 4 years: @Kokizzu 7 years a long time makes, Linux changes a lot :)
- steve mais over 3 years: for me it gives: Unable to init server: Could not connect: Connection refused
- starbeamrainbowlabs almost 3 years: nvtop is broken on Ubuntu 20.10
- Sudharsan Madhavan almost 3 years: Works like a charm with just a conda virtual environment, without sudo access, if anyone is looking for a solution WITHOUT admin access.
- Xuehai Pan almost 3 years: For non-sudo users, pip install nvitop will install into ~/.local/bin by default. Users can add the --user option to pip explicitly to make a user-wide install. Then you may need to add ~/.local/bin to your PATH environment variable. If there is no system Python installed, you can use Linuxbrew or conda to install Python in your home directory.
- Hyperplane over 2 years: I really like this because it shows time series for both CPU & GPU. Many tools only show current usage, or a time series for one or the other, but not for both. 👍
- MSalters about 2 years: @Hossein: That might be because nvidia-settings looks at the X display variable $DISPLAY. On a GPGPU server, that won't work, if only because such servers typically have multiple GPUs.
- Admin almost 2 years: What does "Off" mean in this? ` 0 GeForce GT 730 Off `