Get GPU memory usage programmatically

c++ cuda opencl gpu

15,320

Solution 1

cudaMemGetInfo requires nothing other than the CUDA runtime API. It returns the free memory and total memory on the current device.
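A minimal sketch of that call, in case it helps. The names GpuMemory, query_cuda_memory and HAVE_CUDA_RUNTIME are made up for this example; only cudaMemGetInfo and cudaSuccess come from the CUDA runtime. The `__has_include` guard is my own addition so the file still compiles on a machine without the CUDA toolkit, in which case the function simply reports failure.

```cpp
#include <cstddef>

// Pull in the CUDA runtime header only if the toolkit is installed.
#if defined(__has_include)
#  if __has_include(<cuda_runtime.h>)
#    include <cuda_runtime.h>
#    define HAVE_CUDA_RUNTIME 1  // hypothetical macro name, for this sketch only
#  endif
#endif

// Hypothetical helper type: both values are in bytes.
struct GpuMemory {
    std::size_t free_bytes;
    std::size_t total_bytes;
};

// Queries the current CUDA device. Returns false if the program was built
// without the CUDA headers or if cudaMemGetInfo reports an error.
bool query_cuda_memory(GpuMemory& out) {
#ifdef HAVE_CUDA_RUNTIME
    std::size_t free_bytes = 0;
    std::size_t total_bytes = 0;
    if (cudaMemGetInfo(&free_bytes, &total_bytes) != cudaSuccess) {
        return false;
    }
    out.free_bytes = free_bytes;
    out.total_bytes = total_bytes;
    return true;
#else
    (void)out;
    return false;  // built without the CUDA toolkit headers
#endif
}
```

Used memory is then `total_bytes - free_bytes`. When the toolkit is present, build with nvcc or link against the CUDA runtime library (cudart).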

And as Erik pointed out, there is similar functionality in NVML.

Solution 2

D3DKMTQueryStatistics is what you need.

A similar question has been asked here: How to query GPU Usage in DirectX?

Solution 3

Check out the function nvmlDeviceGetMemoryInfo in the NVIDIA Management Library (NVML), https://developer.nvidia.com/nvidia-management-library-nvml:

"Retrieves the amount of used, free and total memory available on the device, in bytes."

I don't know whether AMD has something equivalent.
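Here is a rough sketch of how the NVML call could be used without installing any SDK, by loading the library at run time (LoadLibraryA/GetProcAddress on Windows, dlopen/dlsym elsewhere). The assumptions: the library is found under the name nvml.dll or libnvidia-ml.so.1, the struct below mirrors nvmlMemory_t from nvml.h, and the NVML return code can be read as an int with 0 meaning success. NvmlMemory, query_nvml_memory and the small loader helpers are names invented for this example. Note that it enumerates GPUs with nvmlDeviceGetCount; nvmlUnitGetCount counts S-class units, not GPUs.

```cpp
#include <vector>

#ifdef _WIN32
#  include <windows.h>
typedef HMODULE LibHandle;
static LibHandle open_lib(const char* name) { return LoadLibraryA(name); }
static void* find_sym(LibHandle h, const char* s) {
    return reinterpret_cast<void*>(GetProcAddress(h, s));
}
static void close_lib(LibHandle h) { FreeLibrary(h); }
static const char* const kNvmlName = "nvml.dll";  // assumed library name
#else
#  include <dlfcn.h>
typedef void* LibHandle;
static LibHandle open_lib(const char* name) { return dlopen(name, RTLD_NOW); }
static void* find_sym(LibHandle h, const char* s) { return dlsym(h, s); }
static void close_lib(LibHandle h) { dlclose(h); }
static const char* const kNvmlName = "libnvidia-ml.so.1";  // assumed library name
#endif

// Assumed to mirror the layout of nvmlMemory_t in nvml.h (values in bytes).
struct NvmlMemory {
    unsigned long long total;
    unsigned long long free;
    unsigned long long used;
};

typedef void* NvmlDevice;  // nvmlDevice_t is an opaque pointer
typedef int (*InitFn)();
typedef int (*ShutdownFn)();
typedef int (*CountFn)(unsigned int*);
typedef int (*HandleFn)(unsigned int, NvmlDevice*);
typedef int (*MemInfoFn)(NvmlDevice, NvmlMemory*);

// Fills `out` with one entry per NVIDIA GPU. Returns false (and leaves `out`
// empty) if NVML cannot be loaded or any NVML call fails.
bool query_nvml_memory(std::vector<NvmlMemory>& out) {
    out.clear();
    LibHandle lib = open_lib(kNvmlName);
    if (!lib) return false;

    InitFn init = reinterpret_cast<InitFn>(find_sym(lib, "nvmlInit_v2"));
    ShutdownFn shutdown = reinterpret_cast<ShutdownFn>(find_sym(lib, "nvmlShutdown"));
    CountFn count = reinterpret_cast<CountFn>(find_sym(lib, "nvmlDeviceGetCount_v2"));
    HandleFn handle = reinterpret_cast<HandleFn>(find_sym(lib, "nvmlDeviceGetHandleByIndex_v2"));
    MemInfoFn mem_info = reinterpret_cast<MemInfoFn>(find_sym(lib, "nvmlDeviceGetMemoryInfo"));

    bool ok = init && shutdown && count && handle && mem_info && init() == 0;
    if (ok) {
        unsigned int n = 0;
        ok = (count(&n) == 0);
        for (unsigned int i = 0; ok && i < n; ++i) {
            NvmlDevice dev = 0;
            NvmlMemory mem = {0, 0, 0};
            ok = (handle(i, &dev) == 0) && (mem_info(dev, &mem) == 0);
            if (ok) out.push_back(mem);
        }
        shutdown();  // NVML was initialised successfully, so release it
    }
    if (!ok) out.clear();
    close_lib(lib);
    return ok;
}
```

Call `query_nvml_memory(gpus)` and read `gpus[i].used`, `gpus[i].free` and `gpus[i].total`. On Linux systems with an older glibc you may need to link with -ldl.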


roboto1986

Updated on September 23, 2022

Comments

roboto1986 over 1 year
I'm looking for a reliable way to determine current GPU memory usage, preferably in C++/C. I have found many ways of obtaining usage, such as the following methods:
- Direct Draw
- DxDiag
- WMI
- DXGI
- D3D9
Those methods are not accurate enough (most are off by a hundred megabytes). I tried nvapi.h but didn't see anything I could use to query memory. I thought the methods listed above were the only options, but then I ran into a tool called GPU-Z that gives me accurate memory readings to the nearest megabyte, even when OpenCL runs at almost full load on my 580 GTX. I can verify that I am at the peak of my memory usage by allocating a few more megabytes, after which OpenCL returns an Object_Allocation failure return code.

Looking at the imports from GPU-Z, I see nothing interesting other than:

kernel32.dll: LoadLibraryA, GetProcAddress, VirtualAlloc, VirtualFree

My guess is that LoadLibraryA is used to load a DLL for querying the GPU memory and sensors. If this DLL exists, where does it live? I'm looking for a solution for AMD and NVIDIA if possible (using different APIs is OK).
- Brian Cain almost 11 years
  
  "most off by a hundred megabytes" -- what's the known good reference that you're using?
- roboto1986 almost 11 years
  
  It's good to scrutinize GPU-Z (as I have also done), but as I mentioned in my post, when I am near the top of my memory usage on my 580 GTX, which appears to have a 3GB limit, I get an allocation failure with OpenCL. I also see that creating a context for my GPU occupies 60MB, and when my GPU is not in use I get 0MB of memory usage (my 580 only computes, while a 440 GTX drives the display). GPU-Z could very well be wrong, but why is it different from the other methods? I also know from my algorithm how much each section of my code allocates, and it is consistent with the GPU-Z readings.
roboto1986 almost 11 years

I have not seen this before, thank you. I will need time to test it :)
roboto1986 almost 11 years

Thank you! I will try this. I'll mark the correct answer soon.
roboto1986 almost 11 years

Thanks for the info. However, I was not able to get a proper device count using nvmlUnitGetCount() which returned 0 devices. I called nvmlInit() and the return status was successful and then I followed up with nvmlUnitGetCount() and its returned status was also successful but it returned 0 devices. Any ideas?
roboto1986 almost 11 years

I tried it out and this method gives me the precision I want without having to install other SDKs :) Now, I'm still out of luck for ATI cards. If you have an idea, please let me know. Otherwise, I might just go with the DX methods.
Erik Smistad almost 11 years

When OpenCL gives me 0 devices/platforms, I reinstall the display driver, which usually works. Other than that I have no suggestions. Sorry.