How to get rid of a CUDA out of memory error without having to restart the machine?
You could try using torch.cuda.empty_cache(), since PyTorch is the process occupying the CUDA memory.
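A minimal sketch of that suggestion, assuming PyTorch is installed (the function name here is illustrative, not a PyTorch API). Note that empty_cache() can only return *unused* cached blocks to the driver, so any tensors you still hold references to must be deleted first:

```python
import gc

try:
    import torch  # assumption: PyTorch is available, as in the question
except ImportError:
    torch = None

def release_cached_gpu_memory():
    """Best-effort release of PyTorch's cached CUDA memory.

    empty_cache() cannot free memory that live Python objects still
    reference, so drop your own tensors first (e.g. `del outputs`).
    """
    gc.collect()  # collect tensors whose references were already dropped
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # hand cached, unused blocks back to the driver
        return True
    return False  # no CUDA device (or no PyTorch) in this environment

release_cached_gpu_memory()
```

This frees memory inside the same process; it will not reclaim memory held by a different process, which is the limitation discussed in the comments below.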
Mona Jalal
Updated on September 18, 2022

Comments
-
Mona Jalal over 1 year
Is there a hack in Ubuntu 20.04 to get rid of the following CUDA out of memory error without having to restart the machine?
RuntimeError: CUDA out of memory. Tried to allocate 40.00 MiB (GPU 0; 7.80 GiB total capacity; 6.34 GiB already allocated; 32.44 MiB free; 6.54 GiB reserved in total by PyTorch)
I understand that the following works but then also kills my Jupyter notebook. Is there a way to free up memory in GPU without having to kill the Jupyter notebook?
(base) mona@mona:~/research/facial_landmark$ nvidia-smi
Tue Oct  6 20:28:05 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8     9W /  N/A |   7883MiB /  7982MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1306      G   /usr/lib/xorg/Xorg                255MiB |
|    0   N/A  N/A      1743      G   /usr/bin/gnome-shell              151MiB |
|    0   N/A  N/A      3273      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      3359      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      3844      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      4222      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      4587      C   ...mona/anaconda3/bin/python     7459MiB |
+-----------------------------------------------------------------------------+
(base) mona@mona:~/research/facial_landmark$ kill -9 4587
(base) mona@mona:~/research/facial_landmark$ nvidia-smi
Tue Oct  6 20:28:24 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8     9W /  N/A |    433MiB /  7982MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1306      G   /usr/lib/xorg/Xorg                255MiB |
|    0   N/A  N/A      1743      G   /usr/bin/gnome-shell              152MiB |
|    0   N/A  N/A      3273      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      3359      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      3844      G   /usr/lib/firefox/firefox            2MiB |
|    0   N/A  N/A      4222      G   /usr/lib/firefox/firefox            2MiB |
+-----------------------------------------------------------------------------+
(base) mona@mona:~/research/facial_landmark$
-
Jean Monet about 3 years
If, for example, I shut down my Jupyter kernel without first calling x.detach().cpu(), then del x, then torch.cuda.empty_cache(), it becomes impossible to free that memory from a different notebook. So the solution would not work. Astonished to see that in 2021 it's such a pain to delete stuff from CUDA memory.
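The cleanup order the comment describes can be sketched like this, assuming PyTorch is installed (`offload_and_free` and the tensor name `t` are illustrative, not a real API):

```python
import gc

try:
    import torch  # assumption: PyTorch, as in the comment's scenario
except ImportError:
    torch = None

def offload_and_free(t):
    """Copy a tensor to host memory, drop the GPU reference, then
    release the cache -- the order the comment recommends before
    shutting down a kernel. Safe on CPU tensors too, where .cpu()
    simply returns the same data."""
    cpu_copy = t.detach().cpu()   # keep the values in host memory
    del t                         # drop this reference to the CUDA tensor
    gc.collect()                  # ensure the tensor object is collected
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # return the freed blocks to the driver
    return cpu_copy
```

Note that `del t` only removes the function's local name; for the GPU memory to actually be reclaimed, the caller must also delete every other reference they hold, which is why a crashed or abandoned kernel keeps its memory until the process exits.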