How to limit memory usage within a Python process
The first thing to note is that resource.RLIMIT_VMEM is the resource corresponding to ulimit -v. RLIMIT_DATA, on the other hand, only affects the brk/sbrk system calls, while newer memory managers tend to use mmap instead, which is why setting RLIMIT_DATA may appear to have no effect.
The second thing to note is that ulimit/setrlimit only affects the current process and its future children.
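In other words, a limit set with resource.setrlimit in a Python process is inherited by children spawned afterwards. A minimal sketch (assuming Linux and RLIMIT_AS; the 4 GiB figure is an arbitrary example):

```python
import resource
import subprocess
import sys

# Cap the soft address-space limit at a hypothetical 4 GiB,
# leaving the hard limit as it is.
limit = 4 << 30
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

# A child started afterwards inherits the same soft limit.
child_soft = int(subprocess.check_output(
    [sys.executable, "-c",
     "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"]))
print(child_soft == limit)  # True: the child sees the parent's limit
```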
Regarding the AttributeError: 'module' object has no attribute 'RLIMIT_VMEM' message: the resource module docs mention this possibility:

    This module does not attempt to mask platform differences — symbols not defined for a platform will not be available from this module on that platform.
According to the bash ulimit source linked to above, it uses RLIMIT_AS if RLIMIT_VMEM is not defined.
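Putting this together, a sketch of an in-process limiter (the function name is my own; it prefers RLIMIT_VMEM when the platform defines it and falls back to RLIMIT_AS, mirroring bash's ulimit; assumes Linux):

```python
import resource

def limit_virtual_memory(max_bytes):
    # Use RLIMIT_VMEM where it exists, falling back to RLIMIT_AS,
    # just as bash's ulimit builtin does.
    rlimit = getattr(resource, "RLIMIT_VMEM", resource.RLIMIT_AS)
    soft, hard = resource.getrlimit(rlimit)
    resource.setrlimit(rlimit, (max_bytes, hard))

limit_virtual_memory(4 << 30)  # cap at a hypothetical 4 GiB

# An allocation far beyond the cap now fails fast with MemoryError
# instead of paging the whole machine to a halt.
refused = False
try:
    data = bytearray(64 << 30)
except MemoryError:
    refused = True
print("allocation refused:", refused)
```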
Updated on September 18, 2022

Comments
-
Arne over 1 year: I run Python 2.7 on a Linux machine with 16 GB RAM and a 64-bit OS. A Python script I wrote can load too much data into memory, which slows the machine down to the point where I cannot even kill the process any more.

While I can limit memory by calling

    ulimit -v 12000000

in my shell before running the script, I'd like to include a limiting option in the script itself. Everywhere I looked, the resource module is cited as having the same power as ulimit. But calling

    import resource
    _, hard = resource.getrlimit(resource.RLIMIT_DATA)
    resource.setrlimit(resource.RLIMIT_DATA, (12000, hard))

at the beginning of my script does absolutely nothing. Even setting the value as low as 12000 never crashed the process. I tried the same with RLIMIT_STACK, with the same result. Curiously, calling

    import subprocess
    subprocess.call('ulimit -v 12000', shell=True)

does nothing as well.

What am I doing wrong? I couldn't find any actual usage examples online.
edit: For anyone who is curious, using subprocess.call doesn't work because it creates a (surprise, surprise!) new process, which is independent of the one the current Python program runs in.
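A small sketch of why (using RLIMIT_AS, which is what ulimit -v adjusts on Linux): the ulimit only applies inside the short-lived shell child, so the parent's limits never change.

```python
import resource
import subprocess

before = resource.getrlimit(resource.RLIMIT_AS)
# The limit applies only inside this shell, which exits immediately.
subprocess.call("ulimit -v 12000", shell=True)
after = resource.getrlimit(resource.RLIMIT_AS)
print(before == after)  # True: the parent process is unaffected
```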
-
TigerhawkT3 almost 9 years: Is there any room to make the program more memory-efficient?
-
Arne almost 9 years: There is, but that will take a while. At the moment, I need to test it and make sure that it doesn't shut the computer down. And having a fail-safe for the memory will be useful later, too.
-
TigerhawkT3 almost 9 years: Since it's in Python 2.7, how about switching to Python 3 and using a 2-to-3 converter on your program? Python 3 has several performance improvements over Python 2, some of which are memory-related.
-
Arne almost 9 years: I will do that -- but at this point, I am just curious if or how limiting memory works in Python.
-
oxymor0n almost 9 years: Can't you control what you load into memory? I mean, isn't it YOUR script?
-
Arne almost 9 years: It's connected to a constant stream (from the Twitter Streaming API), so control is not trivial. I will control it at some point, but right now I just want it not to crash.
-
Gordon Bean over 8 years: This issue comes up in interactive data analysis all the time - you load a large array (say, 8 GB) and start your work. Then you inadvertently square the array (a typo in your code, or misunderstanding an API, etc.) and now the script requests 64 GB and the system freezes. :P You would much rather have the process killed than restart your computer.
-
Arne almost 9 years: I don't use multithreading, so I hope that is not the problem. But when I use RLIMIT_VMEM, I get the following error message:

    Traceback (most recent call last):
      File "my_script.py", line 417, in <module>
        sys.exit(main())
      File "my_script.py", line 391, in main
        _, hard = resource.getrlimit(resource.RLIMIT_VMEM)
    AttributeError: 'module' object has no attribute 'RLIMIT_VMEM'

From the list you referenced, all fields could be found -- except this one. I am trying to run it with Python 3.x right now.