How to limit memory usage within a Python process


resource.RLIMIT_VMEM is the resource corresponding to ulimit -v.

RLIMIT_DATA only affects brk/sbrk system calls while newer memory managers tend to use mmap instead.

The second thing to note is that ulimit/setrlimit only affects the current process and its future children.
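As a rough illustration of that point, the sketch below caps the address space of a child process only and leaves the parent untouched. The helper names run_with_memory_limit and demo_child_limit are made up for this example, and the code assumes a POSIX platform where resource.RLIMIT_AS is defined (as on Linux). The same mechanism is why running ulimit through a separate shell call has no effect on the calling script: the limit lands in the new process.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(cmd, max_bytes):
    """Run cmd in a child whose address space is capped at max_bytes.

    Hypothetical helper for illustration; assumes resource.RLIMIT_AS exists.
    Returns the child's exit code.
    """
    def set_limit():
        # Runs in the child after fork() and before exec(), so the limit
        # applies to the child and its descendants, not to the parent.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = max_bytes
        if hard != resource.RLIM_INFINITY:
            # The soft limit may not exceed the hard limit.
            soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    # preexec_fn is POSIX-only and is best avoided in multithreaded programs.
    return subprocess.call(cmd, preexec_fn=set_limit)


def demo_child_limit():
    """Start a child that tries to allocate 2 GiB under a 1 GiB cap."""
    one_gib = 1024 ** 3
    child = [sys.executable, "-c", "bytearray(2 * 1024**3)"]
    return run_with_memory_limit(child, one_gib)
```

In this sketch the child is expected to fail with a MemoryError and a non-zero exit code, while resource.getrlimit(resource.RLIMIT_AS) in the parent reports the same values before and after the call.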

Regarding the AttributeError: 'module' object has no attribute 'RLIMIT_VMEM' message: the resource module docs mention this possibility:

This module does not attempt to mask platform differences — symbols not defined for a platform will not be available from this module on that platform.

According to the bash ulimit source linked to above, it uses RLIMIT_AS if RLIMIT_VMEM is not defined.
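A minimal sketch of the same fallback in Python, assuming the limit should apply to the current process: it uses RLIMIT_VMEM where the platform defines it and RLIMIT_AS otherwise. The names ADDRESS_SPACE_LIMIT and limit_memory are invented for this example and are not part of the resource module.

```python
import resource

# Prefer RLIMIT_VMEM where the platform provides it; otherwise fall back
# to RLIMIT_AS, mirroring what the answer says bash's ulimit does.
ADDRESS_SPACE_LIMIT = getattr(resource, "RLIMIT_VMEM", None)
if ADDRESS_SPACE_LIMIT is None:
    ADDRESS_SPACE_LIMIT = resource.RLIMIT_AS


def limit_memory(max_bytes):
    """Set the soft address-space limit of the current process, in bytes.

    Hypothetical helper; the hard limit is left unchanged so that the
    soft limit can still be raised again later.
    """
    _, hard = resource.getrlimit(ADDRESS_SPACE_LIMIT)
    resource.setrlimit(ADDRESS_SPACE_LIMIT, (max_bytes, hard))
```

Note the units: setrlimit takes bytes, while bash's ulimit -v takes units of 1024 bytes. Calling limit_memory(12000000 * 1024) near the top of the script would therefore roughly correspond to ulimit -v 12000000. Once the limit is hit, allocations fail and Python raises MemoryError, which the script can catch or let terminate the process.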


Author: Arne

I am interested in Cloud Services and Machine Learning, and generally programming in Python. When I am not writing code you can find me cooking, running, hiking, reading, doodling, writing, or loafing.

Updated on September 18, 2022

Comments

  • Arne
    Arne over 1 year

    I run Python 2.7 on a 64-bit Linux machine with 16 GB of RAM. A Python script I wrote can load too much data into memory, which slows the machine down to the point where I cannot even kill the process any more.

    While I can limit memory by calling:

    ulimit -v 12000000
    

    in my shell before running the script, I'd like to include a limiting option in the script itself. Everywhere I looked, the resource module is cited as having the same power as ulimit. But calling:

    import resource
    _, hard = resource.getrlimit(resource.RLIMIT_DATA)
    resource.setrlimit(resource.RLIMIT_DATA, (12000, hard))
    

    at the beginning of my script does absolutely nothing. Even setting the value as low as 12000 never crashed the process. I tried the same with RLIMIT_STACK as well, with the same result. Curiously, calling:

    import subprocess
    subprocess.call('ulimit -v 12000', shell=True)
    

    does nothing as well.

    What am I doing wrong? I couldn't find any actual usage examples online.


    edit: For anyone who is curious, using subprocess.call doesn't work because it creates a (surprise, surprise!) new process, which is independent of the one the current python program runs in.

    • TigerhawkT3
      TigerhawkT3 almost 9 years
      Is there any room to make the program more memory-efficient?
    • Arne
      Arne almost 9 years
      There is, but that will take a while. At the moment, I need to test it and make sure that it doesn't shut the computer down. And having a fail-safe for the memory will be useful later, too.
    • TigerhawkT3
      TigerhawkT3 almost 9 years
      Since it's in Python 2.7, how about switching to Python 3 and using a 2-to-3 converter on your program? Python 3 has several performance improvements over Python 2, some of which are memory-related.
    • Arne
      Arne almost 9 years
      I will do that -- but at this point, I am just curious if or how limiting memory works in python.
    • oxymor0n
      oxymor0n almost 9 years
      can't you control what you load into memory? i mean, isn't it YOUR script?
    • Arne
      Arne almost 9 years
      It's connected to a constant stream (from the Twitter Streaming API), so.. control is not trivial. I will control it at some point, but right now I just want it not to crash.
    • Gordon Bean
      Gordon Bean over 8 years
      This issue comes up in interactive data analysis all the time - you load a large array (say, 8GB) and start your work. Then you inadvertently square the array (a typo in your code, or misunderstanding an API, etc) and now the script requests 64 GB and the system freezes. :P You would much rather have the process killed than restart your computer.
  • Arne
    Arne almost 9 years
    I don't use multithreading, so I hope that is not the problem. But when I enter RLIMIT_VMEM, I get the following error message:

    Traceback (most recent call last):
      File "my_script.py", line 417, in <module>
        sys.exit(main())
      File "my_script.py", line 391, in main
        _, hard = resource.getrlimit(resource.RLIMIT_VMEM)
    AttributeError: 'module' object has no attribute 'RLIMIT_VMEM'

    From the list you referenced, all fields could be found -- except this one. I am trying to run it with Python 3.x right now.