Python MemoryError on large array

Solution 1

The problem is the size of the list you are generating (which has 50 billion elements, not 5 billion).

An int object takes 24 bytes on 64-bit Python 2 (sys.getsizeof(int(899999)), where 899999 is the upper limit of your random numbers; Python 3 reports 28), so the int objects alone would take 50,000,000,000 * 24 bytes = 1.2 * 10^12 bytes, which is about 1.09 TiB.

In other words, to create such a list you would need at least 1118 GiB of RAM in your computer, before even counting the list's own array of pointers.
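
As a rough check of that arithmetic, here is a small sketch. The 24-byte figure is the one quoted above and is treated as an assumption; the last line prints what your own interpreter reports, which may differ by version and platform.

```python
import sys

n = 50 * 10**9      # number of elements in the question
per_int = 24        # bytes per int object, as quoted above (assumption; varies by build)

total_bytes = n * per_int
gib = total_bytes / 2**30   # binary gigabytes
tib = total_bytes / 2**40   # binary terabytes

print(total_bytes)      # 1200000000000
print(round(gib))       # 1118
print(round(tib, 2))    # 1.09

# Size of one such int on the interpreter running this script
print(sys.getsizeof(899999))
```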

I don't know what your use case is, but you should consider a different approach to what you are trying to solve (maybe define a generator, or just don't store your numbers in memory and instead directly use the numbers in the for loop).
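
To illustrate the generator idea, here is a minimal sketch. The function name random_numbers and the running total/maximum are hypothetical stand-ins, since the question doesn't say what the numbers are for. Only one number is held in memory at a time.

```python
import random

def random_numbers(count, low=1, high=899999):
    """Yield `count` random ints one at a time instead of storing them all."""
    for _ in range(count):
        yield random.randint(low, high)

# A small count keeps this quick to run; the memory use stays constant
# no matter how large the count is, only the running time grows.
sample_size = 100000
total = 0
largest = 0
for value in random_numbers(sample_size):
    total += value
    largest = max(largest, value)

print(total / sample_size, largest)
```

With n = 50 billion this would still take a long time to run, but it would not raise MemoryError.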

Solution 2

Since other people have already answered your question, here's a quick tip for dealing with big numbers: in Python 3.6 and later you can use "_" to separate the digits of a number however you like:

n = 50_000_000_000

is the same as

n = 50000000000

but the former is much easier on the eyes.
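
A short sketch of the same idea, assuming Python 3.6 or later (where underscores in numeric literals were introduced). It also shows that the separator works for formatting and for parsing strings:

```python
n = 50_000_000_000
same = 50000000000
print(n == same)          # True: underscores do not change the value

# The separator is also available when formatting...
with_underscores = f"{n:_}"
with_commas = f"{n:,}"
print(with_underscores)   # 50_000_000_000
print(with_commas)        # 50,000,000,000

# ...and int() accepts it when parsing a string
parsed = int("50_000_000_000")
print(parsed)             # 50000000000
```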

Author: Adit Srivastava

Updated on August 16, 2022

Comments

  • Adit Srivastava
    Adit Srivastava over 1 year

    This is the python script that I'm trying to run:

    import random

    n = 50000000000  # 50 billion
    b = [0]*n
    for x in range(0,n):
        b[x] = random.randint(1,899999)
    

    ... But the output I'm getting is:

    E:\python\> python sort.py
    Traceback (most recent call last):
      File "E:\python\sort.py", line 8, in <module>
        b = [0]*n
    MemoryError
    

    So, what do I do now?

    • Torxed
      Torxed almost 7 years
      You're out of memory. The error message says so.
    • litelite
      litelite almost 7 years
      You are trying to allocate a 40GB array
    • chepner
      chepner almost 7 years
      An int takes at least 24 bytes; you have 5,000,000,000 ints. That's 111GB right there.
    • alexis
      alexis almost 7 years
      Maybe it would be more useful if you explained what you were planning to do with so many random numbers. Almost certainly there's a way to do it without blowing out your memory.
    • litelite
      litelite almost 7 years
      @chepner ints in Python seem to be 64 bits, so 8 bytes. 24 bytes would be one heck of a number.
    • chepner
      chepner almost 7 years
      @litelite No, an int is a Python object, representing an integer with arbitrary precision, not a machine word.
    • alexis
      alexis almost 7 years
      That's right, Python has overhead in order to keep track of types etc. You want an 8-byte int, use C.
    • litelite
      litelite almost 7 years
      @chepner Right, I'm not a pro in Python; I keep forgetting about that.
    • user2357112
      user2357112 almost 7 years
      Also you counted your zeros wrong, so n is 50 billion, not 5 billion.
    • alexis
      alexis almost 7 years
      Also, ints in Python are not even 64 bits; Python has unlimited-length integers, welcome to the modern world.
    • juanpa.arrivillaga
      juanpa.arrivillaga almost 7 years
      @alexis or numpy, or array.array
    • alexis
      alexis almost 7 years
      You mean to limit the memory footprint? Good point. Though the struct option isn't actually an int, it's just cargo...
    • chepner
      chepner almost 7 years
      Even if you have a terabyte of free memory, there's no need to initialize the list with zeros first; just use b = [random.randint(1,899999) for _ in range(n)].
    • juanpa.arrivillaga
      juanpa.arrivillaga almost 7 years
      @chepner well, in CPython at least, initializing with all zeros will be using the cached 0 object.
    • Admin
      Admin almost 7 years
      If you're not on Python 3, use xrange, and listen to the comments above. What is the purpose of initializing a list with zeros?
  • juanpa.arrivillaga
    juanpa.arrivillaga almost 7 years
    That's actually a lower limit generally. The smallest int takes that much, but int objects take variable amounts of memory, e.g. sys.getsizeof(1000000000000000000) == 32, whereas sys.getsizeof(1000) == 28.
  • Tenchi2xh
    Tenchi2xh almost 7 years
    @juanpa.arrivillaga In the author's question, the maximum attainable number is 899999, and int objects from 0 up to that number are all 24 bytes in size. (I updated the answer to show how this upper limit is calculated)
  • juanpa.arrivillaga
    juanpa.arrivillaga almost 7 years
    I'm getting 28 on my system... 64bit python 3.5, Mac OSX
  • Tenchi2xh
    Tenchi2xh almost 7 years
    Python 3 apparently has 4 more bytes of overhead compared to Python 2
  • chepner
    chepner almost 7 years
    @juanpa.arrivillaga For zero, you don't need to store any data. For anything with a larger magnitude, you need to store something.
  • chepner
    chepner almost 7 years
    Also, just to pile on, we've only been talking about the memory needed for the int objects themselves. The list is another 64 bytes plus 50,000,000,000 pointers (4-8 bytes, depending on your architecture) to reference each int object.
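
The comments above mention array.array and the list's per-element pointer overhead. As a rough sketch of the difference, the following compares a plain list with an array.array on a small sample. The sample size and variable names are illustrative assumptions, and typecode "I" is assumed to be 4 bytes wide, which is typical but not guaranteed (the script prints the actual item size).

```python
import array
import random
import sys

count = 100000  # small stand-in for the 50 billion in the question

as_list = [random.randint(1, 899999) for _ in range(count)]

# Typecode "I" is a C unsigned int: typically 4 bytes, enough for 899999.
as_array = array.array("I", as_list)

# Rough estimate for the list: the pointer table plus every int object.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(v) for v in as_list)
# The array stores raw machine values, so its own size is the whole cost.
array_bytes = sys.getsizeof(as_array)

print(as_array.itemsize)
print(list_bytes, array_bytes)
```

Even at 4 bytes per element, 50 billion values would need about 200 GB, so compact storage reduces the problem but does not make it fit in ordinary RAM.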