Python MemoryError on large array
Solution 1
The problem is the size of the list you are generating (50 billion elements, not 5 billion).
An int object instance takes 24 bytes (sys.getsizeof(int(899999)), the upper limit of your random numbers), so that list would take 50,000,000,000 * 24 bytes = 1.2 * 10^12 bytes, which is about 1.09 TiB.
In other words, to create such a list you would need at least 1118 GiB of RAM in your computer.
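As a rough check of that arithmetic, the estimate can be reproduced in a few lines. This is only a sketch: the per-object size is whatever the running interpreter reports (24 bytes on 64-bit Python 2, 28 bytes reported in the comments for 64-bit Python 3), and the helper name estimate_int_bytes is made up for illustration.

```python
import sys

def estimate_int_bytes(count, sample=899999):
    """Rough estimate for `count` separate int objects the size of `sample`.

    Hypothetical helper: it ignores the list's own pointer array and any
    sharing of identical int objects.
    """
    return count * sys.getsizeof(sample)

n = 50_000_000_000  # 50 billion, as in the question
total = estimate_int_bytes(n)

# The exact figure depends on the interpreter, so no expected output is given.
print(f"{total:,} bytes = {total / 2**30:,.0f} GiB")
```

Nothing is allocated here; the code only multiplies, so it is safe to run on any machine.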
I don't know what your use case is, but you should consider a different approach to what you are trying to solve (maybe define a generator, or just don't store your numbers in memory and instead directly use the numbers in the for loop).
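A minimal sketch of the generator idea, assuming the numbers only need to be consumed once and in order. The function name random_numbers and the running-sum loop are placeholders for whatever the real processing is.

```python
import random

def random_numbers(count, low=1, high=899999):
    """Yield `count` random integers one at a time instead of storing them."""
    for _ in range(count):
        yield random.randint(low, high)

# Small count for demonstration. Memory use stays flat for larger counts
# because only one number exists at a time.
total = 0
for value in random_numbers(1_000):
    total += value  # placeholder for the real per-number work
print(total)
```

With 50 billion iterations the loop would still take a long time to run, but it would no longer fail with MemoryError.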
Solution 2
Since other people have already answered your question, here's a quick tip for dealing with big numbers: you can use "_" to separate the digits of your numbers as you wish:
n = 50_000_000_000
is the same as
n = 50000000000
but the former is much easier on the eyes.
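A quick way to confirm this, assuming Python 3.6 or later (where underscores in numeric literals were introduced). The same grouping is available when printing, through the "_" and "," format specifiers.

```python
n = 50_000_000_000

# The underscores are ignored by the parser; both literals are the same int.
assert n == 50000000000

print(f"{n:_}")  # 50_000_000_000
print(f"{n:,}")  # 50,000,000,000
```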
Adit Srivastava
Updated on August 16, 2022
Comments
-
Adit Srivastava over 1 year
This is the python script that I'm trying to run:
n = 50000000000  # 50 billion
b = [0]*n
for x in range(0,n):
    b[x] = random.randint(1,899999)
... But the output I'm getting is:
E:\python\> python sort.py
Traceback (most recent call last):
  File "E:\python\sort.py", line 8, in <module>
    b = [0]*n
MemoryError
So, what do I do now?
-
Torxed almost 7 years
You're out of memory. The error message says so.
-
litelite almost 7 years
You are trying to allocate a 40GB array.
-
chepner almost 7 years
An int takes at least 24 bytes; you have 5,000,000,000 ints. That's 111GB right there.
-
alexis almost 7 years
Maybe it would be more useful if you explained what you were planning to do with so many random numbers. Almost certainly there's a way to do it without blowing out your memory.
-
litelite almost 7 years
@chepner ints in Python seem to be 64 bits, so 8 bytes. 24 bytes would be one heck of a number.
-
chepner almost 7 years
@litelite No, an int is a Python object, representing an integer with arbitrary precision, not a machine word.
-
alexis almost 7 years
That's right, Python has overhead in order to keep track of types etc. You want an 8-byte int, use C.
-
litelite almost 7 years
@chepner Right, I'm not a pro in Python; I keep forgetting about that.
-
user2357112 almost 7 years
Also you counted your zeros wrong, so n is 50 billion, not 5 billion.
-
alexis almost 7 years
Also, ints in Python are not even 64 bits; Python has unlimited-length integers, welcome to the modern world.
-
juanpa.arrivillaga almost 7 years
@alexis or numpy, or array.array
-
alexis almost 7 years
You mean to limit the memory footprint? Good point. Though the struct option isn't actually an int, it's just cargo...
-
chepner almost 7 years
Even if you have a terabyte of free memory, there's no need to initialize the list with zeros first; just use b = [random.randint(1,899999) for _ in range(n)].
-
juanpa.arrivillaga almost 7 years
@chepner well, in CPython at least, initializing with all zeros will be using the cached 0 object.
-
Admin almost 7 years
If you're not on Python 3, use xrange, and listen to the comments above: what is the purpose of initializing a list with zeros?
-
juanpa.arrivillaga almost 7 years
That's actually a lower limit generally. The smallest int takes that much, but int objects take variable amounts of memory, e.g. sys.getsizeof(1000000000000000000) == 32, whereas sys.getsizeof(1000) == 28.
-
Tenchi2xh almost 7 years
@juanpa.arrivillaga In the author's question, the maximum attainable number is 899999, and int objects from 0 up to that number are all 24 bytes in size. (I updated the answer to show how this upper limit is calculated.)
-
juanpa.arrivillaga almost 7 years
I'm getting 28 on my system... 64-bit Python 3.5, Mac OS X
-
Tenchi2xh almost 7 years
Python 3 apparently has 4 more bytes of overhead compared to Python 2.
-
chepner almost 7 years
@juanpa.arrivillaga For zero, you don't need to store any data. For anything with a larger magnitude, you need to store something.
-
chepner almost 7 years
Also, just to pile on, we've only been talking about the memory needed for the int objects themselves. The list is another 64 bytes plus 50,000,000,000 pointers (4-8 bytes, depending on your architecture) to reference each int object.
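The numpy / array.array suggestion from the comments can be illustrated with the standard-library array module, which stores raw machine values instead of one Python object plus one pointer per number. This is a sketch under assumptions: the demo count is tiny, type code "L" (unsigned long, at least 4 bytes per item) is just one choice that fits values up to 899999, and the list figure is approximate because sys.getsizeof does not account for int objects that happen to be shared.

```python
import random
import sys
from array import array

count = 100_000  # small demo size; the question used 50 billion

# A list holds a pointer per element, and each element is a full int object.
as_list = [random.randint(1, 899999) for _ in range(count)]
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(v) for v in as_list)

# An array of unsigned longs stores the raw values, with no per-element object.
as_array = array("L", as_list)
array_bytes = sys.getsizeof(as_array)

# Exact numbers vary by platform and Python version, so none are listed here.
print("list :", list_bytes, "bytes (approximate)")
print("array:", array_bytes, "bytes,", as_array.itemsize, "bytes per item")
```

Even at 4 bytes per value, 50 billion values would still need about 200 GB, so a compact array shrinks the footprint but does not by itself make the original list fit in ordinary RAM.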