What's the maximum size of a numpy array?


Solution 1

You're trying to create an array with about 2.7 billion entries. If you're running 64-bit numpy, at 8 bytes per entry that would be roughly 21.7 GB (about 20 GiB) in all.

So almost certainly you just ran out of memory on your machine. There is no general maximum array size in numpy.
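
You can estimate the requirement up front from the dtype's item size, without allocating anything. A minimal sketch (the element count is the one from the question, and int64 is assumed as arange's default on a 64-bit platform):

import numpy as np

n = 2708000000                           # element count from the question
itemsize = np.dtype(np.int64).itemsize   # 8 bytes per entry
print(n * itemsize / 1e9, "GB needed")   # ~21.7 GB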

Solution 2

A ValueError indicates that the requested size is too big to allocate at all, not merely that there is not enough memory. On my laptop, using 64-bit Python, I can allocate the array if I reduce the number of bits per element:

In [16]: a=np.arange(2708000000)
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-16-aaa1699e97c5> in <module>()
----> 1 a=np.arange(2708000000)

MemoryError: 

# Note I don't get a ValueError

In [17]: a = np.arange(2708000000, dtype=np.int8)

In [18]: a.nbytes
Out[18]: 2708000000

In [19]: a.nbytes * 1e-6
Out[19]: 2708.0

In your case, arange uses int64, i.e. 8 bytes per element, which is 8 times more, or around 21.7 GB. A 32-bit process can only address around 4 GB of memory.

The underlying reason is the size of the pointers used to index the data, and how many values you can represent with that many bits:

In [26]: np.iinfo(np.int32)
Out[26]: iinfo(min=-2147483648, max=2147483647, dtype=int32)
In [27]: np.iinfo(np.int64)
Out[27]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)
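
If you want to see the hard limit on your own platform, np.intp is numpy's pointer-sized index type, so its maximum is the largest element count a single array can address (a quick check, not part of the original session):

import numpy as np

# np.intp is pointer-sized: 32 bits on a 32-bit build, 64 bits on a 64-bit one
print(np.iinfo(np.intp).max)  # 2147483647 or 9223372036854775807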

Note that I can replicate your ValueError if I try to create an absurdly large array:

In [29]: a = np.arange(1e350)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-230a6916f777> in <module>()
----> 1 a = np.arange(1e350)

ValueError: Maximum allowed size exceeded

If your machine has a lot of memory, as you said, it will be a 64-bit system, so you should install a 64-bit build of Python to be able to use it all. On the other hand, for datasets this big you should consider the possibility of out-of-core computation, where the data stays on disk and is processed in chunks.
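
One cheap way to go out of core with plain numpy is np.memmap, which backs the array with a file so that only the pages you touch occupy RAM. A minimal sketch (the filename and shape here are just illustrative):

import numpy as np

# "big.dat" is a hypothetical path; mode="w+" creates/overwrites the file
a = np.memmap("big.dat", dtype=np.int64, mode="w+", shape=(2708000000,))
a[:1000] = np.arange(1000)  # only the touched pages are loaded into memory
a.flush()                   # write changes back to disk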

Solution 3

I was able to create an array with 6 billion elements that ate up 45 GB of memory. By default, numpy created the array with a dtype of float64. By dropping the precision, I was able to save a lot of memory.

np.arange(6000000000, dtype=np.dtype('f8'))  # float64, 8 bytes per element
np.arange(6000000000, dtype=np.dtype('f4'))  # float32, 4 bytes per element
# etc...

Memory used for 6 billion elements (the default dtype is float64):

  • np.float64 -- 45.7 GB

  • np.float32 -- 22.9 GB

  • np.int8 -- 5.7 GB
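
Those figures follow directly from each dtype's item size, so you can predict them without allocating anything (a small sketch; the totals come out slightly below the measured numbers above because of process overhead and GB/GiB rounding):

import numpy as np

n = 6000000000
for dt in (np.float64, np.float32, np.int8):
    print(np.dtype(dt).name, n * np.dtype(dt).itemsize / 2**30, "GiB")
# float64 ~44.7, float32 ~22.4, int8 ~5.6 -- close to the figures above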

Obviously an 8-bit integer can't store a value of 6 billion (the values simply wrap around), but I'm sure a maximum size exists at some point; I just suspect it's FAR past anything possible in 2016. Interestingly, "Python Blaze" allows you to create numpy-like arrays on disk. I recall playing with it some time ago and creating an extremely large array that took up 1 TB of disk.
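
A minimal sketch of the same idea using dask.array, which the Blaze docs point to (see the comments below); the shape and chunk size are illustrative, not from the original experiment:

import dask.array as da

# 100,000 x 100,000 float64s would be ~80 GB dense; dask splits the array
# into 10,000 x 10,000 chunks and evaluates them lazily, one at a time
x = da.ones((100000, 100000), chunks=(10000, 10000))
print(x.sum().compute())  # reduces chunk by chunk without materializing x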

Comments

  • branwen85
    branwen85 almost 4 years

    I'm trying to create a matrix containing 2,708,000,000 elements. When I try to create a numpy array of this size it gives me a ValueError. Is there any way I can increase the maximum array size?

    a=np.arange(2708000000)

    ValueError                                Traceback (most recent call last)

    ValueError: Maximum allowed size exceeded

  • seberg
    seberg over 11 years
    Of course there is: it is the maximum of the np.intp datatype, which for 32-bit versions may only be 32 bits... Of course that will coincide, for almost all practical purposes, with running out of memory...
  • branwen85
    branwen85 over 11 years
    This example was indeed caused by memory issue. However, I run my analysis on a very large memory machine and I keep getting errors... I will create a new question with the actual work problem.
  • Davidmh
    Davidmh almost 10 years
    No one is talking about strings or lists. Numpy arrays are C objects, with a fixed Python overhead of 80 bytes (on my machine).
  • Davidmh
    Davidmh almost 10 years
    If you try to create an array of 1e30 elements, it will raise an error before even trying to allocate the memory. If you try to allocate many smaller arrays until there is no space left, then you would get a MemoryError.
  • KarlFG
    KarlFG almost 10 years
    Yes, but it is also a problem of the size limitation: arange generates an array, and the asker's problem is that the element count exceeds the maximum allowed size.
  • Abdalla Essam Ali
    Abdalla Essam Ali over 5 years
    can you provide an example of creating such large arrays using Python Blaze?
  • J'e
    J'e over 5 years
    @AbdallaEssamAli it's been a long time since I messed with this. On the blaze docs (blaze.readthedocs.io/en/latest/index.html) there is a reference to "dask.array" (dask.org) which has some example code on the main page. 2 lines of code including the import will get you an example to start with.
  • Giuppox
    Giuppox over 2 years
    I'd just like to point out that since, as @seberg said, the maximum size is np.intp, it coincides with the size of C's ssize_t on your machine.