Read a big image file as an array in Python

10,285

Solution 1

How much RAM do you have? You'll need quite a bit more than 2 GB of RAM to store a 2 GB image. I don't know how efficiently Image stores images, but a Python list needs a pointer for each element (four bytes on a 32-bit build, eight on a 64-bit one), so a list of bytes will burn more than 8 GB of (virtual) memory... and a lot of patience. Edit: Since you only have 4 (or 3) GB to play with, this is almost certainly your problem.
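To make the per-element overhead concrete, here is a minimal sketch that compares a list of byte values with a compact bytearray holding the same values. It assumes CPython; the helper name list_overhead_bytes and the sample size n are made up for illustration, and the exact numbers depend on your Python build.

```python
import struct
import sys


def list_overhead_bytes(n_values):
    """Lower bound on list memory: one pointer per element.

    The list header and the stored objects themselves come on top.
    """
    pointer_size = struct.calcsize("P")  # 4 on a 32-bit build, 8 on a 64-bit build
    return n_values * pointer_size


n = 1000000                    # illustrative sample size, not the real image
as_bytes = bytearray(n)        # compact storage: one byte per value
as_list = list(as_bytes)       # list storage: one pointer per value

print("bytearray:", sys.getsizeof(as_bytes), "bytes")
print("list:     ", sys.getsizeof(as_list), "bytes")

# Scaling the same lower bound up to 2 GiB worth of one-byte values.
print("list slots for 2 GiB of values:", list_overhead_bytes(2 * 1024**3), "bytes")
```

The list comes out several times larger than the bytearray, which is why a compact container (a bytearray, the array module, or a numpy array with a small dtype) matters so much at this scale.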

But why are you trying to convert it to a numeric array? Use the methods of the im object returned by Image.open, as in the PIL Tutorial.

I don't know what you're doing with the image, but perhaps you can do it without reading the entire image into memory, or at least without converting the entire object into a numpy array. Read it bit by bit if possible to avoid blowing up your machine: read up on Python generators, and see the Image.getdata() method, which returns a sequence-like object that you can iterate over one pixel value at a time.
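As a rough illustration of the "bit by bit" idea, the sketch below uses a generator to walk through pixel data a few rows at a time, so only one block is in memory at once. It assumes raw, uncompressed, row-major one-byte pixels with no header, which is not what a TIFF file looks like on disk; the function name iter_row_chunks and the tiny in-memory "file" are hypothetical stand-ins.

```python
import io


def iter_row_chunks(fileobj, row_bytes, rows_per_chunk):
    """Yield successive blocks of whole rows from a raw pixel stream."""
    chunk_size = row_bytes * rows_per_chunk
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:          # empty read means end of stream
            break
        yield chunk


# Hypothetical stand-in for a huge file: 10 rows of 4 one-byte pixels.
width, height = 4, 10
fake_file = io.BytesIO(bytes(range(width * height)))

running_max = 0
chunks_seen = 0
for chunk in iter_row_chunks(fake_file, row_bytes=width, rows_per_chunk=3):
    # Only this chunk is held in memory while it is being processed.
    running_max = max(running_max, max(chunk))
    chunks_seen += 1

print(chunks_seen, running_max)
```

The same loop shape works for any per-block computation (min, max, histogram, sum) that can be updated incrementally instead of needing the whole image at once.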

Solution 2

I don't know about using numpy with the PIL, but here's how I read the pixel data into an array (this uses a .jpg image). In the past this has worked well, but I don't think I've tried this with huge images, so your problem may boil down to memory issues.

from PIL import Image    # with the original PIL: import Image
im = Image.open('pic.jpg')

pix_ar = im.load()       # pixel access object, indexed as pix_ar[x, y]
red_pixel = 255, 0, 0    # a red RGB pixel

and access individual elements like this:

x = 10
y = 5
print(pix_ar[x, y])
(255, 255, 255)

or assign values

pix_ar[x, y] = red_pixel

Re memory: a 2 GB image may end up taking much more than 2 GB of RAM once it's "unpacked" into individual pixel values; how much more depends on how efficient the data structures are that hold the data once you read it in. 4 GB of RAM is unlikely to be sufficient, considering that the OS and other apps are running concurrently while you try to read this big file into memory.

Also, if you have successfully converted images opened with PIL to numpy arrays in the past, the above code may not be helpful.
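On the memory point, a back-of-the-envelope estimate of the unpacked size can be made before loading anything, since it only needs the image dimensions, the number of bands, and the bytes per sample. The sketch below is plain arithmetic; the function name unpacked_size_bytes and the dimensions used are made-up illustrations, not the asker's actual image.

```python
def unpacked_size_bytes(width, height, bands, bytes_per_sample=1):
    """Bytes needed to hold every sample of the image uncompressed."""
    return width * height * bands * bytes_per_sample


# Hypothetical example: a 40000 x 30000 pixel, 3-band, 8-bit image.
size = unpacked_size_bytes(40000, 30000, 3)
print("%.2f GiB uncompressed" % (size / 1024.0**3))

# The same image with 16-bit samples needs twice as much.
size_16bit = unpacked_size_bytes(40000, 30000, 3, bytes_per_sample=2)
print("%.2f GiB uncompressed" % (size_16bit / 1024.0**3))
```

If the file on disk is compressed, this estimate can be well above the file size, and it is only a floor: any intermediate copies made during conversion come on top.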

Author: Vicky Liau

Updated on June 04, 2022

Comments

  • Vicky Liau, almost 2 years ago

    Does anyone know how to open a large imagery file using Python? I tried to open an imagery file (about 2 GB) from the Windows command prompt using IPython, but it crashes every time I convert the image values into an array.

    My laptop runs 64-bit Windows 7 with 4 GB of RAM and an Intel(R) Core(TM) i7-2860QM CPU.

    The error message is: "python.exe has stopped working. A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available."

    Here is my code.

    import Image
    import numpy as num
    im=Image.open('myimage.tif')
    imarray=num.array(im)