What is in Keras img_to_array? (Compared to a Bitmap array in C#)

img_to_array is well explained in the docstring of the Keras implementation:

def img_to_array(img, data_format='channels_last', dtype='float32'):
    """Converts a PIL Image instance to a Numpy array.
    # Arguments
        img: PIL Image instance.
        data_format: Image data format,
            either "channels_first" or "channels_last".
        dtype: Dtype to use for the returned array.
    # Returns
        A 3D Numpy array.
    # Raises
        ValueError: if invalid `img` or `data_format` is passed.
    """

So it takes a PIL Image instance and turns it into a NumPy array, with dtype float32 by default. If you start from a typical 8-bit PNG image, the pixel values lie between 0 and 255 and are usually stored as 8-bit unsigned integers; img_to_array casts them to float but does not rescale them. In your code example the array is then divided by 255, which is why you end up with floats between 0 and 1.
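The cast and the rescaling can be reproduced with NumPy alone. The following is a minimal sketch, not the Keras implementation: the 1x2 pixel array is made up for illustration (its first pixel is chosen so that it scales to the first values shown in the question), and the fully opaque alpha of 255 is an assumption used only to mimic the (Alpha, R, G, B) layout mentioned for the C# Bitmap.

```python
import numpy as np

# Hypothetical 1x2 RGB image stored as 8-bit unsigned integers (0..255),
# standing in for the pixel data PIL holds after loading a PNG.
pixels_uint8 = np.array([[[42, 77, 19], [255, 0, 128]]], dtype=np.uint8)

# The effect of img_to_array on the values: same numbers, cast to float32.
as_float = pixels_uint8.astype("float32")

# The division from the question rescales the range 0..255 to 0..1.
scaled = as_float / 255

# Multiplying back by 255 and rounding recovers the original integers.
restored = np.rint(scaled * 255).astype(np.uint8)

# Build an (A, R, G, B) tuple per pixel, assuming a fully opaque alpha of 255.
alpha = np.full(restored.shape[:2] + (1,), 255, dtype=np.uint8)
argb = np.concatenate([alpha, restored], axis=-1)

print(as_float.dtype, as_float.shape)  # float32, (height, width, channels)
print(scaled[0, 0])                    # first pixel as fractions of 255
print(argb[0, 0])                      # first pixel as integer (A, R, G, B)
```

Under these assumptions the two representations carry the same color information: the fractions are the integer channel values divided by 255, and the C#-style tuple only adds an alpha channel in front.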

Author: PCG

Updated on June 14, 2022

Comments

  • PCG, about 2 years

    I am trying to understand what is in the array returned by keras.preprocessing.image.img_to_array.

    https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/keras/_impl/keras/preprocessing/image.py

    When I look at the contents of the array, they are as follows (all elements are floats):

    image1 = img_to_array(image.load_img(ImagePath, target_size=(128,128))) / 255
    
    [0.16470588743686676, 0.3019607961177826, 0.07450980693101883], [0.1we23423423486676, 0.3023423423423423, 0.01353463453458483] ......
    

    They seem to be the RGB channels of the image, but why are they fractions? If I look at the Bitmap in C#, however, the values are integers such as (Alpha, R, G, B):

    [100,123,024,132],[021,055,243,015].... 
    

    Could someone explain the difference between the image array generated by img_to_array and the Bitmap array format in C#?

    Thanks, PCG

  • PCG, over 5 years
    Great, I wish I had seen this before!