How to convert an RGB image (3 channels) to grayscale (1 channel) and save it?

python python-3.x image rgb grayscale

14,671

Solution 1

Your first code block:

import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')

This saves the image as RGB, because cmap='gray' is ignored when the array passed to imsave already holds RGB(A) data (see the pyplot docs for imsave).

You can convert your data to grayscale by collapsing the three bands into one, either with color.rgb2gray as you have (which takes a luminance-weighted sum of the channels) or with a plain average in numpy, which is what I tend to use:

import numpy as np
from matplotlib import pyplot as plt
import cv2

img_rgb = np.random.rand(196,256,3)
print('RGB image shape:', img_rgb.shape)

img_gray = np.mean(img_rgb, axis=2)
print('Grayscale image shape:', img_gray.shape)

Output:

RGB image shape: (196, 256, 3)
Grayscale image shape: (196, 256)
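
A plain mean weights the three channels equally. If you want something closer to perceived brightness, a weighted sum is a common alternative. The sketch below assumes the Rec. 601 luma weights (0.299, 0.587, 0.114), which are the ones OpenCV uses for its RGB-to-gray conversion; to_gray_weighted is a hypothetical helper name, not a library function.

```python
import numpy as np

# Rec. 601 luma weights for R, G, B (an assumption; other standards use other weights)
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_gray_weighted(img_rgb):
    """Collapse an (H, W, 3) RGB array to (H, W) with a weighted channel sum."""
    return img_rgb[..., :3] @ LUMA_WEIGHTS

img_rgb = np.random.rand(196, 256, 3)
img_gray_weighted = to_gray_weighted(img_rgb)
print('Weighted grayscale shape:', img_gray_weighted.shape)
```

Because the weights sum to 1, the output stays in the same value range as the input.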

img_gray now has the correct shape. However, if you save it using plt.imsave, the file still contains color bands: imsave maps a 2D array through a colormap and writes the result as RGBA, so with cmap='gray' every pixel has R == G == B. This is not a PNG limitation (the format supports single-channel grayscale). On top of that, cv2.imread returns three channels by default, whatever the file contains.

plt.imsave('image_gray.png', img_gray, format='png', cmap='gray')
new_img = cv2.imread('image_gray.png')
print('Loaded image shape:', new_img.shape)

Output:

Loaded image shape: (196, 256, 3)

One way to avoid this is to save the images as numpy files, or indeed to save a batch of images as numpy files:

np.save('np_image.npy', img_gray)
new_np = np.load('np_image.npy')
print('new_np shape:', new_np.shape)

Output:

new_np shape: (196, 256)
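
To save a batch rather than a single image, the grayscale arrays can be stacked into one array first. A minimal sketch, assuming all images share the same height and width; the file name and the batch size of eight are made up for the example.

```python
import os
import tempfile

import numpy as np

# Eight fake RGB images standing in for real data
batch_rgb = [np.random.rand(196, 256, 3) for _ in range(8)]

# Convert each image to grayscale, then stack along a new leading axis
batch_gray = np.stack([np.mean(img, axis=2) for img in batch_rgb])

batch_path = os.path.join(tempfile.mkdtemp(), 'np_batch.npy')
np.save(batch_path, batch_gray)

loaded_batch = np.load(batch_path)
print('Batch shape:', loaded_batch.shape)
```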

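The other thing you could do is save the grayscale PNG (using imsave) and then ask OpenCV to load it as a single channel. Passing 0 (cv2.IMREAD_GRAYSCALE) as the second argument makes imread convert the image to grayscale on load, rather than picking out one band: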

finalimg = cv2.imread('image_gray.png',0)
print('finalimg image shape:', finalimg.shape)

Output:

finalimg image shape: (196, 256)

Solution 2

As it turns out, Keras, the deep-learning library I'm using, has its own way of converting images to a single color channel (grayscale) in its image pre-processing step.

When using the ImageDataGenerator class, the flow_from_directory method takes a color_mode argument. Setting color_mode="grayscale" will automatically load each PNG as a single color channel.

https://keras.io/preprocessing/image/#imagedatagenerator-methods

Hope this helps someone in the future.

Author by

J. Devez

Updated on June 18, 2022

Comments

J. Devez almost 2 years
I am working on a deep learning project and have a lot of images that don't need color. I saved them with:
```
import matplotlib.pyplot as plt

plt.imsave('image.png', image, format='png', cmap='gray')
```
However, when I later checked the shape of the image, the result was:
```
import cv2
img_rgb = cv2.imread('image.png')
print(img_rgb.shape)
# (196, 256, 3)
```
So even though the image I view is in grayscale, I still have 3 color channels. I realized I had to do some algebraic operations in order to convert those 3 channels into a single channel.

I have tried the methods described on the thread "How can I convert an RGB image into grayscale in Python?" but I'm confused.

For example, when I do the conversion using:
```
from skimage import color
from skimage import io
img_gray = color.rgb2gray(io.imread('image.png'))
plt.imsave('image_gray.png', img_gray, format='png')
```
However, when I load the new image and check its shape, it still has three channels:
```
img_gr = cv2.imread('image_gray.png')
print(img_gr.shape)
# (196, 256, 3)
```
I tried the other methods on that thread but the results are the same. My goal is to have images with a (196,256,1) shape, given how much less computationally intensive it will be for a Convolutional Neural Network.

Any help would be appreciated.
Håken Lid over 5 years

PNG does not require three bands. RGBA is very common, and grayscale or grayscale with alpha is also possible.
jmsinusa over 5 years

@HåkenLid I'd appreciate an example of how to save just one band to a PNG.
Håken Lid over 5 years

The official PNG spec supports it. w3.org/TR/2003/REC-PNG-20031110 "Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel. Sample depths range from 1 to 16 bits.". But it's entirely possible that there are graphics libraries that don't fully support all parts of the PNG standard.
J. Devez over 5 years

Thank you, this is very useful! I think a good course of action would then be:
1 - load the images as they are (3 color channels)
2 - convert the data to 1 color channel
3 - save it as a numpy file
4 - feed that into the neural network
The only problem I can anticipate is that I think the expected input will be (196,256,1), while the one I will have is (196,256). Are these equivalent?
jmsinusa over 5 years

You can use numpy.expand_dims(array, 2) on your input to add the extra expected dimension.
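
The two shapes are not interchangeable for a layer that expects a channel axis, but converting between them is cheap. A short sketch of the expand_dims call on a dummy array:

```python
import numpy as np

img_gray = np.random.rand(196, 256)

# Add a trailing channel axis: (196, 256) -> (196, 256, 1)
img_1ch = np.expand_dims(img_gray, 2)
print('With channel axis:', img_1ch.shape)

# Indexing with np.newaxis is an equivalent spelling
same = img_gray[..., np.newaxis]
print('Same result:', np.array_equal(img_1ch, same))
```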
Shilan over 3 years

The problem is I am not reading from disk! I am taking the image from a camera and I want to convert it to grayscale immediately, without saving it to disk!