Convert image to array for CNN

For a CNN, your input must be a 4-D tensor [batch_size, width, height, channels], so each image is a 3-D sub-tensor. Since your images are grayscale, channels=1. For training, all images must also have the same size, WIDTH by HEIGHT.
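To make the shape concrete, here is a minimal NumPy sketch that uses random arrays in place of real images. The sizes (HEIGHT = 4, WIDTH = 6, three images) are made up for illustration. Note that NumPy and skimage store an image rows first, so a grayscale image has shape (HEIGHT, WIDTH); for square images the order of the two makes no difference.

```python
import numpy as np

# Hypothetical sizes, chosen only for this illustration
HEIGHT, WIDTH, N_IMAGES = 4, 6, 3

# Stand-ins for grayscale images: 2-D float arrays, rows first
rng = np.random.default_rng(0)
images = [rng.random((HEIGHT, WIDTH)) for _ in range(N_IMAGES)]

# Add a trailing channel axis: (HEIGHT, WIDTH) -> (HEIGHT, WIDTH, 1)
images_3d = [img[..., np.newaxis] for img in images]

# Stack along a new first axis: one image per index, not one image per column
batch = np.stack(images_3d)

print(batch.shape)     # (3, 4, 6, 1)
print(batch[0].shape)  # (4, 6, 1)
```

Each image is one entry along the first axis, so batch[i] is the i-th image.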

skimage.io.imread returns a NumPy ndarray, which Keras accepts directly. So you can read the data like this:

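import os
import numpy as np
from skimage import io

all_images = []
for image_name in os.listdir(path):
  # os.listdir returns bare file names, so join each one with the folder
  # (as_gray replaces the older as_grey spelling in current scikit-image)
  img = io.imread(os.path.join(path, image_name), as_gray=True)
  img = img.reshape([WIDTH, HEIGHT, 1])
  all_images.append(img)
x_train = np.array(all_images)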
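np.array only produces a clean 4-D array if every image has the same shape. A small guard like the one below can catch a stray image early. This is a sketch: stack_images is a hypothetical helper name, not part of NumPy or Keras, and the dummy arrays stand in for loaded images.

```python
import numpy as np

def stack_images(images):
    """Stack same-sized 2-D grayscale arrays into shape (N, rows, cols, 1)."""
    if not images:
        raise ValueError("no images to stack")
    expected = images[0].shape
    for i, img in enumerate(images):
        if img.ndim != 2:
            raise ValueError(f"image {i} is not 2-D: shape {img.shape}")
        if img.shape != expected:
            raise ValueError(f"image {i} has shape {img.shape}, expected {expected}")
    # np.stack builds (N, rows, cols); the new last axis is the single channel
    return np.stack(images)[..., np.newaxis]

# Usage with dummy data standing in for loaded images
x = stack_images([np.zeros((8, 8)), np.ones((8, 8))])
print(x.shape)  # (2, 8, 8, 1)
```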

I'm not sure how you store the labels, but you will also need an array of labels, which I call y_train. If it holds integer class indices from 0 to num_classes - 1, you can convert it to one-hot like this:

y_train = keras.utils.to_categorical(y_train, num_classes)
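For intuition, one-hot encoding turns each integer class index into a row that is all zeros except for a 1 at that index. The NumPy sketch below shows the idea on made-up breed names, first mapping the names to integer indices. It illustrates the result only and is not Keras's implementation; the breed list is hypothetical.

```python
import numpy as np

# Hypothetical string labels, one per image
breeds = ["beagle", "pug", "collie", "pug"]

# np.unique sorts the distinct names and returns each sample's class index
classes, labels = np.unique(breeds, return_inverse=True)
num_classes = len(classes)

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(classes)  # ['beagle' 'collie' 'pug']
print(labels)   # [0 2 1 2]
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
```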

The model in Keras is pretty straightforward. Here's the simplest one (it uses ReLU and cross-entropy):

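import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=[WIDTH, HEIGHT, 1]))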
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=100, epochs=10, verbose=1)
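It can help to trace how the image shrinks through this model. With Keras's default 'valid' padding and stride 1, each 3x3 convolution trims 2 pixels from each spatial dimension, and 2x2 max pooling halves them (rounding down). The sketch below does that arithmetic in plain Python for a hypothetical 28x28 input, the MNIST image size; it does not call Keras, and the function names are made up for illustration.

```python
def conv_valid(size, kernel=3):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool(size, window=2):
    """Output length of non-overlapping pooling with 'valid' padding."""
    return size // window

def flattened_size(width, height, last_filters=64):
    """Number of features entering the first Dense layer of the model above."""
    for step in (conv_valid, conv_valid, pool):
        width, height = step(width), step(height)
    return width * height * last_filters

# 28 -> 26 -> 24 -> 12 per side, then 12 * 12 * 64 features
print(flattened_size(28, 28))  # 9216
```

Larger inputs grow this number quickly, which is one reason to rescale the images before training.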

A complete working MNIST example can be found here.

Author: Marios Nikolaou

Updated on June 15, 2022

Comments

  • Marios Nikolaou, almost 2 years ago

    I am trying to build a dog breed classifier using a CNN. I have converted the images to grayscale and rescaled them to make them smaller. Now I want to load them into a NumPy array and train the model. I plan to use the ReLU activation function, because it performs well in multi-layer networks, and categorical cross-entropy as the loss over the different dog breed categories.

    Below is the code that converts the images to grayscale and rescales them:

    import os
    from skimage import color, io, img_as_ubyte
    from skimage.transform import rescale

    def RescaleGrayscaleImg():

        # iterate through the names of contents of the folder
        for image_path in os.listdir(path):

            # create the full input path and read the file
            input_path = os.path.join(path, image_path)

            # make image grayscale first, so rescale only sees a 2-D array
            img = io.imread(input_path)
            GrayImg = color.rgb2gray(img)
            img_scaled = rescale(GrayImg, 2.0 / 4.0)

            # create full output path, 'example.jpg'
            # becomes 'grayscaled_example.jpg', save the file to disk
            # (scipy's misc.imsave was removed; io.imsave wants 8-bit data here)
            fullpath = os.path.join(outPath, 'grayscaled_'+image_path)
            io.imsave(fullpath, img_as_ubyte(img_scaled))
    

    How do I convert the images to an array? Will each column be an image?