Keras Conv2D: filters vs kernel_size


Each convolutional layer consists of several convolution filters (also referred to as channels or depth). In practice, filters is a number such as 64, 128, 256 or 512, and it equals the number of channels in the output of the layer. kernel_size, on the other hand, is the spatial size of each of these convolution filters. In practice it takes values such as 3x3, 1x1 or 5x5; since the filters are almost always square, it can be abbreviated to 3, 1 or 5.
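As a quick illustration, here is a minimal sketch (assuming the TensorFlow/Keras API; the input size and layer arguments are arbitrary) showing how the two arguments affect the output shape:

```python
# Minimal sketch (TensorFlow/Keras assumed): filters sets the number of output
# channels, kernel_size sets the spatial size of each filter.
import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 3))   # H x W x C, e.g. an RGB image
x = tf.keras.layers.Conv2D(filters=64,         # -> 64 output channels
                           kernel_size=3,      # 3x3 filters, shorthand for (3, 3)
                           padding="same")(inputs)
print(x.shape)  # (None, 224, 224, 64): the last dimension equals filters
```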

Edit

The following quote should make it clearer.

From a discussion on vlfeat:

Suppose X is an input with size W x H x D x N (where N is the size of the batch) to a convolutional layer containing filter F (with size FW x FH x FD x K) in a network.

The number of feature channels D is the third dimension of the input X here (for example, this is typically 3 at the first input to the network if the input consists of colour images). The number of filters K is the fourth dimension of F. The two concepts are closely linked because if the number of filters in a layer is K, it produces an output with K feature channels. So the input to the next layer will have K feature channels.

The FW x FH above is the filter size you are looking for.
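To make the quote concrete, here is a hedged sketch (note that Keras orders the axes as N x H x W x D rather than W x H x D x N; the sizes below are made up) mapping the symbols onto a Conv2D layer:

```python
import tensorflow as tf

N, H, W, D = 8, 32, 32, 3    # batch, height, width, input channels
FH, FW, K = 5, 5, 16         # filter height, filter width, number of filters

x = tf.random.normal((N, H, W, D))
conv = tf.keras.layers.Conv2D(filters=K, kernel_size=(FH, FW), padding="same")
y = conv(x)

print(y.shape)            # (8, 32, 32, 16): the output has K feature channels
print(conv.kernel.shape)  # (5, 5, 3, 16): FH x FW x FD x K, with FD equal to D
```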

Added

You can think of each filter as being responsible for extracting some type of feature from the input. A CNN does not use hand-crafted filters; the filters are parameters that are learned during training. Each filter in a Conv2D layer is applied across all input channels and the results are combined to produce a single output channel, so the number of filters and the number of output channels are the same.
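The following sketch (again assuming TensorFlow/Keras with the default channels_last format; use_bias is disabled only to keep the arithmetic simple) shows that a single filter spans all input channels, so every output channel mixes information from R, G and B:

```python
import numpy as np
import tensorflow as tf

x = tf.random.normal((1, 8, 8, 3))                 # one 8x8 RGB image
conv = tf.keras.layers.Conv2D(filters=4, kernel_size=3,
                              padding="valid", use_bias=False)
y = conv(x)                                        # calling the layer builds its weights

w = conv.kernel.numpy()                            # (3, 3, 3, 4): kH x kW x in_channels x filters
# Reproduce output channel 0 at spatial position (0, 0) by hand:
patch = x[0, :3, :3, :].numpy()                    # a 3x3 patch across all 3 input channels
manual = np.sum(patch * w[:, :, :, 0])             # sums over height, width *and* channels
print(np.allclose(manual, y[0, 0, 0, 0].numpy(), atol=1e-5))  # True: channels are mixed
```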


Comments

  • Baron Yugovich
    Baron Yugovich almost 2 years

    What's the difference between those two? It would also help to explain them in the more general context of convolutional networks.

    Also, as a side note, what are channels? In other words, please break down the three terms for me: channels vs filters vs kernels.

  • Baron Yugovich
    Baron Yugovich almost 6 years
    So channels and filters and depth all mean the same thing?
  • Baron Yugovich
    Baron Yugovich almost 6 years
    Just to make sure, FW above is a single new variable, it's not like F*W, right?
  • Dinesh
    Dinesh almost 6 years
    Yeah, FW is the width of the filter and FH is the height of the filter.
  • Baron Yugovich
    Baron Yugovich almost 6 years
    So, in the input, the 3 channels have clear meanings: the R, G, B values. But in the hidden layers we sometimes have more than 3 channels; what do those mean? Also, can a value in a single hidden neuron incorporate values from multiple input channels (e.g. both R and B), or is there no "mixing" between the input channels through the convolution process?
  • Dinesh
    Dinesh almost 6 years
    I modified the answer above to include an interpretation of the filters (see also the sketch below).
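Regarding the hidden channels discussed in the comments, here is a hedged sketch (TensorFlow/Keras assumed; the layer names conv1 and conv2 are just illustrative): each hidden channel is the response of one learned filter, and every filter in the next layer reads all of those hidden channels, so mixing happens at every layer:

```python
import tensorflow as tf

# Stack two conv layers: the input-channel dimension of the second kernel equals
# the filters of the first layer, i.e. each later filter mixes all hidden channels.
inputs = tf.keras.Input(shape=(28, 28, 3))
h = tf.keras.layers.Conv2D(filters=16, kernel_size=3, padding="same", name="conv1")(inputs)
out = tf.keras.layers.Conv2D(filters=32, kernel_size=3, padding="same", name="conv2")(h)
model = tf.keras.Model(inputs, out)

print(model.get_layer("conv1").kernel.shape)  # (3, 3, 3, 16)
print(model.get_layer("conv2").kernel.shape)  # (3, 3, 16, 32): mixes all 16 hidden channels
```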