Keras Conv2D: filters vs kernel_size
Each convolution layer consists of several convolution filters (the count is also called the depth or the number of channels). In practice, filters is a number such as 64, 128, 256, 512, etc., and it equals the number of channels in the output of the convolutional layer. kernel_size, on the other hand, is the spatial size of these convolution filters. In practice it takes values such as 3x3, 1x1, or 5x5. Since the filters are almost always square, it can be abbreviated to a single integer such as 1, 3, or 5.
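As a minimal sketch (NumPy only, with made-up layer sizes: filters=64, kernel_size=3, an RGB input), note that Keras stores a Conv2D weight tensor with shape (kernel_h, kernel_w, in_channels, filters), so both arguments show up directly in the weight shape:

```python
import numpy as np

# Hypothetical layer configuration, mirroring Keras Conv2D arguments.
filters = 64       # number of output channels
kernel_size = 3    # 3x3 spatial window (square, so one integer suffices)
in_channels = 3    # e.g. an RGB input

# Keras stores the Conv2D kernel as (kernel_h, kernel_w, in_channels, filters).
kernel = np.zeros((kernel_size, kernel_size, in_channels, filters))

print(kernel.shape)  # (3, 3, 3, 64)
```

The last axis is the filters argument; the first two axes are the kernel_size.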
Edit
The following quote should make it clearer.

Suppose X is an input with size W x H x D x N (where N is the size of the batch) to a convolutional layer containing filters F (with size FW x FH x FD x K) in a network.

The number of feature channels D is the third dimension of the input X here (for example, this is typically 3 at the first input to the network if the input consists of colour images). The number of filters K is the fourth dimension of F. The two concepts are closely linked: if the number of filters in a layer is K, it produces an output with K feature channels, so the input to the next layer will have K feature channels.

The FW x FH above is the filter size you are looking for.
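To make the link between K and the next layer's input concrete, here is a small sketch (plain Python, hypothetical layer sizes) of how one layer's filter count K becomes the filter depth FD of the next layer:

```python
# Hypothetical two-layer stack, channels-last as in Keras.
D0 = 3    # input feature channels (e.g. an RGB image)
K1 = 32   # number of filters in layer 1
K2 = 64   # number of filters in layer 2

# Filter shapes follow the FW x FH x FD x K convention from the quote.
layer1_filters = (3, 3, D0, K1)  # FD must equal the input's channel count D
layer2_filters = (3, 3, K1, K2)  # layer 1 emits K1 channels, so FD = K1 here

# Layer 1 produces K1 feature channels, which is exactly the depth
# that layer 2's filters must have.
print(layer1_filters[3] == layer2_filters[2])  # True
```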
Added
You should be familiar with filters. You can think of each filter as being responsible for extracting some type of feature from the raw image. A CNN learns such filters during training: the filters are its trainable parameters. Each filter in a Conv2D layer is applied across all input channels, and the per-channel results are combined to produce one output channel. So the number of filters and the number of output channels are the same.
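The "combine these to get output channels" step can be sketched with a naive NumPy loop (toy sizes, "valid" padding, stride 1): each of the K filters multiplies an FH x FW window across all D input channels and sums the result into a single output value, so the input channels do mix.

```python
import numpy as np

# Toy sizes matching the quote: input W x H x D x N, filters FW x FH x FD x K.
W, H, D, N = 8, 8, 3, 2   # width, height, channels, batch size
FW, FH, K = 3, 3, 5       # filter width/height and filter count
FD = D                    # filter depth must match the input channel count

X = np.random.rand(N, H, W, D)   # channels-last layout, as in Keras
F = np.random.rand(FH, FW, FD, K)

# "Valid" convolution by explicit loops: each of the K filters spans all
# D input channels and produces exactly one output channel.
out_h, out_w = H - FH + 1, W - FW + 1
Y = np.zeros((N, out_h, out_w, K))
for n in range(N):
    for i in range(out_h):
        for j in range(out_w):
            patch = X[n, i:i + FH, j:j + FW, :]      # FH x FW x D window
            for k in range(K):
                # Elementwise product over the whole window, summed over
                # height, width, AND channels -> channels are mixed.
                Y[n, i, j, k] = np.sum(patch * F[:, :, :, k])

print(Y.shape)  # (2, 6, 6, 5): K = 5 feature channels out
```

Note the output channel count equals K, independent of the input's D.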
Baron Yugovich
Updated on June 06, 2022

Comments
- Baron Yugovich (almost 2 years): What's the difference between those two? It would also help to explain them in the more general context of convolutional networks. Also, as a side note, what are channels? In other words, please break down the three terms for me: channels vs filters vs kernel.
- Baron Yugovich (almost 6 years): So channels and filters and depth all mean the same thing?
- Baron Yugovich (almost 6 years): Just to make sure, FW above is a single new variable, it's not like F*W, right?
- Dinesh (almost 6 years): Yeah, FW is the width of the filter and FH is the height of the filter.
- Baron Yugovich (almost 6 years): So, in the input, the 3 channels have clear meanings: R, G, B values. But in the hidden layers, we sometimes have more than 3 channels; what are their meanings? Also, can a value in a single hidden neuron incorporate values from multiple input channels (e.g. both R and B), or is there no "mixing" between the input channels through the convolution process?
- Dinesh (almost 6 years): I modified the answer above to cover the interpretation of filters.