Subtract mean from image


Solution 1

You should read the paper carefully, but most probably they mean the mean of the patches: you have N matrices of 61x61 pixels, each equivalent to a vector of length 61^2 (or 3*61^2 if there are three channels). They simply compute the mean of each dimension, i.e. the mean over these N vectors with respect to each of the 3*61^2 dimensions. The result is a mean vector of length 3*61^2 (a mean matrix or "mean patch", if you prefer), which they subtract from all N patches. The resulting patches will have negative values; that is perfectly fine, and you should not take the absolute value, since neural networks prefer this kind of data.
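A minimal NumPy sketch of this patch-wise centering (the array name and the random data are my own illustrative assumptions; the patches are assumed to be stacked as an (N, 61, 61, 3) array):

    import numpy as np

    # Hypothetical stack of N RGB patches, shape (N, 61, 61, 3).
    patches = np.random.rand(1000, 61, 61, 3).astype(np.float32)

    # Mean over the N patches for each of the 3*61^2 dimensions:
    # the result is a single "mean patch" of shape (61, 61, 3).
    mean_patch = patches.mean(axis=0)

    # Center every patch around the mean patch.
    # Negative values in the result are expected and fine.
    centered_patches = patches - mean_patch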

Solution 2

I would assume the mean mentioned in the paper is the mean over all images used in the training set (computed separately for each channel).

There are several indications for this, though it is of course only indirect evidence, since I cannot explain why this helps. In fact, I stumbled over this question while trying to figure out precisely that.

//EDIT:

In the meantime I found a source confirming my claim (highlighting added by me):

There are three common forms of data preprocessing a data matrix X [...]

Mean subtraction is the most common form of preprocessing. It involves subtracting the mean across every individual feature in the data, and has the geometric interpretation of centering the cloud of data around the origin along every dimension. In numpy, this operation would be implemented as: X -= np.mean(X, axis = 0). With images specifically, for convenience it can be common to subtract a single value from all pixels (e.g. X -= np.mean(X)), or to do so separately across the three color channels.

As we can see, the whole data set is used to compute the mean.
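A short sketch contrasting the two variants the quote mentions, assuming a hypothetical training set stacked as an (N, 61, 61, 3) array (names and sizes are my own assumptions):

    import numpy as np

    # Hypothetical training set of N RGB patches, shape (N, 61, 61, 3).
    X = np.random.rand(1000, 61, 61, 3).astype(np.float32)

    # Per-feature mean over the whole data set, as in X -= np.mean(X, axis=0):
    # one mean value per pixel position and channel, shape (61, 61, 3).
    X_centered = X - X.mean(axis=0)

    # Per-channel variant: a single scalar mean per color channel, shape (3,).
    channel_means = X.mean(axis=(0, 1, 2))
    X_channel_centered = X - channel_means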




Comments

  • sakuragi
    sakuragi over 1 year

    I'm implementing a CNN with Theano. According to the paper, I have to do this image preprocessing before training the CNN:

    We extracted RGB patches of 61x61 dimensions associated with each poselet activation, subtracted the mean and used this data to train the convnet model shown in Table 1
    

    Can you tell me what they mean by "subtracted the mean"? Tell me if these steps are correct (this is what I understood):
    1) Compute the mean of the red, green and blue channels over the whole image.
    2) For each pixel, subtract the mean of the red channel from its red value, the mean of the green channel from its green value, and likewise for blue.
    3) Is it correct to have negative values, or do I have to use the absolute value?
    (These steps are sketched in code after these comments.)

    Thanks all!!

  • sakuragi
    sakuragi about 9 years
    The paper is this one: arxiv.org/pdf/1407.0717v1.pdf. I have 6 million images, and I don't think the mentioned mean is over the patches, but over each single image. Of course, your idea is also possible.
  • Zangetsu
    Zangetsu over 8 years
    Does it make sense to use the mean over all images in the training set? Shouldn't we be doing it per pixel, using each individual image's RGB values?
  • Zakum
    Zakum over 8 years
    In my edit I provided a link and a citation regarding why it makes sense to use the whole data set.
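For reference, a sketch of the per-image, per-channel centering described in the question's first comment (the image size and random data are illustrative assumptions only; whether this matches the paper's intent is exactly what the answers above debate):

    import numpy as np

    # Hypothetical single RGB image, shape (H, W, 3).
    image = np.random.rand(480, 640, 3).astype(np.float32)

    # Step 1: mean of each channel over the whole image, shape (3,).
    channel_means = image.mean(axis=(0, 1))

    # Step 2: subtract each channel's mean from every pixel in that channel.
    centered = image - channel_means

    # Step 3: negative values are expected; do not take the absolute value.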