"RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead"?
Solution 1
As Usman Ali wrote in his comment, PyTorch (like most other DL toolboxes) expects a batch of images as input, even when the batch holds a single image. Thus you need to call
output = model(data[None, ...])
which inserts a singleton "batch" dimension at the front of your input data.
Please also note that the model you are using might expect a different input size (3x299x299 for inception_v3) and not 3x224x224.
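To make the batch dimension concrete, here is a minimal sketch. The single Conv2d layer is a hypothetical stand-in for the first layer of a pre-trained model, and the random tensor is a placeholder for a real image; neither comes from the question.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a model's first convolution (assumption, not the asker's model).
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)

# A single RGB image laid out as (channels, height, width).
data = torch.rand(3, 224, 224)

# Two equivalent ways to prepend a batch dimension of size 1.
batch_a = data[None, ...]
batch_b = data.unsqueeze(0)

print(batch_a.shape)  # torch.Size([1, 3, 224, 224])

# The layer now receives (n_samples, channels, height, width).
with torch.no_grad():
    output = conv(batch_a)

print(output.shape)  # torch.Size([1, 32, 222, 222])
```

Indexing with None and calling unsqueeze(0) produce the same shape, so which one to use is a matter of taste.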
Solution 2
From the PyTorch documentation on convolutional layers, Conv2d layers expect input with the shape
(n_samples, channels, height, width) # e.g., (1000, 1, 224, 224)
Passing grayscale images in their usual format, (224, 224) per image, won't work because the channel dimension is missing.
To get the right shape, you will need to add a channel dimension. For a batch stored as (n_samples, 224, 224), you can do it as follows:
x = np.expand_dims(x, 1) # if numpy array
tensor = tensor.unsqueeze(1) # if torch tensor
The unsqueeze() method adds a dimension of size 1 at the specified index. For a batch of 1000 images, the result would have shape:
(1000, 1, 224, 224)
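As a quick check of the shapes involved, here is a small sketch. The batch size of 8 and the zero-filled arrays are placeholders chosen for illustration, not values from the question.

```python
import numpy as np
import torch

# Hypothetical batch of 8 grayscale images with no channel axis: (n_samples, height, width).
x = np.zeros((8, 224, 224), dtype=np.float32)

# NumPy: insert a channel axis at index 1.
x_with_channel = np.expand_dims(x, 1)
print(x_with_channel.shape)  # (8, 1, 224, 224)

# PyTorch: the same operation on a tensor.
tensor = torch.from_numpy(x)
tensor_with_channel = tensor.unsqueeze(1)
print(tensor_with_channel.shape)  # torch.Size([8, 1, 224, 224])

# A single grayscale image of shape (224, 224) needs both a batch and a channel axis.
single = torch.zeros(224, 224)
single_4d = single[None, None, ...]
print(single_4d.shape)  # torch.Size([1, 1, 224, 224])
```

Note that the index passed to unsqueeze depends on what is missing: index 1 adds a channel axis to a batch of grayscale images, while index 0 adds a batch axis to a single image.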
JobHunter69
Updated on December 07, 2021
Comments
-
JobHunter69 over 2 years
I am trying to use a pre-trained model. Here's where the problem occurs
Isn't the model supposed to take in a simple colored image? Why is it expecting a 4-dimensional input?
RuntimeError                              Traceback (most recent call last)
<ipython-input-51-d7abe3ef1355> in <module>()
     33
     34 # Forward pass the data through the model
---> 35 output = model(data)
     36 init_pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
     37
5 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    336                             _pair(0), self.dilation, self.groups)
    337         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338                         self.padding, self.dilation, self.groups)
    339
    340
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead
Where
inception = models.inception_v3()
model = inception.to(device)
-
chavezbosquez about 4 years
I also had to add data[None, ...].float() to make it work.
-
Shai about 4 years
@chavezbosquez you should look at .to(...) to move/cast your input tensor to the data type and device your model expects.
-
Wok over 2 years
For grayscale images, you are right. However, for an RGB image which needs to be seen as a batch of 1 image, that would be .unsqueeze(0).
-
Wok over 2 years
The conversion .to(device) was needed as the input image was loaded using another means (most likely with PIL from a WebDataSet). The value of device can be set as follows: device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu").
-
Hamza usman ghani over 2 years
Can you explain n_samples here?
-
Nicolas Gervais over 2 years
It's the number of samples in the batch, e.g. the number of images.