RuntimeError: The size of tensor a (133) must match the size of tensor b (10) at non-singleton dimension 1

32,091

Solution 1

The error message clearly suggests that the error occurred at the line

loss = criterion(outputs,target)

where you are trying to compute the mean-squared error between the input and the target. See this line: criterion = nn.MSELoss().

I think you should modify the line where you compute the loss for the (outputs, target) pair, i.e., loss = criterion(outputs,target), to something like this:

loss = criterion(outputs,target.view(1, -1))

Here, the intent is to give target the same shape as the outputs that the model produces on the line

outputs = net(data)

One more thing to note is the shape of the model output: outputs has shape batch_size X output_channels. batch_size is the first dimension of the input images. During training the loader yields batches of images, so the tensor that reaches the forward method has an additional batch dimension at dim 0: [batch_size, channels, height, width]. output_channels is the number of output features of the last linear layer in the net model.

The target labels have shape batch_size, which is 10 in your case (check the batch_size you passed to torch.utils.data.DataLoader()). Reshaping them with view(1, -1) therefore converts them to shape 1 X batch_size, i.e., 1 X 10.

That is why you get the error:

RuntimeError: input and target shapes do not match: input [10 x 133], target [1 x 10]

So, a way around this is to replace loss = criterion(outputs,target.view(1, -1)) with loss = criterion(outputs,target.view(-1, 1)) and to change the output size of the last linear layer from 133 to 1. Both outputs and target then have shape batch_size X 1, and the MSE value can be computed. Note that the network then predicts a single number per image instead of one score per class.
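The wording "must match ... at non-singleton dimension 1" in the first error message is how a broadcasting failure is reported. The sketch below re-implements the broadcasting rule in plain Python so the shapes from this thread can be checked without a model. The helper name broadcast_shape is hypothetical and is not a PyTorch function.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    result = []
    # Compare sizes from the trailing dimension backwards.
    # A dimension that is missing on one side is treated as size 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da == db or da == 1 or db == 1:
            result.append(da if db == 1 else db)
        else:
            raise ValueError(f"cannot broadcast {tuple(a)} with {tuple(b)}")
    return tuple(reversed(result))


outputs_shape = (10, 133)  # batch_size X output_channels

# Shapes of target discussed in this thread.
for target_shape in [(10,), (1, 10), (10, 1), (10, 133)]:
    try:
        print(target_shape, "->", broadcast_shape(outputs_shape, target_shape))
    except ValueError as err:
        print(target_shape, "->", err)
```

Only (10, 1) and (10, 133) line up with the outputs. A (10, 1) target broadcasts, so every one of the 133 outputs would be compared with the same label value, which is unlikely to be what is wanted for classification.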

Learn more about the PyTorch MSE loss function in the nn.MSELoss documentation.

Solution 2

The error occurs because nn.MSELoss() and nn.CrossEntropyLoss() expect different input/target combinations. You cannot simply change the criterion function without changing the inputs and targets accordingly. From the docs:

nn.CrossEntropyLoss:

Input:

(N, C) where C = number of classes, or

(N, C, d_1, d_2, ..., d_K) with K >= 1 in the case of K-dimensional loss.

Target:

(N) where each value is in range [0, C-1] or

(N, d_1, d_2, ..., d_K) with K >= 1 in the case of K-dimensional loss.

nn.MSELoss:

Input:

(N,∗) where ∗ means, any number of additional dimensions.

Target:

(N,∗), same shape as the input

As you can see, in MSELoss the target is expected to have the same shape as the input, while in CrossEntropyLoss the C dimension is dropped from the target. You cannot use MSELoss as a drop-in replacement for CrossEntropyLoss.
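One possible way to give MSELoss a target with the same shape as the input is to one-hot encode the class indices. This is a minimal sketch, not the asker's code. It assumes the targets are integer class indices in [0, 132] and that a one-hot vector is the intended regression target. The random tensors only stand in for net(data) and a batch of labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, num_classes = 10, 133

# Stand-ins for net(data) and a batch of labels (assumed values).
outputs = torch.randn(batch_size, num_classes)         # shape (N, C)
target = torch.randint(0, num_classes, (batch_size,))  # shape (N,), int64

# CrossEntropyLoss takes the raw scores and the class indices directly.
ce_loss = nn.CrossEntropyLoss()(outputs, target)

# MSELoss needs a float target shaped like outputs, so expand the
# indices to one-hot rows of length C first.
target_one_hot = F.one_hot(target, num_classes=num_classes).float()
mse_loss = nn.MSELoss()(outputs, target_one_hot)

print(outputs.shape, target.shape, target_one_hot.shape)
```

If the loader yields targets of shape [1, 10] as reported in the comments, flattening them with target.view(-1) before the one_hot call would be needed.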

Author by

user11619814

Updated on July 25, 2022

Comments

user11619814 almost 2 years

I am training a CNN model. I am facing an issue during the training iteration for my model. The code is as below:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()

        #convo layers
        self.conv1 = nn.Conv2d(3,32,3)
        self.conv2 = nn.Conv2d(32,64,3)
        self.conv3 = nn.Conv2d(64,128,3)
        self.conv4 = nn.Conv2d(128,256,3)
        self.conv5 = nn.Conv2d(256,512,3)

        #pooling layer
        self.pool = nn.MaxPool2d(2,2)

        #linear layers
        self.fc1 = nn.Linear(512*5*5,2048)
        self.fc2 = nn.Linear(2048,1024)
        self.fc3 = nn.Linear(1024,133)

        #dropout layer
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        #first layer
        x = self.conv1(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)
        #second layer
        x = self.conv2(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)
        #third layer
        x = self.conv3(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)
        #fourth layer
        x = self.conv4(x)
        x = F.relu(x)
        x = self.pool(x)
        #fifth layer
        x = self.conv5(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)

        #reshape tensor
        x = x.view(-1,512*5*5)
        #last layer
        x = self.dropout(x)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)

        return x

# model instance (not shown in the original post)
net = Net()

#loss func
criterion = nn.MSELoss()
optimizer = optim.Adam(net.parameters(), lr = 0.0001)
#criterion = nn.CrossEntropyLoss()
#optimizer = optim.SGD(net.parameters(), lr = 0.05)

def train(n_epochs,model,loader,optimizer,criterion,save_path):
    for epoch in range(n_epochs):
        train_loss = 0
        valid_loss = 0
        #training
        net.train()
        for batch, (data,target) in enumerate(loaders['train']):
            optimizer.zero_grad()
            outputs = net(data)
            #print(outputs.shape)
            #the RuntimeError below is raised on this line
            loss = criterion(outputs,target)
            loss.backward()
            optimizer.step()
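The 512*5*5 used in fc1 and in the reshape follows from the input size. The sketch below recomputes it, assuming 224 x 224 inputs (the data shape reported in the comments) and the default stride and padding of nn.Conv2d. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial size after max pooling along one dimension."""
    return (size - kernel) // stride + 1


def flattened_features(height, width, out_channels=512, num_blocks=5):
    """Feature count after num_blocks of conv(3x3) followed by pool(2,2)."""
    for _ in range(num_blocks):
        height = pool_out(conv_out(height))
        width = pool_out(conv_out(width))
    return out_channels * height * width


print(flattened_features(224, 224))
```

With a different input resolution the result changes, and the first argument of fc1 would have to change with it.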

When I use the CrossEntropy loss function and the SGD optimizer, I am able to train the model with no error. When I use the MSE loss function and the Adam optimizer, I face the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-20-2223dd9058dd> in <module>
      1 #train the model
      2 n_epochs = 2
----> 3 train(n_epochs,net,loaders,optimizer,criterion,'saved_model/dog_model.pt')

<ipython-input-19-a93d145ef9f7> in train(n_epochs, model, loader, optimizer, criterion, save_path)
     22 
     23             #calculate loss
---> 24             loss = criterion(outputs,target)
     25 
     26             #backward prop

RuntimeError: The size of tensor a (133) must match the size of tensor b (10) at non-singleton dimension 1.

Do the selected loss function and optimizer affect the training of the model? Can anyone please help with this?

user11619814 almost 5 years

oh okay...so if I want to use MSELoss, then how do I modify my code to get the target shape same as input?
user11619814 almost 5 years

When I change the code as you mentioned, I still face the same error. The size of target is now torch.Size([10, 1]) and the output size is torch.Size([10, 133]). Is the target size based on the batch_size?
Anubhav Singh almost 5 years

@user11619814, can you please add full code so that I can try myself?
Anubhav Singh almost 5 years

So, here the target shape must match the outputs shape, which is [10, 133]. That makes sense, as your data shape is [10, 3, 224, 224] and the Net model has 133 output channels.
Anubhav Singh almost 5 years

Actually, I think I now understood what's the problem: change self.fc3 = nn.Linear(1024,133) to self.fc3 = nn.Linear(1024,1)
Berriel almost 5 years

Well, you need to drop the C dimension. To do that correctly I would need to know what this target is expected to be when using MSE.
Anubhav Singh almost 5 years

@user11619814, can you please tell me the shape of (data,target) pair in enumerate(loaders['train'])
user11619814 almost 5 years

You can find the code here: github.com/gprashmi/Dog_breed_classifier/blob/master/…
user11619814 almost 5 years

Data shape: torch.Size([10, 3, 224, 224]) target shape: torch.Size([1, 10])
user11619814 almost 5 years

If you mean the size of target, then it's torch.Size([1, 10])
user11619814 almost 5 years

Can you please tell me if my understanding on target is correct. This target is the number of labels present in the dataset?
Anubhav Singh almost 5 years

target is the label in the training dataset (in batches) itself.
Anubhav Singh almost 5 years

By the way have you tried : change self.fc3 = nn.Linear(1024,133) to self.fc3 = nn.Linear(1024,1)
user11619814 almost 5 years

To change self.fc3 = nn.Linear(1024,133) to self.fc3 = nn.Linear(1024,1), isn't the second parameter of last Linear layer equal to number of classes in the dataset? the number of classes in my dataset is 133.
Anubhav Singh almost 5 years

True, but this is the only way you are going to make outputs match target, as target has shape [10, 1] after using .view(-1, 1). How would you convert it to 10x133, the shape of outputs?
Anubhav Singh almost 5 years

Let me try this once more.
Anubhav Singh almost 5 years

How are there 133 classes in the dataset? Are there 133 dog breeds in your dataset?
user11619814 almost 5 years

Yes there are 133 dog breeds in my dataset. By modifying the code to this: target = target.view(target.size(0),-1) print(target.dtype) print(target.shape) target = target.type(torch.FloatTensor), the target type is now same as outputs(float32) and error was eliminated.
user11619814 almost 5 years

Can you please tell me if there is a specific method to calculate the training loss? I came across a few formulae while looking for it: 1. #calculate training loss train_loss = train_loss + ((1 / (batch + 1)) * (loss.data - train_loss)) 2. train_loss += loss.item(). Now I am confused as to which is the right way to calculate it.
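The two formulae in this comment are closely related: the first keeps a running mean of the batch losses, and the second keeps a running sum that gives the same mean once it is divided by the number of batches. A small sketch with made-up loss values (plain floats stand in for loss.item()):

```python
def running_mean(losses):
    """Formula 1: update the mean incrementally after every batch."""
    train_loss = 0.0
    for batch, loss in enumerate(losses):
        train_loss = train_loss + (1 / (batch + 1)) * (loss - train_loss)
    return train_loss


def summed_then_averaged(losses):
    """Formula 2: accumulate the sum, then divide by the batch count."""
    train_loss = 0.0
    for loss in losses:
        train_loss += loss
    return train_loss / len(losses)


batch_losses = [0.9, 0.7, 0.4, 0.2]  # hypothetical per-batch losses

print(running_mean(batch_losses), summed_then_averaged(batch_losses))
```

Both give the mean per-batch loss up to floating-point rounding. The running form has the small advantage that train_loss is already an average at any point inside the loop.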
Anubhav Singh almost 5 years

This is an MSE loss function: def mse_loss(input, target): return ((input - target) ** 2).sum() / input.data.nelement()
Anubhav Singh almost 5 years

Check this for custom loss function: spandan-madan.github.io/…
user11619814 almost 5 years

I have code for predicting the dog breed after training the CNN model. I get the class index of the breed and want to display a random image from the folder for that class index. When I try to display the random image, I get the error: Image data cannot be converted to float. I have tried displaying it with the imshow command, but I still face the same error. Can you please help me with this?
Anubhav Singh almost 5 years

Please check this SO solution: stackoverflow.com/questions/32302180/…
user11619814 almost 5 years

I have created a question ...with code in it....Can you please have a look at it? stackoverflow.com/questions/56862204/…