RuntimeError: The size of tensor a (133) must match the size of tensor b (10) at non-singleton dimension 1

Solution 1

The error message clearly suggests that the error occurred at the line

loss = criterion(outputs,target)

where you are trying to compute the mean-squared error between the input and the target. See this line: criterion = nn.MSELoss().

I think you should modify the line where you compute the loss for the (outputs, target) pair, i.e., change loss = criterion(outputs, target) to something like the following:

loss = criterion(outputs,target.view(1, -1))

Here, the aim is to make the shape of target match the shape of the outputs that the model produces on the line

outputs = net(data)

One more thing to notice here is the output of the net model: outputs will have shape batch_size X output_channels. Here batch_size is the first dimension of the input images. During training you get batches of images, so the input to the forward method has an additional batch dimension at dim 0: [batch_size, channels, height, width]. output_channels is the number of output features/channels of the last linear layer in the net model.
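As a rough illustration of the shapes described above, the following sketch pushes a random batch through a small stand-in model. TinyNet, its layer sizes, and the 32x32 image size are assumptions made for brevity and are not the model from the question; only the batch size of 10 and the 133 output features are taken from it.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the question's Net, kept small on purpose."""

    def __init__(self, num_outputs=133):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.pool = nn.AdaptiveAvgPool2d(1)  # collapses height and width to 1x1
        self.fc = nn.Linear(8, num_outputs)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))  # [batch_size, 8, 1, 1]
        x = x.view(x.size(0), -1)                # [batch_size, 8]
        return self.fc(x)                        # [batch_size, num_outputs]


data = torch.randn(10, 3, 32, 32)  # [batch_size, channels, height, width]
outputs = TinyNet()(data)
print(data.shape, outputs.shape)
```

Whatever the convolutional part looks like, the first dimension of outputs follows the batch and the second follows the last linear layer.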

The target labels will have shape batch_size, which is 10 in your case (check the batch_size you passed to torch.utils.data.DataLoader()). Therefore, after reshaping with view(1, -1), target has shape 1 X batch_size, i.e., 1 X 10.

That is why you are getting the error:

RuntimeError: input and target shapes do not match: input [10 x 133], target [1 x 10]
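The mismatch can be reproduced without the dataset. This is a minimal sketch in which random tensors stand in for net(data) and for the labels (both are assumptions, not the asker's data); the exact wording of the error message may differ between PyTorch versions.

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()

outputs = torch.randn(10, 133)         # stand-in for net(data): [batch_size, output_channels]
target = torch.randint(0, 133, (10,))  # stand-in for class-index labels: [batch_size]

reshaped = target.view(1, -1)          # [1, batch_size]
print(outputs.shape, target.shape, reshaped.shape)

try:
    criterion(outputs, reshaped.float())
    shapes_compatible = True
except RuntimeError as err:
    # [10, 133] and [1, 10] cannot be broadcast together: 133 != 10 at dimension 1
    shapes_compatible = False
    print("RuntimeError:", err)
```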

So, a workaround is to replace loss = criterion(outputs, target.view(1, -1)) with loss = criterion(outputs, target.view(-1, 1)) and to change the output_channels of the last linear layer from 133 to 1. That way outputs and target both have shape batch_size X 1, and the MSE value can be computed.
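A minimal sketch of this workaround, assuming random features in place of the activations that reach the last layer. The names head and features are hypothetical. The cast to float follows the asker's later comment that the target dtype had to match outputs. Note that a single output unit makes the model regress the class index rather than classify among 133 classes.

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()

head = nn.Linear(1024, 1)              # last linear layer with 1 output instead of 133
features = torch.randn(10, 1024)       # stand-in for the activations fed to the last layer
outputs = head(features)               # [batch_size, 1]

target = torch.randint(0, 133, (10,))  # class indices, shape [batch_size]
target = target.view(-1, 1).float()    # [batch_size, 1], same dtype as outputs

loss = criterion(outputs, target)      # shapes match, so MSE is defined
loss.backward()
print(outputs.shape, target.shape, loss.item())
```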

Learn more about the PyTorch MSE loss function in the nn.MSELoss documentation.

Solution 2

Well, the error occurs because nn.MSELoss() and nn.CrossEntropyLoss() expect different input/target combinations. You cannot simply change the criterion function without changing the inputs and targets accordingly. From the docs:

nn.CrossEntropyLoss:

  • Input:
    • (N, C) where C = number of classes, or
    • (N, C, d_1, d_2, ..., d_K) with K >= 1 in the case of K-dimensional loss.
  • Target:
    • (N) where each value is in range [0, C-1] or
    • (N, d_1, d_2, ..., d_K) with K >= 1 in the case of K-dimensional loss.

nn.MSELoss:

  • Input:
    • (N,∗) where ∗ means any number of additional dimensions.
  • Target:
    • (N,∗), same shape as the input

As you can see, in MSELoss the Target is expected to have the same shape as the Input, while in CrossEntropyLoss the C dimension is dropped from the Target. You cannot use MSELoss as a drop-in replacement for CrossEntropyLoss.
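The two shape contracts can be compared side by side. The sketch below uses random logits and labels as stand-ins. One-hot encoding the labels is just one possible way to give MSELoss a target of shape (N, C); it is an assumption for illustration and not something the answer prescribes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, C = 10, 133                        # batch size and number of classes from the question
logits = torch.randn(N, C)            # Input: (N, C)
labels = torch.randint(0, C, (N,))    # Target for CrossEntropyLoss: (N), values in [0, C-1]

ce = nn.CrossEntropyLoss()(logits, labels)

# MSELoss needs a target with the same shape as the input, here (N, C).
one_hot = F.one_hot(labels, num_classes=C).float()
mse = nn.MSELoss()(logits, one_hot)

print(labels.shape, one_hot.shape, ce.item(), mse.item())
```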

Author: user11619814

Updated on July 25, 2022

Comments

  • user11619814
    user11619814 almost 2 years

    I am training a CNN model. I am facing an issue during the training iteration of my model. The code is below:

    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim

    class Net(nn.Module):

        def __init__(self):
            super(Net, self).__init__()

            # convolutional layers
            self.conv1 = nn.Conv2d(3, 32, 3)
            self.conv2 = nn.Conv2d(32, 64, 3)
            self.conv3 = nn.Conv2d(64, 128, 3)
            self.conv4 = nn.Conv2d(128, 256, 3)
            self.conv5 = nn.Conv2d(256, 512, 3)

            # pooling layer
            self.pool = nn.MaxPool2d(2, 2)

            # linear layers
            self.fc1 = nn.Linear(512*5*5, 2048)
            self.fc2 = nn.Linear(2048, 1024)
            self.fc3 = nn.Linear(1024, 133)

            # dropout layer
            self.dropout = nn.Dropout(0.3)

        def forward(self, x):
            # first layer
            x = self.conv1(x)
            x = F.relu(x)
            x = self.pool(x)
            #x = self.dropout(x)
            # second layer
            x = self.conv2(x)
            x = F.relu(x)
            x = self.pool(x)
            #x = self.dropout(x)
            # third layer
            x = self.conv3(x)
            x = F.relu(x)
            x = self.pool(x)
            #x = self.dropout(x)
            # fourth layer
            x = self.conv4(x)
            x = F.relu(x)
            x = self.pool(x)
            # fifth layer
            x = self.conv5(x)
            x = F.relu(x)
            x = self.pool(x)
            #x = self.dropout(x)

            # flatten to [batch_size, 512*5*5] (holds for 224x224 inputs)
            x = x.view(-1, 512*5*5)
            # fully connected layers
            x = self.dropout(x)
            x = self.fc1(x)
            x = F.relu(x)
            x = self.dropout(x)
            x = self.fc2(x)
            x = F.relu(x)
            x = self.fc3(x)

            return x

    # instantiate the model (this line is not shown in the original post)
    net = Net()

    # loss function and optimizer
    criterion = nn.MSELoss()
    optimizer = optim.Adam(net.parameters(), lr=0.0001)
    #criterion = nn.CrossEntropyLoss()
    #optimizer = optim.SGD(net.parameters(), lr=0.05)

    def train(n_epochs, model, loader, optimizer, criterion, save_path):
        for epoch in range(n_epochs):
            train_loss = 0
            valid_loss = 0
            # training
            model.train()
            for batch, (data, target) in enumerate(loader['train']):
                optimizer.zero_grad()
                outputs = model(data)
                #print(outputs.shape)
                # with nn.MSELoss() this is the line that raises the RuntimeError
                loss = criterion(outputs, target)
                loss.backward()
                optimizer.step()
    

    When I use the CrossEntropy loss function and the SGD optimizer, I am able to train the model with no error. When I use the MSE loss function and the Adam optimizer, I get the following error:

    RuntimeError                              Traceback (most recent call last)
    <ipython-input-20-2223dd9058dd> in <module>
          1 #train the model
          2 n_epochs = 2
    ----> 3 train(n_epochs,net,loaders,optimizer,criterion,'saved_model/dog_model.pt')
    
    <ipython-input-19-a93d145ef9f7> in train(n_epochs, model, loader, optimizer, criterion, save_path)
         22 
         23             #calculate loss
    ---> 24             loss = criterion(outputs,target)
         25 
         26             #backward prop
    
    RuntimeError: The size of tensor a (133) must match the size of tensor b (10) at non-singleton dimension 1.
    

    Do the selected loss function and optimizer affect the training of the model? Can anyone please help with this?

  • user11619814
    user11619814 almost 5 years
    Oh okay... so if I want to use MSELoss, how do I modify my code to make the target shape the same as the input?
  • user11619814
    user11619814 almost 5 years
    When I change the code as you mentioned, I still face the same error. The size of target is now torch.Size([10, 1]) and the output size is torch.Size([10, 133]). Is the target size based on the batch_size?
  • Anubhav Singh
    Anubhav Singh almost 5 years
    @user11619814, can you please add full code so that I can try myself?
  • Anubhav Singh
    Anubhav Singh almost 5 years
    So, here the target shape must match the outputs shape, which is [10, 133]. That makes complete sense, as your data shape is [10, 3, 224, 224] and the number of output channels from the Net model is 133. That's why it is [10, 133].
  • Anubhav Singh
    Anubhav Singh almost 5 years
    Actually, I think I now understand what the problem is: change self.fc3 = nn.Linear(1024,133) to self.fc3 = nn.Linear(1024,1)
  • Berriel
    Berriel almost 5 years
    Well, you need to drop the C dimension. To do that correctly I would need to know what this target is expected to be when using MSE.
  • Anubhav Singh
    Anubhav Singh almost 5 years
    @user11619814, can you please tell me the shape of (data,target) pair in enumerate(loaders['train'])
  • user11619814
    user11619814 almost 5 years
    Data shape: torch.Size([10, 3, 224, 224]) target shape: torch.Size([1, 10])
  • user11619814
    user11619814 almost 5 years
    If you mean the size of target, then it's torch.Size([1, 10])
  • user11619814
    user11619814 almost 5 years
    Can you please tell me if my understanding of target is correct? Is this target the number of labels present in the dataset?
  • Anubhav Singh
    Anubhav Singh almost 5 years
    target is the label in the training dataset (in batches) itself.
  • Anubhav Singh
    Anubhav Singh almost 5 years
    By the way, have you tried changing self.fc3 = nn.Linear(1024,133) to self.fc3 = nn.Linear(1024,1)?
  • user11619814
    user11619814 almost 5 years
    About changing self.fc3 = nn.Linear(1024,133) to self.fc3 = nn.Linear(1024,1): isn't the second parameter of the last Linear layer equal to the number of classes in the dataset? The number of classes in my dataset is 133.
  • Anubhav Singh
    Anubhav Singh almost 5 years
    True, but this is the only way you are going to make the shape of outputs equal to that of target, as target has shape [10, 1] after using .view(-1, 1). How would you convert it to 10x133, the shape of outputs?
  • Anubhav Singh
    Anubhav Singh almost 5 years
    Let me try this once more.
  • Anubhav Singh
    Anubhav Singh almost 5 years
    How are there 133 classes in the dataset? Are there 133 dog breeds in your dataset?
  • user11619814
    user11619814 almost 5 years
    Yes, there are 133 dog breeds in my dataset. By modifying the code to this: target = target.view(target.size(0), -1); print(target.dtype); print(target.shape); target = target.type(torch.FloatTensor), the target type is now the same as outputs (float32) and the error was eliminated.
  • user11619814
    user11619814 almost 5 years
    Can you please tell me if there is a specific method to calculate the training loss? I came across a few formulae while looking for it: (1) #calculate training loss: train_loss = train_loss + ((1 / (batch + 1)) * (loss.data - train_loss)) and (2) train_loss += loss.item(). Now I am confused as to which is the right way to calculate it.
  • Anubhav Singh
    Anubhav Singh almost 5 years
    This is an MSE loss function: def mse_loss(input, target): return ((input - target) ** 2).sum() / input.data.nelement()
  • Anubhav Singh
    Anubhav Singh almost 5 years
    Check this for custom loss function: spandan-madan.github.io/…
  • user11619814
    user11619814 almost 5 years
    I have code for predicting the dog breed after training the CNN model. I am getting the class index of the breed and want to display a random image from the folder for that class index. When I try to display the random image, I get the error: Image data cannot be converted to float. I have tried displaying it with the imshow command, but I still face the same error. Can you please help me with this?
  • Anubhav Singh
    Anubhav Singh almost 5 years
    Please check this SO solution: stackoverflow.com/questions/32302180/…
  • user11619814
    user11619814 almost 5 years
    I have created a question with code in it. Can you please have a look at it? stackoverflow.com/questions/56862204/…
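On the two training-loss formulas asked about in the comments above: the first is an incremental update of the mean batch loss, while the second accumulates a sum that still has to be divided by the number of batches. The sketch below uses made-up loss values (an assumption, not results from the question) to show that both routes give the same average.

```python
# Hypothetical per-batch loss values, standing in for loss.item() on each batch.
batch_losses = [0.9, 0.7, 0.4, 0.6]

running_mean = 0.0  # formula 1: incremental mean
total = 0.0         # formula 2: plain sum

for batch, loss_value in enumerate(batch_losses):
    running_mean = running_mean + (1 / (batch + 1)) * (loss_value - running_mean)
    total += loss_value

mean_from_sum = total / len(batch_losses)
print(running_mean, mean_from_sum)
```

The incremental form is convenient when the average should be available after every batch; the summed form needs one division at the end of the epoch.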