PyTorch Dimension out of range (expected to be in range of [-1, 0], but got 1)


It seems you are not quite using CrossEntropyLoss the way it is designed. CrossEntropyLoss is primarily used for classification problems, where the prediction is a row of raw scores (one per class) rather than a single label:

predicted = torch.tensor([[1,2,3,4]]).float()

(in this case there are four classes, and the model is indicating its confidence in each of those four classes)

and then the target is simply an index indicating which class is correct:

target = torch.tensor([1]).long()

then, we can compute:

lossfxn = nn.CrossEntropyLoss()
loss = lossfxn(predicted, target)
print(loss) # outputs tensor(2.4402)

now, if we change the prediction to align with the target:

predicted = torch.tensor([[1,10,3,4]]).float()
target = torch.tensor([1]).long()
lossfxn = nn.CrossEntropyLoss()
loss = lossfxn(predicted, target)
print(loss) # outputs tensor(0.0035)

now the loss is much lower, because the prediction is correct!

Please consider the loss functions available and determine which is appropriate for your task: https://pytorch.org/docs/stable/nn.html#loss-functions (perhaps MSELoss?)
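
For completeness, a minimal sketch of MSELoss on float-valued predictions and targets (illustrative numbers only, not your data):

import torch
import torch.nn as nn

# hypothetical real-valued predictions and targets
predicted = torch.tensor([4., 4., 1., 1.])
target    = torch.tensor([3., 0., 1., 1.])

loss = nn.MSELoss()(predicted, target)
print(loss)  # outputs tensor(4.2500), the mean of the squared differences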

Comments

  • Sam (almost 2 years)

    I have the following PyTorch tensors:

    predicted = torch.tensor([4, 4, 4, 1, 1, 1, 1, 1, 1, 4, 4, 1, 1, 1, 4, 1, 1, 4, 0, 4, 4, 1, 4, 1])
    
    target    = torch.tensor([3, 0, 0, 1, 1, 0, 1, 1, 1, 3, 2, 4, 1, 1, 1, 0, 1, 1, 2, 1, 1, 1, 1, 1,])
    

    I want to compute the Cross Entropy Loss (as part of a Logistic Regression implementation) between them with the following lines:

    loss = nn.CrossEntropyLoss()
    computed_loss = loss(predicted, target)
    

    However, when my code runs, I get the following IndexError:

    IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
    

    Any suggestions on what I'm doing wrong?

    / ##################################################################### /

    Here is the full traceback:

    -----------------------------------------------------------
    IndexError                Traceback (most recent call last)
    <ipython-input-208-3cdb253d6620> in <module>
          1 batch_size = 1000
          2 train_class = Train((training_set.shape[1]-1), number_of_target_labels, 0.01, 1000)
    ----> 3 train_class.train_model(training_set, batch_size)
    
    <ipython-input-207-f3e2c7f7979a> in train_model(self, training_data, n_iters)
         42                 out = self.model(x)
         43                 _, predicted = torch.max(out.data, 1)
    ---> 44                 loss = self.criterion(predicted, y)
         45                 self.optimizer.zero_grad()
         46                 loss.backward()
    
    /anaconda3/envs/malicious_ml/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
        491             result = self._slow_forward(*input, **kwargs)
        492         else:
    --> 493             result = self.forward(*input, **kwargs)
        494         for hook in self._forward_hooks.values():
        495             hook_result = hook(self, input, result)
    
    /anaconda3/envs/malicious_ml/lib/python3.6/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
        940     def forward(self, input, target):
        941         return F.cross_entropy(input, target, weight=self.weight,
    --> 942                                ignore_index=self.ignore_index, reduction=self.reduction)
        943 
        944 
    
    /anaconda3/envs/malicious_ml/lib/python3.6/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
       2054     if size_average is not None or reduce is not None:
       2055         reduction = _Reduction.legacy_get_string(size_average, reduce)
    -> 2056     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
       2057 
       2058 
    
    /anaconda3/envs/malicious_ml/lib/python3.6/site-packages/torch/nn/functional.py in log_softmax(input, dim, _stacklevel, dtype)
       1348         dim = _get_softmax_dim('log_softmax', input.dim(), _stacklevel)
       1349     if dtype is None:
    -> 1350         ret = input.log_softmax(dim)
       1351     else:
       1352         ret = input.log_softmax(dim, dtype=dtype)
    

    / ##################################################################### /

    If you are interested in seeing the rest of my code, here it is:

    import torch
    import torch.nn as nn
    from torch.autograd import Variable
    
    
    class LogisticRegressionModel(nn.Module):
    
        def __init__(self, in_dim, num_classes):
            super().__init__()
            self.linear = nn.Linear(in_dim, num_classes)
    
        def forward(self, x):
            return self.linear(x)
    
    
    class Train(LogisticRegressionModel):
    
        def __init__(self, in_dim, num_classes, lr, batch_size):
            super().__init__(in_dim, num_classes)
            self.batch_size = batch_size
            self.learning_rate = lr
            self.input_layer_dim = in_dim
            self.output_layer_dim = num_classes
            self.criterion = nn.CrossEntropyLoss()
            self.model = LogisticRegressionModel(self.input_layer_dim, self.output_layer_dim)
            self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
            self.model = self.model.to(self.device)
            self.optimizer = torch.optim.SGD(self.model.parameters(), lr = self.learning_rate)  
    
        def epochs(self, iterations, train_dataset, batch_size):
            epochs = int(iterations/(len(train_dataset)/batch_size))
            return epochs
    
        def train_model(self, training_data, n_iters):
            batch = self.batch_size
            epochs = self.epochs(n_iters, training_data, batch)
            training_data = torch.utils.data.DataLoader(dataset = training_data, batch_size = batch, shuffle = True)
    
            for epoch in range(epochs):
    
                for i, data in enumerate(training_data):
    
                    X_train = data[:, :-1]
                    Y_train = data[:, -1]
    
                    if torch.cuda.is_available():
                        x = Variable(torch.Tensor(X_train).cuda())
                        y = Variable(torch.Tensor(Y_train).cuda())
    
                    else:
                        x = Variable(torch.Tensor(X_train.float()))
                        y = Variable(torch.Tensor(Y_train.float()))
    
                    out = self.model(x)
                    _, predicted = torch.max(out.data, 1)
                    loss = self.criterion(predicted, y)
                    self.optimizer.zero_grad()
                    loss.backward()
                    self.optimizer.step()
    
                    if i % 100 == 0:
                        print('[{}/{}] Loss: {:.6f}'.format(epoch + 1, epochs, loss))
    
  • Sam (almost 5 years)
    Thanks for your reply, it makes so much sense now. I know what I did wrong: in my full code above, the train_model method of the Train class has a line, `_, predicted = torch.max(out.data, 1)`, that takes the index of the maximum predicted score. So instead of passing the distribution of scores to the loss, I was passing only the index of the maximum value; hence the tensor of integers. I now get a tensor of probability distributions as you said. However, my loss is returned as nan. To be continued in the next comment...
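
    A minimal sketch of that change (assuming y holds integer class labels) would be to feed the raw model output to the criterion and use torch.max only to report the predicted class:

    out = self.model(x)                    # raw scores, shape (batch_size, num_classes)
    loss = self.criterion(out, y.long())   # CrossEntropyLoss expects scores plus integer class indices
    _, predicted = torch.max(out.data, 1)  # argmax only for reporting, not for the loss
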
  • Sam (almost 5 years)
    When I examined the output of my model for that batch, I saw some rows with a few NaN values and some rows that were all NaN. Does that explain why my loss is nan? What can I do about this? Should I just replace these values with 0, or the mean of that column? By the way, this is a classification problem: I'm trying to classify URLs by their type of malicious category.
  • Sam (almost 5 years)
    Just tried replacing the NaNs with zeros. It didn't work, because of the following RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn. Any suggestions on how to move forward with this?
  • thedch (almost 5 years)
    Hi, it seems that your question is now more "My model is producing NaNs, how do I fix this?" -- glancing over your code I don't see an immediate answer -- for a simple nn.Linear model like this, maybe you are sending NaNs in somehow? (I don't think I have access to your data) -- maybe use pytorch.org/docs/stable/torch.html?highlight=isnan#torch.isnan to check your data throughout the model? `assert not torch.isnan(mytensor).any()` may be useful
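
    A short sketch of that NaN check (variable names assumed from the training loop above):

    # fail fast if NaNs sneak into the inputs or the model output
    assert not torch.isnan(x).any(), "NaNs in the input batch"
    out = self.model(x)
    assert not torch.isnan(out).any(), "NaNs in the model output"
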
  • Sam (almost 5 years)
    Here's a link to my repo with the Jupyter Notebook and CSV. I've updated the code a little bit, but there's not much difference. Please have a look at it and let me know if you can't access it. Thanks a lot. github.com/islamaymansais/Malicious-URL-Classifier/blob/master/…
  • Sam (almost 5 years)
    The dataset does contain NaNs upon importing, but I fill them with zeros. That's my new update. Same error though.
  • thedch (almost 5 years)
    Added issues to your GitHub.
  • Sam (almost 5 years)
    Thank you so much! I really appreciate you taking the time to do so!