Which part of a PyTorch tensor represents channels?


For Conv2d, the input should be in (N, C, H, W) format: N is the number of samples (the batch size), C is the number of channels, and H and W are the height and width, respectively.

See shape documentation at https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d

For Conv1d, the input should be (N, C, L), where L is the sequence length. See the documentation at https://pytorch.org/docs/stable/nn.html#conv1d
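As a quick check of both layouts, here is a minimal sketch that reuses the tensor shapes from the question. The choice of 16 output channels for the Conv2d layer is an arbitrary assumption made only for illustration.

```python
import torch

# Conv2d expects (N, C, H, W): batch size, channels, height, width.
x2d = torch.randn(1, 48, 5, 5)
# in_channels must match dimension 1 of the input; out_channels=16 is an arbitrary choice.
conv2d = torch.nn.Conv2d(in_channels=48, out_channels=16, kernel_size=3, padding=1)
y2d = conv2d(x2d)
print(y2d.shape)  # torch.Size([1, 16, 5, 5])

# Conv1d expects (N, C, L): batch size, channels, sequence length.
x1d = torch.randn(5, 12, 5)
conv1d = torch.nn.Conv1d(in_channels=12, out_channels=48, kernel_size=3, padding=1)
y1d = conv1d(x1d)
print(y1d.shape)  # torch.Size([5, 48, 5])
```

In both cases the channel count sits at index 1, directly after the batch dimension, and only that dimension changes from in_channels to out_channels.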



Author: Rohan Singh

Currently a student. I like math, computer science, physics, and astronomy. I like to bike, run, swim, and most of all, hike.

Updated on June 15, 2022

Comments

  • Rohan Singh
    Rohan Singh 7 months

    Surprisingly, I have not found an answer to this question after searching the internet. I am specifically interested in a 3D tensor. From my own experiments, I have found that when I create a tensor:

    h=torch.randn(5,12,5)
    

    And then apply a convolutional layer to it, defined as follows:

    conv=torch.nn.Conv1d(12,48,3,padding=1)
    

    The output is a (5,48,5) tensor. So, am I correct in assuming that for a 3D tensor in PyTorch, the middle dimension represents the number of channels?

    Edit: It seems that when running a Conv2d, the input dimension is the first entry in the tensor, and I need to make it a 4D tensor, (1,48,5,5) for example. Now I am very confused...

    Any help is much appreciated!

  • Rohan Singh
    Rohan Singh over 4 years
    Thank you for your answer. It seems I need to read the documentation more thoroughly... :)
  • user836026
    user836026 10 months
    I have channels-last data. How do I convert it to channels-first?
  • Umang Gupta
    Umang Gupta 10 months
    You can use torch.permute (pytorch.org/docs/stable/generated/…). If you have a 2D image batch in NHWC format, you would call torch.permute(x, (0, 3, 1, 2)).
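The permute call in the last comment can be sketched as follows. The batch of 8 RGB images of size 32x32 is a made-up example, and the names x_nhwc and x_nchw are hypothetical.

```python
import torch

# Hypothetical channels-last batch: (N, H, W, C) = 8 images, 32x32 pixels, 3 channels.
x_nhwc = torch.randn(8, 32, 32, 3)

# Reorder dimensions to (N, C, H, W): keep dim 0, move dim 3 (channels) to position 1.
x_nchw = torch.permute(x_nhwc, (0, 3, 1, 2))
print(x_nchw.shape)  # torch.Size([8, 3, 32, 32])

# permute returns a view with rearranged strides; call .contiguous() if a
# later operation (such as .view) requires contiguous memory.
x_nchw = x_nchw.contiguous()
```

Only the order of the dimensions changes. The values at a given pixel are the same, so x_nchw[n, :, h, w] equals x_nhwc[n, h, w, :].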