How to convert a pytorch tensor into a numpy array?


Solution 1

Copied from the PyTorch docs:

a = torch.ones(5)
print(a)

tensor([1., 1., 1., 1., 1.])

b = a.numpy()
print(b)

[1. 1. 1. 1. 1.]


Following the discussion with @John below:

If the tensor is (or might be) on the GPU, or if it requires (or might require) grad, you can use

t.detach().cpu().numpy()

I recommend uglifying your code only as much as required.
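
For example, a minimal sketch (requires_grad and device='cuda' are only there for illustration; drop device='cuda' if no GPU is available):

import torch

t = torch.ones(5, device='cuda', requires_grad=True)

# detach() drops the tensor from the autograd graph, cpu() moves it to host
# memory, and numpy() then exposes that memory as a numpy array
b = t.detach().cpu().numpy()
print(b)  # [1. 1. 1. 1. 1.]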

Solution 2

Given a tensor t, you can try the following ways (a short sketch follows the list):

1. t.numpy()
2. t.cpu().data.numpy()
3. t.cpu().detach().numpy()
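
A quick sketch of these on a concrete tensor (t is just an example name):

import torch

t = torch.ones(3)

t.numpy()                 # works when t is already on the CPU and has no grad
t.cpu().data.numpy()      # .data bypasses autograd (older idiom)
t.cpu().detach().numpy()  # detach() is the currently recommended way to drop grad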

Solution 3

Another useful way:

a = torch.tensor(0.1, device='cuda')

a.cpu().data.numpy()

Output:

array(0.1, dtype=float32)

Solution 4

This is a function from fastai core:

def to_np(x):
    "Convert a tensor to a numpy array."
    return apply(lambda o: o.data.cpu().numpy(), x)

Using a function from an established PyTorch-based library like this is a nice choice.
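
Here apply is fastai's helper that maps the lambda over (possibly nested) collections of tensors. A rough standalone sketch of the same idea, with to_np_standalone as a hypothetical name, using detach() instead of the older .data idiom:

import torch

def to_np_standalone(x):
    "Convert a tensor, or a nested list/tuple/dict of tensors, to numpy arrays."
    if isinstance(x, torch.Tensor):
        return x.detach().cpu().numpy()
    if isinstance(x, (list, tuple)):
        return type(x)(to_np_standalone(o) for o in x)
    if isinstance(x, dict):
        return {k: to_np_standalone(v) for k, v in x.items()}
    return x  # leave non-tensor values untouched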

If you look inside PyTorch Transformers you will find this code:

preds = logits.detach().cpu().numpy()

So why is the detach() method needed? It is needed to detach the tensor from the autograd (AD) computational graph.
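
For example, calling numpy() directly on a tensor that requires grad raises a RuntimeError (a minimal sketch, with logits just standing in for some model output):

import torch

logits = torch.randn(2, 3, requires_grad=True)

# logits.numpy() would raise a RuntimeError telling you to call detach() first
preds = logits.detach().cpu().numpy()
print(preds.shape)  # (2, 3)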

Still, note that the CPU tensor and the numpy array are connected. They share the same storage:

import torch
tensor = torch.zeros(2)
numpy_array = tensor.numpy()
print('Before edit:')
print(tensor)
print(numpy_array)

tensor[0] = 10

print()
print('After edit:')
print('Tensor:', tensor)
print('Numpy array:', numpy_array)

Output:

Before edit:
tensor([0., 0.])
[0. 0.]

After edit:
Tensor: tensor([10.,  0.])
Numpy array: [10.  0.]

The value of the first element is shared by the tensor and the numpy array. Changing it to 10 in the tensor changed it in the numpy array as well.

This is why we need to be careful, since altering the numpy array may alter the CPU tensor as well.
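
If you need an independent copy instead, copy on either side of the conversion (a small sketch):

import torch

tensor = torch.zeros(2)
independent = tensor.numpy().copy()          # copy on the numpy side
# or: independent = tensor.clone().numpy()   # copy on the torch side

tensor[0] = 10
print(independent)  # [0. 0.] -- unaffected by the edit above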

Solution 5

You may find the following two functions useful (a short round-trip sketch follows the list).

  1. torch.Tensor.numpy()
  2. torch.from_numpy()
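
A round-trip sketch using both (note that each direction shares memory with its source):

import torch

t = torch.arange(3)       # tensor([0, 1, 2])
a = t.numpy()             # array([0, 1, 2]) -- shares memory with t
t2 = torch.from_numpy(a)  # back to a tensor, still sharing the same memory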

Comments

  • Gulzar (about 3 years ago)

    I have a torch tensor

    a = torch.randn(1, 2, 3, 4, 5)
    

    How can I get it in numpy?

    Something like

    b = a.tonumpy()
    

    output should be the same as if I did

    b = np.random.randn(1, 2, 3, 4, 5)
    
  • Lars Ericson (over 4 years ago)
    In my copy of torch better make that a.detach().cpu().numpy()
  • Gulzar (over 4 years ago)
    @LarsEricson why?
  • Sid (over 3 years ago)
    What would the complexity be to convert a tensor to NumPy like this?
  • John (over 3 years ago)
    @Gulzar the detach() is necessary to avoid computing gradients and the cpu() is necessary if the Tensor is in gpu memory
  • Gulzar (over 3 years ago)
    @John this is not necessary in the general case.
  • John (over 3 years ago)
    @Gulzar can you be more specific? What isn't necessary (there were 2 functions listed) and what do you mean by "in the general case?" consider this question/answer for more detail: stackoverflow.com/questions/63582590/…
  • Gulzar (over 3 years ago)
    @John The usage of detach and cpu is understood. The question here was about converting a torch tensor to a numpy array. numpy arrays are always on the cpu and never have gradient calculation involved with them. If the torch tensor is on the gpu or needs to be detached, that is beside the point of this question. So always doing .detach().cpu() is overkill. Doing it when necessary is good, but this isn't the general case.
  • John (over 3 years ago)
    This is true, although I believe both are no-ops if unnecessary, so the overkill is only in the typing, and there's some value if writing a function that accepts a Tensor of unknown provenance. I apologize for misunderstanding your original question to Lars. To summarize, detach and cpu are not necessary in every case, but are necessary in perhaps the most common case (so there's value in mentioning them). numpy is necessary in every case but is often insufficient on its own. Any future persons should reference the question linked above or the pytorch documentation for more information.
  • omsrisagar (about 2 years ago)
    What is the benefit of including .data?