Compute distance matrix with numpy

Solution 1

Use scipy.spatial.distance.cdist. It requires 2D inputs, so you can do something like this:

from scipy.spatial import distance
dist_matrix = distance.cdist(l_arr.reshape(-1, 2), [pos_goal]).reshape(l_arr.shape[:2])

This is quite succinct and, for large arrays, will typically be faster than a manual approach based on looping or broadcasting.
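To try this end to end, here is a minimal sketch. It assumes `l_arr` is the 5x4 grid of (x, y) positions shown in the question; rebuilding it with `np.meshgrid` is my own choice, since the question only shows the printed array.

```python
import numpy as np
from scipy.spatial import distance

# Assumption: rebuild the question's 5x4 grid so that l_arr[i, j] == [j, i].
xs, ys = np.meshgrid(np.arange(4), np.arange(5))   # each has shape (5, 4)
l_arr = np.stack([xs, ys], axis=-1)                # shape (5, 4, 2)
pos_goal = np.array([1, 2])

# cdist expects two 2-D arrays: (20, 2) grid points against a (1, 2) goal.
flat = l_arr.reshape(-1, 2)
dist_matrix = distance.cdist(flat, pos_goal[None, :]).reshape(l_arr.shape[:2])
```

The final `reshape` only restores the grid layout; `cdist` itself returns a (20, 1) column of distances.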

Solution 2

The np.linalg.norm function takes an axis argument, so you want:

In [6]: np.linalg.norm(l_arr - pos_goal, axis=2)
Out[6]:
array([[ 2.23606798,  2.        ,  2.23606798,  2.82842712],
       [ 1.41421356,  1.        ,  1.41421356,  2.23606798],
       [ 1.        ,  0.        ,  1.        ,  2.        ],
       [ 1.41421356,  1.        ,  1.41421356,  2.23606798],
       [ 2.23606798,  2.        ,  2.23606798,  2.82842712]])

You can just use -1 for "the last" axis:

In [7]: np.linalg.norm(l_arr - pos_goal, axis=-1)
Out[7]:
array([[ 2.23606798,  2.        ,  2.23606798,  2.82842712],
       [ 1.41421356,  1.        ,  1.41421356,  2.23606798],
       [ 1.        ,  0.        ,  1.        ,  2.        ],
       [ 1.41421356,  1.        ,  1.41421356,  2.23606798],
       [ 2.23606798,  2.        ,  2.23606798,  2.82842712]])
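If it helps to see what the `axis` argument is doing, the following sketch spells the same reduction out by hand. The tiny 2x2 grid `pts` and the `goal` are made-up values for illustration, not data from the question.

```python
import numpy as np

# Assumption: a small made-up grid of (x, y) points, just for illustration.
pts = np.array([[[0, 0], [3, 0]],
                [[0, 4], [3, 4]]])                 # shape (2, 2, 2)
goal = np.array([0, 0])

diff = pts - goal                                  # still shape (2, 2, 2)
by_norm = np.linalg.norm(diff, axis=-1)            # reduces the last axis -> (2, 2)
by_hand = np.sqrt((diff ** 2).sum(axis=-1))        # Euclidean norm written out
```

Both arrays hold one distance per grid cell, because only the coordinate axis is collapsed.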

Note that I used array broadcasting to get the differences:

In [11]: l_arr - pos_goal
Out[11]:
array([[[-1, -2],
        [ 0, -2],
        [ 1, -2],
        [ 2, -2]],

       [[-1, -1],
        [ 0, -1],
        [ 1, -1],
        [ 2, -1]],

       [[-1,  0],
        [ 0,  0],
        [ 1,  0],
        [ 2,  0]],

       [[-1,  1],
        [ 0,  1],
        [ 1,  1],
        [ 2,  1]],

       [[-1,  2],
        [ 0,  2],
        [ 1,  2],
        [ 2,  2]]])

Generally, learning how to use broadcasting in combination with built-in numpy/scipy vectorized functions is the way to achieve substantial speed improvements.
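As a sanity check, this sketch compares the double loop from the question with the broadcast version. It assumes the same 5x4 grid; building it with `np.indices` is my own construction, not something from the question.

```python
import numpy as np

# Assumption: rebuild the question's grid so that l_arr[i, j] == [j, i].
rows, cols = np.indices((5, 4))                    # each has shape (5, 4)
l_arr = np.stack([cols, rows], axis=-1)            # shape (5, 4, 2)
pos_goal = np.array([1, 2])

# Loop version, as in the question.
looped = np.zeros(l_arr.shape[:2])
for i, line in enumerate(l_arr):
    for j, pos in enumerate(line):
        looped[i, j] = np.linalg.norm(pos - pos_goal)

# Broadcast version: (5, 4, 2) - (2,) -> (5, 4, 2), then reduce the last axis.
vectorized = np.linalg.norm(l_arr - pos_goal, axis=-1)

same = np.allclose(looped, vectorized)
```

Broadcasting works here because the trailing dimension of `l_arr` (size 2) matches the shape of `pos_goal`, so the goal is subtracted from every grid cell without an explicit loop.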

Author: Mederic Fourmy

Updated on June 14, 2022

Comments

  • Mederic Fourmy almost 2 years

    I am trying to compute a "distance matrix" from a grid of positions to a goal position using numpy. To put it more clearly, I have an array, l_arr, representing positions in a 2-D grid:

    array([[[0, 0],
            [1, 0],
            [2, 0],
            [3, 0]],

           [[0, 1],
            [1, 1],
            [2, 1],
            [3, 1]],

           [[0, 2],
            [1, 2],
            [2, 2],
            [3, 2]],

           [[0, 3],
            [1, 3],
            [2, 3],
            [3, 3]],

           [[0, 4],
            [1, 4],
            [2, 4],
            [3, 4]]])
    

    I can loop over the positions and compute the norm of the difference between the goal position and each entry of the position array like this:

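    import numpy as np

    # l_arr is the (5, 4, 2) position array shown above
    pos_goal = np.array([1, 2])
    dist_matrix = np.zeros(l_arr.shape[:2])
    for i, line in enumerate(l_arr):
        for j, pos in enumerate(line):
            # Euclidean distance from this grid position to the goal
            dist_matrix[i, j] = np.linalg.norm(pos - pos_goal)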
    
    dist_matrix
    

    Result:

    array([[ 2.23606798,  2.        ,  2.23606798,  2.82842712],
           [ 1.41421356,  1.        ,  1.41421356,  2.23606798],
           [ 1.        ,  0.        ,  1.        ,  2.        ],
           [ 1.41421356,  1.        ,  1.41421356,  2.23606798],
           [ 2.23606798,  2.        ,  2.23606798,  2.82842712]])
    

    Isn't there a better way to do this (without the 2 loops)?

    • Zamrony P. Juhara over 6 years
      Distance calculation is a vector operation, so, AFAIK, there is no matrix operation that would calculate distance.