Finding the distance between a set of points using scipy.spatial.distance.cdist(X, Y) in Python

Two solutions:

Calculate the complete matrix directly, then access the q-th column for the distances between every point in A and B[q].

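import scipy.spatial.distance

# d[i, q] is the distance between A[i] and B[q]
d = scipy.spatial.distance.cdist(A, B)

for q in range(len(B)):
    y = d[:, q]  # distances from every point in A to B[q]
    print(y)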
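
As a quick sanity check of the column layout, here is a minimal sketch on made-up data (the small arrays `A` and `B` below are hypothetical stand-ins, not the asker's data). It assumes the default Euclidean metric of `cdist` and compares one column against a distance computed by hand with NumPy.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical toy data standing in for the real A and B
A = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
B = np.array([[0.0, 0.0], [3.0, 0.0]])

d = cdist(A, B)  # Euclidean metric by default
print(d.shape)   # one row per point in A, one column per point in B

q = 1
col = d[:, q]    # distances from every point in A to B[q]

# The same distances computed by hand, for comparison
manual = np.sqrt(((A - B[q]) ** 2).sum(axis=1))
print(col)
print(np.allclose(col, manual))
```

The rows of `d` follow A and the columns follow B, which is why the q-th column is the slice of interest here.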

If the resulting matrix is too big to hold in memory, you could compute one column at a time instead:

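import scipy.spatial.distance

for q in range(len(B)):
    # wrap B[q] in a list so cdist sees a 2-D input; y has shape (len(A), 1)
    y = scipy.spatial.distance.cdist(A, [B[q]])
    print(y)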
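
A middle ground between the two solutions is to process B in blocks of rows, so that only a slice of the matrix exists at any one time. The sketch below is one possible way to do that, not part of the original answer: the helper `column_blocks` and its `block_size` parameter are hypothetical names, and the random arrays only mimic the sizes mentioned in the question. The full matrix is built here purely to check the blocks against it.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Random stand-ins with the sizes mentioned in the question
rng = np.random.default_rng(0)
A = rng.random((400, 2))
B = rng.random((200, 2))

def column_blocks(A, B, block_size=50):
    """Yield (start, block), where block is cdist(A, B[start:start + block_size])."""
    for start in range(0, len(B), block_size):
        yield start, cdist(A, B[start:start + block_size])

# Built only for verification; skip this when memory is the concern
full = cdist(A, B)

blocks_match = all(
    np.allclose(block, full[:, start:start + block.shape[1]])
    for start, block in column_blocks(A, B)
)
print(blocks_match)

# A single row of B gives a (len(A), 1) array; ravel() flattens it to 1-D
single = cdist(A, [B[0]])
print(single.shape, single.ravel().shape)
```

Calling `cdist` once per row of B makes `len(B)` separate calls, so the per-row loop likely carries more Python overhead than a single call on the whole array; larger blocks trade memory for fewer calls.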
Author: user3287841

Updated on November 25, 2022

Comments

  • user3287841 over 1 year

    I have an array of data, called A that looks something like:

    array([[0.59, 1.23], [0.89, 1.67], [0.21,0.99]...])
    

    and has about 400 [x, y] points in it. I want to find the distance from every point in A to every point in B, which is another array that looks exactly like A but is half the length (so about 200 [x, y] points). To find the distance between the q-th [x, y] pair in B and all [x, y] values in A, I've tried something along the lines of

    import scipy.spatial.distance
    for q in range(0,len(B)):
        y=scipy.spatial.distance.cdist(A,B[:q,:])
    

    but I don't think this is working. I just want an output that shows the distance between the q-th row of B against all points in A.

    • M4rtini about 10 years
      Is the resulting matrix too big if you calculate cdist(A,B) and then take y[:,q] for the distances to the q-th item of B?
    • user3287841 about 10 years
      That's perfect, thanks! If you want to post it as an official answer, then I can mark the question as answered :)
  • Farid Alijani about 4 years
    Which of the two solutions has less computation time?