K-d trees: nearest neighbor search algorithm


Solution 1

Step 4 is the 'else' branch of step 3: it is what you do when the splitting plane is closer to the query point than the current best point is. The fact that the point you found lies in the same rectangle as the query point does not mean that it is the closest one.

Imagine the following scenario: your kD-Tree holds two points, A and B. A is in the middle of its rectangle, while B sits just over the edge, in the partition next to A's. Now search for the nearest neighbour of a point C that is right next to B but falls on the other side of the edge, inside A's partition. The first candidate you pick is A, because the initial depth-first descent selects whatever lies in the same partition as the search point. B is actually closer, though, so even after choosing A you still have to check B's partition; otherwise your kD-Tree will not return correct results.

A good way of visualising this is to draw it out:

A-------------C--|--B

A is the first point found by the DFS, C is the point whose nearest neighbour we want, B is the actual nearest neighbour, and | is the split plane.
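To put numbers on the picture, here is a small sketch in Python. The coordinates are hypothetical, roughly read off the diagram by counting characters; only the comparison between the distance to the plane and the distance to the current best matters.

```python
# Hypothetical 1-D coordinates, roughly read off the diagram (assumed values).
A, C, SPLIT, B = 0.0, 14.0, 17.0, 20.0

best = "A"
best_dist = abs(C - A)        # distance to the first candidate found by the descent
plane_dist = abs(C - SPLIT)   # distance from the query point to the split plane

# The far side can only hold a closer point if the plane itself is
# nearer to the query than the current best candidate is.
must_check_other_side = plane_dist < best_dist

if must_check_other_side and abs(C - B) < best_dist:
    best, best_dist = "B", abs(C - B)

print(best, best_dist)
```

With these values the plane is 3 units from C while A is 14 units away, so the other side has to be searched, and doing so replaces A with B.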

Another way to think of it is to draw a circle of radius dist(A,C) around point C. If any part of another rectangle falls within this circle, that rectangle may hold a point closer to C than A is, so it must be checked. If you then find B, you can shrink the circle (because B is closer) so that fewer rectangles can intersect it. Once you have checked every rectangle that intersects the circle, shrinking the radius as you find closer neighbours, you can definitively say that there are no closer points.
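The shrinking-circle idea can be sketched as a short recursive search. This is a minimal illustration, not the C++ implementation mentioned in Solution 2; the names `Node`, `build` and `nearest` are made up for this example, and it assumes a tree that stores one point per node and cycles through the axes by depth.

```python
import math
from typing import NamedTuple, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class Node(NamedTuple):
    point: Point
    axis: int                  # coordinate this node splits on
    left: Optional["Node"]
    right: Optional["Node"]


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def nearest(node: Optional[Node], query: Point, best=None):
    """Return (point, distance) for the stored point closest to query."""
    if node is None:
        return best
    d = math.dist(node.point, query)
    if best is None or d < best[1]:
        best = (node.point, d)
    # Signed distance from the query to this node's splitting plane.
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    # Descend first into the side that contains the query.
    best = nearest(near, query, best)
    # Visit the other side only if the circle of radius best[1]
    # around the query crosses the splitting plane.
    if abs(diff) < best[1]:
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
found, dist = nearest(tree, (9, 2))
```

The second recursive call is the step the question asks about: without it the search returns whatever the first descent found, which is correct only when the circle stays inside the query's own partition.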

Solution 2

I wrote a basic C++ implementation on GitHub. It has both an iterative and a recursive version.

Author by

Kaiser Octavius

Updated on June 19, 2022

Comments

  • Kaiser Octavius
    Kaiser Octavius almost 2 years

    This is my understanding of it:
    1. Recurse down the tree, taking the left or right subtree according to whether ELEMENT would lie in the left or the right subtree, if it existed.
    2. Set CURRENT_BEST as the first leaf node that you reach.
    3. As you recurse back up, check to see whether ELEMENT lies closer to the splitting hyperplane than it does to CURRENT_BEST. If so, set CURRENT_BEST as the current node.

    This is the part I got from Wikipedia and my class, and the part I don't understand:
    4. Check to see whether any node in the other subtree of the splitting point singled out in 3. is closer to ELEMENT than the splitting point.

    I don't see why we need to do 4., since any point that might lie in the one subtree of the splitting node must necessarily be closer to the splitting node than to any point in the other subtree.

    It's obviously my understanding of the algorithm that is flawed, so help will be greatly appreciated.

  • Jernej Jerin
    Jernej Jerin about 11 years
    This algorithm describes constructing a k-d tree, not searching for the nearest neighbor.