How do I traverse a KDTree to find k nearest neighbors?

10,080

Solution 1

You can maintain a max heap of size k (k is the count of nearest neighbors which we wanted to find).

Start from the root node and insert the distance value in the max heap node. Keep on searching in k-d tree using dimensional splitting , criteria and keep updating Max Heap tree.

https://gopalcdas.wordpress.com/2017/05/24/construction-of-k-d-tree-and-using-it-for-nearest-neighbour-search/

~Ashish

Solution 2

Adding to @Ashish's answer, you can use a max-heap in the following manner:

1) Build a max-heap of the first k elements (arr[0] to arr[k-1]) of the given array. 

This step is O(k). Then

2) For each element, after the kth element (arr[k] to arr[n-1]), compare it with 
   root of the max-heap.
    a) If the element is smaller than the root then make it root 
       and call heapify for max-heap.
    b) Else ignore it.

The step 2 is O((n-k)*log(k)).

3) Finally, the max-heap has k smallest elements and root of the heap 
   is the kth smallest element.

Time Complexity: O(k + (n-k)*log(k)) without sorted output. If sorted output is needed then O(k + (n-k)*log(k) + k*log(k)).

Share:
10,080
user2647513
Author by

user2647513

Updated on June 08, 2022

Comments

  • user2647513
    user2647513 almost 2 years

    This question concerns the implementation of KNN searching of KDTrees. Traversal of a KDTree to find a single best match (nearest neighbor) is straightforward, akin to a modified binary search.

    How is the traversal modified to exhaustively and efficiently find k-best matches (KNN)?

    Edit for clarification: After finding the nearest node M to the input query I, how does the traversal algorithm continue to find the remaining K-1 closest matches to the query? Is there a traversal pattern which guarantees that nodes are visited in order of best to worst match to the query?