How to find k nearest neighbors to the median of n distinct numbers in O(n) time?

22,203

Solution 1

Actually, the answer is pretty simple. All we need to do is to select k elements with the smallest absolute differences from the median moving from m-1 to 0 and m+1 to n-1 when the median is at index m. We select the elements using the same idea we use in merging 2 sorted arrays.

Solution 2

No one seems to quite have this. Here's how to do it. First, find the median as described above. This is O(n). Now park the median at the end of the array, and subtract the median from every other element. Now find element k of the array (not including the last element), using the quick select algorithm again. This not only finds element k (in order), it also leaves the array so that the lowest k numbers are at the beginning of the array. These are the k closest to the median, once you add the median back in.

Solution 3

med=Select(A,1,n,n/2)   //finds the median

for i=1 to n
   B[i]=mod(A[i]-med)

q=Select(B,1,n,k) //get the kth smallest difference

j=0
for i=1 to n
   if B[i]<=q 
     C[j]=A[i] //A[i], the real value should be assigned instead of B[i] which is only the difference between A[i] and median.
       j++
return C

Solution 4

You can solve your problem like that:

You can find the median in O(n), w.g. using the O(n) nth_element algorithm.

You loop through all elements substutiting each with a pair:

the absolute difference to the median, element's value. 

Once more you do nth_element with n = k. after applying this algorithm you are guaranteed to have the k smallest elements in absolute difference first in the new array. You take their indices and DONE!

Solution 5

Four Steps:

  1. Use Median of medians to locate the median of the array - O(n)
  2. Determine the absolute difference between the median and each element in the array and store them in a new array - O(n)
  3. Use Quickselect or Introselect to pick k smallest elements out of the new array - O(k*n)
  4. Retrieve the k nearest neighbours by indexing the original array - O(k)

When k is small enough, the overall time complexity becomes O(n).

Share:
22,203
Anonymous
Author by

Anonymous

Updated on August 05, 2022

Comments

  • Anonymous
    Anonymous over 1 year

    I can use the median of medians selection algorithm to find the median in O(n). Also, I know that after the algorithm is done, all the elements to the left of the median are less that the median and all the elements to the right are greater than the median. But how do I find the k nearest neighbors to the median in O(n) time?

    If the median is n, the numbers to the left are less than n and the numbers to the right are greater than n. However, the array is not sorted in the left or the right sides. The numbers are any set of distinct numbers given by the user.

    The problem is from Introduction to Algorithms by Cormen, problem 9.3-7

  • Kirk Broadhurst
    Kirk Broadhurst over 14 years
    While true that O(k) is O(1) when k is constant, if k -> n then this becomes O(n^2). Also, how do you know k is constant? If it is, then can't n also be considered constant?
  • Admin
    Admin over 14 years
    This doesn't work. The median of median algorithms DOESNT sort the items. To do so would take O(n log n), whereas median-of-medians works on O(n).
  • glasnt
    glasnt over 14 years
    Ah, apologies. I read the original question at version 2, where he added that he already had it sorted in order.
  • Windows programmer
    Windows programmer over 14 years
    "choose k elements by reading the first k non empty values of the arrays" -- in order to do that, the arrays have to be sorted. Sorting those arrays takes time O(n log n).
  • outis
    outis over 14 years
    @Windows programmer: only if you're doing a comparison based sort.
  • S..K
    S..K over 12 years
    But how do we select them in O(n) considering that the elements are not sorted based on their absolute difference from the median ?
  • meteorgan
    meteorgan about 12 years
    as the values in array B can be equal, you should make sure j not greater than k. As the same time, if you describes your answer in text, others may understand you better.
  • SubParProgrammer
    SubParProgrammer about 10 years
    This is the same as @HalPri's answer, which was posted a year before yours.
  • iLoveCamelCase
    iLoveCamelCase over 8 years
    You should be taking moduli of numbers before find kth order statistic I guess
  • Zvika
    Zvika over 6 years
    This is better than @HalPri's answer - @Shivendra is using absoulte difference, which fixes the problem I pointed out in my comment to @HalPri's answer
  • Encipher
    Encipher about 2 years
    if I take an example of an unsorted array{5,7,3,1,9}. So the median will be 5 and median of median {7} or {1}? The link you have shared for` Quickselect` it is talking about the quicksort. There are two algorithms. Which one is for Quickselect? At step 4 you were saying by indexing the original array. Can you please explain it little bit?
  • Encipher
    Encipher about 2 years
    I have question how can I find out median? Are you referring 9.3 Selection in worst-case linear time algorithm of Cormen book? I also did not understand Once more you do nth_element with n = k. Can you please give a real time example like an array {5,7,3,1,9}. Here median is 3. So nearest neighbor is 7 and 1? Which one I need to find out here?
  • Jimmy Zhao
    Jimmy Zhao about 2 years
    @Encipher Median of median is a median-finding algorithm and we do not have the concept of medians of median for an array. Quickselect is incorporated in Quicksort's implementation but it is a separate algorithm that retrieves the kth smallest/largest number in a list. You can use Introselect as well since it has better worst-case performance. When you are making the new array, you do not change the indices of the elements. At step 4, you can resort to the results of step 3 to retrieve the k-nearest neighbours by indexing the array.