O(klogk) time algorithm to find kth smallest element from a binary heap
Solution 1
Well, your intuition was right that we need extra data structure to achieve O(klogk) because if we simply perform operations on the original heap, the term logn will remain in the resulting complexity.
Guessing from the targeted complexity O(klogk), I feel like creating and maintaining a heap of size k to help me achieve the goal. As you may be aware, building a heap of size k in top-down fashion takes O(klogk), which really reminds me of our goal.
The following is my try (not necessarily elegant or efficient) in an attempt to attain O(klogk):
We create a new min heap, initializing its root to be the root of the original heap.
We update the new min heap by deleting the current root and inserting the two children of the current root in the original heap. We repeat this process k times.
The resulting heap will consist of k nodes, the root of which is the kth smallest element in the original heap.
Notes: Nodes in the new heap should store indexes of their corresponding nodes in the original heap, rather than the node values themselves. In each iteration of step 2, we really add a net of one more node into the new heap (one deleted, two inserted), k iterations of which will result in our new heap of size k. During the ith iteration, the node to be deleted is the ith smallest element in the original heap.
Time Complexity: in each iteration, it takes O(3logk) time to delete one element from and insert two into the new heap. After k iterations, it is O(3klogk) = O(klogk).
Hope this solution inspires you a bit.
Solution 2
Assuming that we're using a minheap, so that a root node is always smaller than its children nodes.
Create a sorted list toVisit, which contains the nodes which we will traverse next. This is initially just the root node.
Create an array smallestNodes. Initially this is empty.
While length of smallestNodes < k:
Remove the smallest Node from toVisit
add that node to smallestNodes
add that node's children to toVisit
When you're done, the kth smallest node is in smallestNodes[k-1].
Depending on the implementation of toVisit, you can get insertion in log(k) time and removal in constant time (since you're only removing the topmost node). That makes O(k*log(k)) total.
Admin
Updated on February 03, 2020Comments
-
Admin over 4 years
We have an n-node binary heap which contains
n
distinct items (smallest item at the root). For ak<=n
, find aO(klogk)
time algorithm to selectkth
smallest element from the heap.O(klogn)
is obvious, but couldn't figure out aO(klogk)
one. Maybe we can use a second heap, not sure. -
Jim Mischel over 12 yearsInsertion isn't
log(k)
, but ratherlog(n)
, wheren
is the number of nodes already in the heap. Insertingk
nodes will bek*log(n)
. -
amit over 12 years@JimMischel: no, in
toVisit
there are no more then2k
nodes at any point [since we add 2 elements for each element we remove, and we do itk
times], so the insertion and deletion fromtoVisit
isO(log2k) = O(logk)
. for each operation on the original list, we just extract the direct children of a specific node, which isO(1)
. we overall dok
timesO(logk)
ops, which is indeedO(klogk)
. -
amit over 12 yearsthough a
sorted list
is not a good data structure fortoVisit
, since insertion isO(k)
in this list. You will need a heap to actually obtainO(klogk)
[skip list/ balanced BST/B+ tree are also valid options, though harder to implement, heap will be enough here]. -
Jim Mischel over 12 years@amit: Thank you. I misunderstood the description of the algorithm.
-
amit over 12 yearsThis is basically @Kevin 's solution
-
Vikas Mangal about 9 years@Terry Li: Instead of creating a new min heap, if we create a new max heap of size k and keep on inserting the elements from original min heap to new max heap. When the max heap is full, its root will have the kth smallest element and that heap will have the smallest k elements. Is my thinking right ?
-
madCode almost 9 yearsInstead of a sorted list, can't you just use a Queue and add to the queue smallest-largest children nodes to visit?
-
jiangok over 8 yearsIsn't deleting current root O(lgn) instead of O(lgk) because heapifying the original min heap is required after the deletion. So the overall complexity becomes kO(lgn) instead of kO(lgk)?
-
sharmakeshav about 7 years@VikasMangal No, that wouldn't work in klogk. That would again be a klogn algorithm because you will have to heapify the original min heap k times.
-
sharmakeshav about 7 years@jiangok There's no need to modify the original heap. You just read elements from the original heap and then modify your newly created heap. The new heap will be of max size k, thus it'll take O(logk) to heapify it.
-
user5054 over 6 years@TerryLi So, the new heap will be composed of pointers to the original heap nodes rather than the actual nodes? So, heapifying code for the new heap will be a little different?
-
Chaos almost 3 years@madCode Consider the min-heap
[1, 2, 100, 3, 4, 101, 102]
. Find 4th smallest value (which will be the value 4). Using a queue and the smallest - largest children approach, we get[1, 2, 100, 3, 4]
. This doesn't give us the right 4th smallest value. -
Jet C. over 2 yearsTo find kth smallest element, isn't it should be a new max heap instead of a min heap?
-
jacob12 almost 2 yearsAfter taking the current node's children to put in the auxiliary heap, this means that one of those two nodes must be the next minimum. But note that you have more nodes in the same level that we overpass and might even be more minimal. So we need to search the minimal node in the
i
-th level. So why we look only at the two children of a node when we have to look at nodes at the whole level and pick the minimal from there?