Find kth smallest element in a binary search tree in Optimum way

131,039

Solution 1

Here's just an outline of the idea:

In a BST, the left subtree of node T contains only elements smaller than the value stored in T. If k is smaller than the number of elements in the left subtree, the kth smallest element must belong to the left subtree. Otherwise, if k is larger, then the kth smallest element is in the right subtree.

We can augment the BST to have each node in it store the number of elements in its left subtree (assume that the left subtree of a given node includes that node). With this piece of information, it is simple to traverse the tree by repeatedly asking for the number of elements in the left subtree, to decide whether to do recurse into the left or right subtree.

Now, suppose we are at node T:

  1. If k == num_elements(left subtree of T), then the answer we're looking for is the value in node T.
  2. If k > num_elements(left subtree of T), then obviously we can ignore the left subtree, because those elements will also be smaller than the kth smallest. So, we reduce the problem to finding the k - num_elements(left subtree of T) smallest element of the right subtree.
  3. If k < num_elements(left subtree of T), then the kth smallest is somewhere in the left subtree, so we reduce the problem to finding the kth smallest element in the left subtree.

Complexity analysis:

This takes O(depth of node) time, which is O(log n) in the worst case on a balanced BST, or O(log n) on average for a random BST.

A BST requires O(n) storage, and it takes another O(n) to store the information about the number of elements. All BST operations take O(depth of node) time, and it takes O(depth of node) extra time to maintain the "number of elements" information for insertion, deletion or rotation of nodes. Therefore, storing information about the number of elements in the left subtree keeps the space and time complexity of a BST.

Solution 2

A simpler solution would be to do an inorder traversal and keep track of the element currently to be printed (without printing it). When we reach k, print the element and skip rest of tree traversal.

void findK(Node* p, int* k) {
  if(!p || k < 0) return;
  findK(p->left, k);
  --k;
  if(k == 0) { 
    print p->data;
    return;  
  } 
  findK(p->right, k); 
}

Solution 3

public int ReturnKthSmallestElement1(int k)
    {
        Node node = Root;

        int count = k;

        int sizeOfLeftSubtree = 0;

        while(node != null)
        {

            sizeOfLeftSubtree = node.SizeOfLeftSubtree();

            if (sizeOfLeftSubtree + 1 == count)
                return node.Value;
            else if (sizeOfLeftSubtree < count)
            {
                node = node.Right;
                count -= sizeOfLeftSubtree+1;
            }
            else
            {
                node = node.Left;
            }
        }

        return -1;
    }

this is my implementation in C# based on the algorithm above just thought I'd post it so people can understand better it works for me

thank you IVlad

Solution 4

//add a java version without recursion

public static <T> void find(TreeNode<T> node, int num){
    Stack<TreeNode<T>> stack = new Stack<TreeNode<T>>();

    TreeNode<T> current = node;
    int tmp = num;

    while(stack.size() > 0 || current!=null){
        if(current!= null){
            stack.add(current);
            current = current.getLeft();
        }else{
            current = stack.pop();
            tmp--;

            if(tmp == 0){
                System.out.println(current.getValue());
                return;
            }

            current = current.getRight();
        }
    }
}

Solution 5

A simpler solution would be to do an inorder traversal and keep track of the element currently to be printed with a counter k. When we reach k, print the element. The runtime is O(n). Remember the function return type can not be void, it has to return its updated value of k after each recursive call. A better solution to this would be an augmented BST with a sorted position value at each node.

public static int kthSmallest (Node pivot, int k){
    if(pivot == null )
        return k;   
    k = kthSmallest(pivot.left, k);
    k--;
    if(k == 0){
        System.out.println(pivot.value);
    }
    k = kthSmallest(pivot.right, k);
    return k;
}
Share:
131,039

Related videos on Youtube

bragboy
Author by

bragboy

I am a seriously passionate non-stop learner. I am an entrepreneur, love experimenting, and exploring new things. I blog here and occasionally here as well

Updated on July 08, 2022

Comments

  • bragboy
    bragboy almost 2 years

    I need to find the kth smallest element in the binary search tree without using any static/global variable. How to achieve it efficiently? The solution that I have in my mind is doing the operation in O(n), the worst case since I am planning to do an inorder traversal of the entire tree. But deep down I feel that I am not using the BST property here. Is my assumptive solution correct or is there a better one available ?

    • kennytm
      kennytm about 14 years
      Is the tree balanced?
    • bragboy
      bragboy about 14 years
      Its not. But if it were balanced, is there an optimum way?
    • RAL
      RAL about 14 years
      If you do a search on "Order Statistics" you will find what you need.
    • Henley
      Henley almost 11 years
      I sort of feel most of the answers below, while correct are cheating in that they are using a global variable of some sort (whether it's a reference to an integer, or a variable that gets decremented and returned). If absolutely none of those are allowed, I would use recursion without any references being passed in.
  • bragboy
    bragboy about 14 years
    @Anon. Thanks. But that is something that I am already aware of which is O(n) complex. My question is whether it has a better solution
  • kennytm
    kennytm about 14 years
    Meh, better call it O(k). I can also say OP's solution O(1) for constant values of N.
  • IVlad
    IVlad about 14 years
    Saying that's "O(1) for constant values of k" is kind of cheating. I can say any algorithm is O(1) like that...
  • Anon.
    Anon. about 14 years
    This is O(1) for constant values of k, regardless of the size of the tree. If k varies, it's O(k), but still independent of the tree size.
  • bragboy
    bragboy about 14 years
    Well, yeah we should not be carried away what value is inside big O. It merely signifies the 'rate' at which the function grows.
  • IVlad
    IVlad about 14 years
    Why consider k a constant anyway? If it's a constant you can just precompute the result and of course it's O(1). And if k varies, it varies between 1 and the number of nodes in the tree, so it's not really independent of the tree size, therefore it's O(N).
  • Jerry Coffin
    Jerry Coffin about 14 years
    To find the Nth smallest item, you only need to store the size of the left sub-tree. You'd use the size of the right sub-tree iif you also wanted to be able to find the Nth largest item. Actually, you can make that less expensive though: store the total size of the tree in the root, and the size of the left sub-tree. When you need to size of the right sub-tree, you can subtract the size of the left from the total size.
  • Anon.
    Anon. about 14 years
    @|\/|ad: You will notice that even if k is constant, the size of the dataset is likely not.
  • IVlad
    IVlad about 14 years
    @Jerry Coffin: true, nice observation :).
  • RAL
    RAL about 14 years
    please...how could it be independent of the tree size? Try finding the smallest element... is that O(1)? So how could finding the k'th smallest possibly be O(1)?
  • IVlad
    IVlad about 14 years
    @Anon: Yes, which makes it O(N) to precompute the result and O(1) to answer a query using an inorder traversal. Anyway, my point is that assuming k constant is making an unfounded assumption, as the question does not state that.
  • Admin
    Admin about 14 years
    +1 to Robert's comment. Even finding min (k=1) is O(log n) in balanced tree. (and -1 to answer).
  • Para
    Para almost 14 years
    I don't understand something. You say "k>T". k is a position like first smallest, 2nd smallest and so on. T is the value of a node right? Why compare them? And if T is also the order of a node(this is not clear to me sorry) how do I find it?
  • IVlad
    IVlad almost 14 years
    @Para - I used T to mean the number of elements in node T's left subtree (this is a bit unclear, yes, I'll edit). You find it when building the tree: once you insert a node (if you do it recursively it's easier), after you return from the recursive call on the left subtree, increment the count in the current node. Let me know if it's still not clear.
  • Daniel
    Daniel almost 14 years
    Such an augmented BST is called an 'order statistics tree'.
  • understack
    understack over 13 years
    @Ivlad: in step 2: I think "k - num_elements" should be "k - num_elements -1", since you've to include root element too.
  • IVlad
    IVlad over 13 years
    @understack - not if you assume the root to be part of the subtree.
  • Anil Vishnoi
    Anil Vishnoi over 13 years
    i don't think this solution will work. What if the Kth smallest is in the right sub tree of the tree node ?
  • Anil Vishnoi
    Anil Vishnoi over 13 years
    k == num_elements(left subtree of T) .. this should be k == num_elements(left subtree of T) +1.
  • Robert S. Barnes
    Robert S. Barnes over 13 years
    If the tree doesn't contain a field containing the "number of elements in its left and right subtree" then the method will end up being BigO( n ) as you will need to walk the right or left subtree at each node in order to calculate the k index of the current node.
  • bragboy
    bragboy over 13 years
    this is not an optimum solution
  • kgadek
    kgadek over 12 years
    -1 because the answer is misleading (it's NOT O(1), it's O(N) for BST and O(log N) for balanced tree).
  • TheOne
    TheOne almost 12 years
    @RobertS.Barnes, so that would make an inorder traversal have the same order complexity? I.e., traverse inorder until you reach k.
  • Woot4Moo
    Woot4Moo over 11 years
    No metrics provided as to why this is optimal. In both large and small cases
  • Ren
    Ren over 11 years
    Please always accompany code with some level of description as to what it does and how it helps solve the issue.
  • Henley
    Henley almost 11 years
    how about the java version?
  • Henley
    Henley almost 11 years
    I like this solution and the corresponding recursive one. Honestly, most of the answers to this question are too confusing/complex to read.
  • Nitin Garg
    Nitin Garg about 10 years
    What about the case with duplicate values? Question din't say you can assume uniqueness.
  • Arun
    Arun about 10 years
    +1: The idea is in the right direction, but some loose ends may need to be tightened; see stackoverflow.com/a/23069077/278326
  • nevets1219
    nevets1219 almost 10 years
    @Nitin Garg: a BST by definition doesn't have duplicate values -see en.wikipedia.org/wiki/Binary_search_tree
  • Valentin Shergin
    Valentin Shergin over 9 years
    Time complexity for this solution is O(k + d), where d is max depth of the tree. Therefore it uses a global variable counter but it's illegal for this question.
  • Andy897
    Andy897 over 9 years
    Hi Arun, can you please explain with an example. I am not understand this particularly your first point.
  • Merlin
    Merlin almost 9 years
    I like this solution, since BST is already ordered, a traversal should be enough.
  • Alan Tam
    Alan Tam almost 9 years
    @RobertS.Barnes If your tree does not store "the number of elements" info on the nodes, then you have to traverse the entire left subtree to check the order, so time can be worse than O(n), indeed Θ(n^2) in the worst case (on a very left-skewed tree). But of course this algorithm is only useful when you have that info (the sizes), and it only takes O(n) extra memory and O(1) extra time to maintain them, so a good deal if you need to do many rank queries on your tree.
  • Spark8006
    Spark8006 almost 9 years
    If n is close to the total number of nodes in this tree, your algorithm will take O(n) time to finish, which is bad for the selected answer-O(log n)
  • harishvc
    harishvc over 8 years
    @IVlad amazing algorithm! You got me thinking. Python3 solution posted at https://github.com/harishvc/challenges/blob/master/binary-se‌​arch-tree-smallestK.‌​py
  • Rugal
    Rugal over 8 years
    I love this solution! Clear and great!
  • Jorge P.
    Jorge P. over 7 years
    This solution is traversing the tree 'in-order' and decreasing a counter after visiting the node, to later stop when the counter gets equals to zero. The worst case is then of order O(n). Not the most optimal comparing with @IVlad's recursive solutions whose worst case takes O(log n)
  • zach
    zach over 7 years
    I guess your solution is better in terms of space complexity, compared with the augmented BST.
  • FrontSide
    FrontSide over 5 years
    @AlanTam could you explain how it would ever be worse than O(n)? We have an ordered tree, so we will at a maximum have to look at each node in the tree once to find the kth smallest element which is O(n).
  • Alan Tam
    Alan Tam over 5 years
    @FrontSide nothing is worse than O(n). The time required to locate a node using this algorithm is merely O(depth of node), which is merely O(log n) as long as the tree is properly balanced.
  • FrontSide
    FrontSide over 5 years
    @AlanTam That's what I thought. Thanks for clarifying. I must have misinterpreted your comment above.
  • Vineeth Chitteti
    Vineeth Chitteti over 5 years
    The search doesn't stop even after the k-th smallest element is found.