Data structure for handling intervals

11,308

Solution 1

It sounds like you could just use a balanced binary tree of all the boundary times.

For example, represent {(1,4), (8,10), (12,15)} as a tree containing 1, 4, 8, 10, 12, and 15.

Each node needs to say whether it's the start or end of an interval. So:

                          8 (start)
                         /        \
                1 (start)         12 (start)
                      \             /      \
                     4 (end)   10 (end)   15 (end)

(Here all the "end" nodes ended up at the bottom by coincidence.)

Then I think you can have all your operations in O(log n) time. To add an interval:

  • Find the start time. If it's already in the tree as a start time, you can leave it there. If it's already in the tree as an end time, you'll want to remove it. If it's not in the tree and it doesn't fall during an existing interval, you'll want to add it. Otherwise you don't want to add it.

  • Find the stop time, using the same method to find out if you need to add it, remove it, or neither.

  • Now you just want to add or remove the abovementioned start and stop nodes and, at the same time, delete all the existing nodes in between. To do this you only need to rebuild the tree nodes at or directly above those two places in the tree. If the height of the tree is O(log n), which you can guarantee by using a balanced tree, this takes O(log n) time.

(Disclaimer: If you're in C++ and doing explicit memory management, you might end up freeing more than O(log n) pieces of memory as you do this, but really the time it takes to free a node should be billed to whoever added it, I think.)

Removing an interval is largely the same.

Checking a point or interval is straightforward.

Finding the first gap of at least a given size after a given time can be done in O(log n) too, if you also cache two more pieces of information per node:

  • In each start node (other than the leftmost), the size of the gap immediately to the left.

  • In every node, the size of the largest gap that appears in that subtree.

To find the first gap of a given size that appears after a given time, first find that time in the tree. Then walk up until you reach a node that claims to contain a large enough gap. If you came up from the right, you know this gap is to the left, so you ignore it and keep walking up. Otherwise you came from the left. If the node is a start node, check to see if the gap to its left is large enough. If so, you're done. Otherwise, the large-enough gap must be somewhere to the right. Walk down to the right and continue down until you find the gap. Again, because the height of the tree is O(log n), walking it three times (down, up, and possibly down again) is O(log n).

Solution 2

Without knowing anymore specifics, I'd suggest reading about Interval Trees. Interval trees are a special 1 dimensional case of more generic kd-trees, and have a O(n log n) construction time, and O(log n) typical operation times. Exact algorithm implementations you'd need to find yourself, but you can start by looking at CGAL.

Solution 3

I know you've already accepted an answer, but since you indicated that you will probably be implementing in C++, you could also have a look at Boosts Interval Container Library (http://www.boost.org/doc/libs/1_46_1/libs/icl/doc/html/index.html).

Solution 4

My interval tree implementation with AVL tree.

public class IntervalTreeAVL<T>{
    private static class TreeNode<T>{
        private T low;
        private T high;
        private TreeNode<T> left;
        private TreeNode<T> right;
        private T max;
        private int height;
        private TreeNode(T l, T h){
            this.low=l;
            this.high=h;
            this.max=high;
            this.height=1;
        }
    }
    private TreeNode<T> root;
    public void insert(T l, T h){
        root=insert(root, l, h);
    }
    private TreeNode<T> insert(TreeNode<T> node, T l, T h){
        if(node==null){
            return new TreeNode<T>(l, h);
        }
        else{
            int k=((Comparable)node.low).compareTo(l);
            if(k>0){
                node.left=insert(node.left, l, h);
            }
            else{
                node.right=insert(node.right, l, h);
            }
            node.height=Math.max(height(node.left), height(node.right))+1;
            node.max=findMax(node);
            int hd = heightDiff(node);
            if(hd<-1){
                int kk=heightDiff(node.right);
                if(kk>0){
                    node.right=rightRotate(node.right);
                    return leftRotate(node);
                }
                else{
                    return leftRotate(node);
                }
            }
            else if(hd>1){
                if(heightDiff(node.left)<0){
                    node.left = leftRotate(node.left);
                    return rightRotate(node);
                }
                else{
                    return rightRotate(node);
                } 
            }
            else;
        }
        return node;
    }
    private TreeNode<T> leftRotate(TreeNode<T> n){
        TreeNode<T> r =  n.right;
        n.right = r.left;
        r.left=n;
        n.height=Math.max(height(n.left), height(n.right))+1;
        r.height=Math.max(height(r.left), height(r.right))+1;
        n.max=findMax(n);
        r.max=findMax(r);
        return r;
    }
    private TreeNode<T> rightRotate(TreeNode<T> n){
        TreeNode<T> r =  n.left;
        n.left = r.right;
        r.right=n;
        n.height=Math.max(height(n.left), height(n.right))+1;
        r.height=Math.max(height(r.left), height(r.right))+1;
        n.max=findMax(n);
        r.max=findMax(r);
        return r;
    }
    private int heightDiff(TreeNode<T> a){
        if(a==null){
            return 0;
        }
        return height(a.left)-height(a.right);
    }
    private int height(TreeNode<T> a){
        if(a==null){
            return 0;
        }
        return a.height;
    }
    private T findMax(TreeNode<T> n){
        if(n.left==null && n.right==null){
            return n.max;
        }
        if(n.left==null){
            if(((Comparable)n.right.max).compareTo(n.max)>0){
                return n.right.max;
            }
            else{
                return n.max;
            }
        }
        if(n.right==null){
           if(((Comparable)n.left.max).compareTo(n.max)>0){
                return n.left.max;
            }
            else{
                return n.max;
            } 
        }
        Comparable c1 = (Comparable)n.left.max;
        Comparable c2 = (Comparable)n.right.max;
        Comparable c3 = (Comparable)n.max;
        T max=null;
        if(c1.compareTo(c2)<0){
            max=n.right.max;
        }
        else{
            max=n.left.max;
        }
        if(c3.compareTo((Comparable)max)>0){
            max=n.max;
        }
        return max;
    }


TreeNode intervalSearch(T t1){
        TreeNode<T> t = root;
        while(t!=null && !isInside(t, t1)){
            if(t.left!=null){
                    if(((Comparable)t.left.max).compareTo(t1)>0){
                    t=t.left;
                }
                else{
                    t=t.right;
                }
            }
            else{
                t=t.right;
            }
        }
        return t;
    }
    private boolean isInside(TreeNode<T> node, T t){
        Comparable cLow=(Comparable)node.low;
        Comparable cHigh=(Comparable)node.high;
        int i = cLow.compareTo(t);
        int j = cHigh.compareTo(t);
        if(i<=0 && j>=0){
            return true;
        }
        return false;
    }
}

Solution 5

I've just found Guava's Range and RangeSet which do exactly that.

It implements all the operations cited:

  1. Union

    RangeSet<Integer> intervals = TreeRangeSet.create(); 
    intervals.add(Range.closedOpen(1,4)); // stores {[1,4)}
    intervals.add(Range.closedOpen(8,10)); // stores {[1,4), [8,10)}
    // Now unite 3,7
    intervals.add(Range.closedOpen(3,7)); // stores {[1,7), [8,10)}
    
  2. Subraction

    intervals.remove(Range.closedOpen(3,5)); //stores {[1,3), [5, 7), [8, 10)}
    
  3. Intersection

    intervals.contains(3); // returns false
    intervals.contains(5); // returns true
    intervals.encloses(Range.closedOpen(2,4)); //returns false
    intervals.subRangeSet(Range.closedOpen(2,4)); // returns {[2,3)} (isEmpty returns false)
    intervals.subRangeSet(Range.closedOpen(3,5)).isEmpty(); // returns true
    
  4. Finding empty spaces (this will be the same complexity as a set iteration in the worst case):

    Range freeSpace(RangeSet<Integer> ranges, int size) {
        RangeSet<Integer> frees = intervals.complement().subRangeSet(Range.atLeast(0));
        for (Range free : frees.asRanges()) {
            if (!free.hasUpperBound()) {
                return free;
            }
            if (free.upperEndpoint() - free.lowerEndpoint() >= size) {
                return free;
            }
        }
    
Share:
11,308

Related videos on Youtube

Luís Guilherme
Author by

Luís Guilherme

Software Engineer who loves coding, functional languages, and coding challenges. Been on Stack Overflow for several years, somewhat actively.

Updated on January 05, 2020

Comments

  • Luís Guilherme
    Luís Guilherme over 4 years

    I have got a series of time intervals (t_start,t_end), that cannot overlap, i.e.: t_end(i) > t_start(i+1). I want to do the following operations:

    1) Add new (Union of) intervals [ {(1,4),(8,10)} U (3,7) = {(1,7),(8,10)} ]
    2) Take intervals out [ (1,7) - (3,5) = {(1,3),(5,7)}
    3) Checking whether a point or a interval overlaps with an interval in my series (intersection)
    4) Finding the first "non-interval" of a minimum length after some point [ {(1,4),(7,8)}: there is a "non-interval" of length 3 between 4 and 7 ].

    I want to know good ways of implementing this, with low complexities (log n for all operations would do it).

    Related question: Data structure for quick time interval look up

    • Charlie Salts
      Charlie Salts over 14 years
      You might mention what language you intend to use, if any. It might help us narrow your search.
    • Luís Guilherme
      Luís Guilherme over 14 years
      I intend to use C++, but I am comfortable with having to implement the data structure on my own, without relying on STL or whatsoever. So it's a "language independent" question! :)
  • Jason Orendorff
    Jason Orendorff over 14 years
    To add two general interval-sets is O(n) with this data structure. Adding a single interval to an interval-set is O(log n).
  • Luís Guilherme
    Luís Guilherme over 14 years
    I have yet to check for correctness, but this is a great answer anyway. It's not a simple algorithm, but if well encapsulated, it would be really easy to link.
  • schlamar
    schlamar almost 12 years
    @JasonOrendorff Finding a free gap is not correct. If you have to go down again you don't know in which direction. Solution would be to store largest gap left and largest gap right.
  • Jason Orendorff
    Jason Orendorff almost 12 years
    @ms4py Each node already stores the size of the largest gap it contains. The size of the largest gap to the left is just node.left.largestGap.
  • schlamar
    schlamar almost 12 years
    Yes, I just stumbled about that myself :)
  • schlamar
    schlamar almost 12 years
    @JasonOrendorff Updating the free gaps on insertion is a bottleneck. You have to update all ancestors of the inserted node and all ancestors of the successor (because you used his free space). And a rotation requires also an update of 2 nodes. Any idea to speed this up?
  • Jason Orendorff
    Jason Orendorff almost 12 years
    Why are you concerned about updates in particular? Are you profiling an implementation of this and seeing most of the time spent in updates?
  • Jason Orendorff
    Jason Orendorff almost 12 years
    I ask because abstractly, the updates are O(log n) just like every other piece.
  • kleopatra
    kleopatra over 10 years
    Note that link-only answers are discouraged, SO answers should be the end-point of a search for a solution (vs. yet another stopover of references, which tend to get stale over time). Please consider adding a stand-alone synopsis here, keeping the link as a reference.
  • plasmacel
    plasmacel almost 7 years
    This implementation doesn't support query of overlapping intervals, right? Can't see a method for that.