built-in max heap API in Python

19,308

Solution 1

In the past I have simply used sortedcontainers's SortedList for this, as:

> a = SortedList()
> a.add(3)
> a.add(2)
> a.add(1)
> a.pop()
3

It's not a heap, but it's fast and works directly as required.

If you absolutely need it to be a heap, you could make a general negation class to hold your items.

class Neg():
    def __init__(self, x):
        self.x = x

    def __cmp__(self, other):
        return -cmp(self.x, other.x)

def maxheappush(heap, item):
    heapq.heappush(heap, Neg(item))

def maxheappop(heap):
    return heapq.heappop(heap).x

But that will be using a little more memory.

Solution 2

There is a _heappop_max function in the latest cpython source that you may find useful:

def _heappop_max(heap):
    """Maxheap version of a heappop."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        heapq._siftup_max(heap, 0)
        return returnitem
    return lastelt

If you change the heappush logic using heapq._siftdown_max you should get the desired output:

def _heappush_max(heap, item):
    heap.append(item)
    heapq._siftdown_max(heap, 0, len(heap)-1)


def _heappop_max(heap):
    """Maxheap version of a heappop."""
    lastelt = heap.pop()  # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        heapq._siftup_max(heap, 0)
        return returnitem
    return lastelt


def heapsort2(iterable):
    h = []
    heapq._heapify_max(h)
    for value in iterable:
        _heappush_max(h, value)
    return [_heappop_max(h) for i in range(len(h))]

Output:

In [14]: heapsort2([1,3,6,2,7,9,0,4,5,8])
Out[14]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [15]: heapsort2([7, 8, 9, 6, 4, 2, 3, 5, 1, 0])
Out[15]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [16]: heapsort2([19,13,15,17,11,10,14,20,18])
Out[16]: [20, 19, 18, 17, 15, 14, 13, 11, 10]

In [17]: heapsort2(["foo","bar","foobar","baz"])
Out[17]: ['foobar', 'foo', 'baz', 'bar']
Share:
19,308
Lin Ma
Author by

Lin Ma

Updated on June 13, 2022

Comments

  • Lin Ma
    Lin Ma almost 2 years

    Default heapq is min queue implementation and wondering if there is an option for max queue? Thanks.

    I tried the solution using _heapify_max for max heap, but how to handle dynamically push/pop element? It seems _heapify_max could only be used during initialization time.

    import heapq
    
    def heapsort(iterable):
        h = []
        for value in iterable:
            heapq.heappush(h, value)
        return [heapq.heappop(h) for i in range(len(h))]
    
    if __name__ == "__main__":
    
        print heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
    

    Edit, tried _heapify_max seems not working for dynamically push/pop elements. I tried both methods output the same, both output is, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].

    def heapsort(iterable):
        h = []
        for value in iterable:
            heapq.heappush(h, value)
        return [heapq.heappop(h) for i in range(len(h))]
    
    def heapsort2(iterable):
        h = []
        heapq._heapify_max(h)
        for value in iterable:
            heapq.heappush(h, value)
        return [heapq.heappop(h) for i in range(len(h))]
    
    if __name__ == "__main__":
    
        print heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
        print heapsort2([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
    

    Thanks in advance, Lin

  • Lin Ma
    Lin Ma over 8 years
    good catch. Wondering if your solution works for string, besides numeric types?
  • U2EF1
    U2EF1 over 8 years
    @LinMa That's what the wrapper's for. If you know your data is numeric you can do a little better by just using - as in def maxheappush(heap, item): heapq.heappush(heap, -item) and def maxheappop(heap): return -heapq.heappop(heap).
  • Lin Ma
    Lin Ma over 8 years
    Thanks U2EF1, how and when cmp is called and leveraged?
  • U2EF1
    U2EF1 over 8 years
    @LinMa All the time? It's a builtin for comparisons.
  • Lin Ma
    Lin Ma over 8 years
    Thanks U2EF1, sorry I may not make myself understood. I mean does heappush or heappop call cmp underlying -- I ask this since I do not see cmp is called explicitly? If so, more details are appreciated.
  • U2EF1
    U2EF1 over 8 years
    @LinMa Yes, operations like a < b on Neg instances will call __cmp__. You can add some print statements if you want to see it being called.
  • anveshtummala
    anveshtummala over 6 years
    How to implement a peek/ get_top function on this max heap?
  • U2EF1
    U2EF1 over 6 years
    @anveshtummala With indexing, ala a[-1].
  • Thariq Nugrohotomo
    Thariq Nugrohotomo almost 4 years
    sorry for necromancing the old thread, but in response to the last comment above, I think it should be a[0] to peek the max item (i.e. item at the root of the heap/tree).
  • Thariq Nugrohotomo
    Thariq Nugrohotomo almost 4 years
    The class Neg approach is broken on Python 3. Sadly implementing just __cmp__ only is not sufficient anymore :'(
  • avmohan
    avmohan almost 3 years
    For python3, implementing __lt__ is enough.