How to make heapq evaluate the heap off of a specific attribute?

92,641

Solution 1

heapq sorts objects the same way list.sort does, so just define a method __cmp__() within your class definition, which will compare itself to another instance of the same class:

def __cmp__(self, other):
    return cmp(self.intAttribute, other.intAttribute)

Works in Python 2.x.

In 3.x use:

def __lt__(self, other):
    return self.intAttribute < other.intAttribute

Solution 2

According to the example from the documentation, you can use tuples, and it will sort by the first element of the tuple:

>>> h = []
>>> heappush(h, (5, 'write code'))
>>> heappush(h, (7, 'release product'))
>>> heappush(h, (1, 'write spec'))
>>> heappush(h, (3, 'create tests'))
>>> heappop(h)
(1, 'write spec')

So if you don't want to (or can't?) do a __cmp__ method, you can manually extract your sorting key at push time.

Note that if the first elements in a pair of tuples are equal, further elements will be compared. If this is not what you want, you need to ensure that each first element is unique.

Solution 3

According to the Official Document, a solution to this is to store entries as tuples (please take a look at Section 8.4.1 and 8.4.2).

For example, your object is something like this in tuple's format (key, value_1, value_2)

When you put the objects (i.e. tuples) into heap, it will take the first attribute in the object (in this case is key) to compare. If a tie happens, the heap will use the next attribute (i.e. value_1) and so on.

For example:

import heapq

heap = []
heapq.heappush(heap, (0,'one', 1))
heapq.heappush(heap, (1,'two', 11))
heapq.heappush(heap, (1, 'two', 2))
heapq.heappush(heap, (1, 'one', 3))
heapq.heappush(heap, (1,'two', 3))
heapq.heappush(heap, (1,'one', 4))
heapq.heappush(heap, (1,'two', 5))
heapq.heappush(heap, (1,'one', 1))

show_tree(heap)

Output:

                                      (0, 'one', 1)                                       
                (1, 'one', 1)                                (1, 'one', 4)                
    (1, 'one', 3)         (1, 'two', 3)         (1, 'two', 2)         (1, 'two', 5)     
(1, 'two', 11)

About pretty print a heap in python (updated the link): show_tree()

Solution 4

I feel the simplest way is to override the existing cmp_lt function of the heapq module. A short example:

import heapq

# your custom function. Here, comparing tuples a and b based on their 2nd element
def new_cmp_lt(self,a,b):
    return a[1]<b[1]

#override the existing "cmp_lt" module function with your function
heapq.cmp_lt=new_cmp_lt

#Now use everything like normally used

Note: Somebody more qualified should comment if this conflicts with recommended coding practices. But it can still be useful for something "quick & dirty" e.g. in coding interviews with limited time and a lot more to do instead of spending time on subclassing correctly.

Solution 5

I had the same question but none of the above answers hit the spot although some were close but not elaborated enough. Anyway, I did some research and tried this piece of code and hopefully this should be sufficient for someone next who is looking to get an answer:

The problem with using a tuple is it only uses the first item which is not very flexible. I wanted something similar to std::priority_queue in c++ like this: std::priority_queue<pair<int, int>, vector<pair<int, int>>, comparator> pq; where I could design my own comparator which is more common in real world applications.

Hopefully the below snippet helps: https://repl.it/@gururajks/EvenAccurateCylinders

import heapq
class PQNode:

    def __init__(self, key, value):
        self.key = key
        self.value = value

    # compares the second value
    def __lt__(self, other):
        return self.value < other.value

    def __str__(self):
        return str("{} : {}".format(self.key, self.value))

input = [PQNode(1, 4), PQNode(7, 4), PQNode(6, 9), PQNode(2, 5)]
hinput = []
for item in input:
    heapq.heappush(hinput, item)

while (hinput):
    print (heapq.heappop(hinput))
Share:
92,641

Related videos on Youtube

coffee
Author by

coffee

Updated on October 21, 2021

Comments

  • coffee
    coffee over 2 years

    I wish to hold a heap of objects, not just numbers. They will have an integer attribute in them that the heap can sort by. The easiest way to use heaps in python is heapq, but how do I tell it to sort by a specific attribute when using heapq?

  • Daniel Stutzbach
    Daniel Stutzbach over 13 years
    __cmp__ is gone in 3.x. Use __lt__ instead.
  • Daniel Stutzbach
    Daniel Stutzbach over 13 years
    __lt__ works in Python 2 also, so it's better to just avoid __cmp__ altogether.
  • Glenn Maynard
    Glenn Maynard over 13 years
    Just as you can tell any sort to sort based on a criteria other than the object's natural sorting (eg. cmp and key for sort), you should be able to tell heapq to sort based on a different key. In other words, you shouldn't have to redefine the object itself to change a particular data structure holding it; you should be able to just tell the data structure itself. This is a notable fundamental piece missing from the heapq API.
  • Ishbir
    Ishbir over 13 years
    I removed my previous comment since my issue with blist was probably a PEBCAK (again thanks for your module), so I only duplicate the first part of the previous comment: It's always possible to define a class with an __lt__ either through subclassing or through encapsulation.
  • JD Gamboa
    JD Gamboa over 5 years
    "Note that if the first elements in a pair of tuples are equal, further elements will be compared." You should put that in bold since in the documentation it is not clear. I assumed given the same priority it would return me the first object found (no good reason for that assumption, so it's my fault, I see).
  • Fred Guth
    Fred Guth about 5 years
    Good point. If you insert a tuple that is (number, dict) it doesn't know how to evaluate dicts.
  • akshaynagpal
    akshaynagpal almost 5 years
    If you have a tuple like (some_value, dict), you can insert (some_value, counter, dict) in the heap to break ties with an incrementing counter in case some_value is equal for 2 tuples.
  • penguin2718
    penguin2718 almost 5 years
    I tried your code and it works on my end. I'm using python 3.6.5. I am curious as to how heappush() does the comparision. Is this done intrinsically by the special _lt_() method in the PQNode class? Without it, this program definitely crashes with the compiler message: Traceback (most recent call last): File "heap_example.py", line 18, in <module> heapq.heappush(hinput, item) TypeError: '<' not supported between instances of 'PQNode' and 'PQNode' Fortunately, it seems _lt_() plays a role in it because it's working.
  • Shrikant Shete
    Shrikant Shete over 4 years
    is there any reason everyone asks to use __lt__ and not __gt__? or it really doesn't matter?
  • jallen0927
    jallen0927 about 4 years
    What if sometimes I want to sort by this attribute and sometimes sort by another attribute?
  • sanjay patel
    sanjay patel over 3 years
    This example did not work for me. Any suggestions? lst = [(18, [3, 3]), (26, [5, -1]), (20, [-2, 4])] heapq.heapify(lst)