How to make heapq evaluate the heap off of a specific attribute?
Solution 1
heapq
sorts objects the same way list.sort
does, so just define a method __cmp__()
within your class definition, which will compare itself to another instance of the same class:
def __cmp__(self, other):
return cmp(self.intAttribute, other.intAttribute)
Works in Python 2.x.
In 3.x use:
def __lt__(self, other):
return self.intAttribute < other.intAttribute
Solution 2
According to the example from the documentation, you can use tuples, and it will sort by the first element of the tuple:
>>> h = []
>>> heappush(h, (5, 'write code'))
>>> heappush(h, (7, 'release product'))
>>> heappush(h, (1, 'write spec'))
>>> heappush(h, (3, 'create tests'))
>>> heappop(h)
(1, 'write spec')
So if you don't want to (or can't?) do a __cmp__
method, you can manually extract your sorting key at push time.
Note that if the first elements in a pair of tuples are equal, further elements will be compared. If this is not what you want, you need to ensure that each first element is unique.
Solution 3
According to the Official Document, a solution to this is to store entries as tuples (please take a look at Section 8.4.1 and 8.4.2).
For example, your object is something like this in tuple's format (key, value_1, value_2)
When you put the objects (i.e. tuples) into heap, it will take the first attribute in the object (in this case is key) to compare. If a tie happens, the heap will use the next attribute (i.e. value_1) and so on.
For example:
import heapq
heap = []
heapq.heappush(heap, (0,'one', 1))
heapq.heappush(heap, (1,'two', 11))
heapq.heappush(heap, (1, 'two', 2))
heapq.heappush(heap, (1, 'one', 3))
heapq.heappush(heap, (1,'two', 3))
heapq.heappush(heap, (1,'one', 4))
heapq.heappush(heap, (1,'two', 5))
heapq.heappush(heap, (1,'one', 1))
show_tree(heap)
Output:
(0, 'one', 1)
(1, 'one', 1) (1, 'one', 4)
(1, 'one', 3) (1, 'two', 3) (1, 'two', 2) (1, 'two', 5)
(1, 'two', 11)
About pretty print a heap in python (updated the link): show_tree()
Solution 4
I feel the simplest way is to override the existing cmp_lt function of the heapq module. A short example:
import heapq
# your custom function. Here, comparing tuples a and b based on their 2nd element
def new_cmp_lt(self,a,b):
return a[1]<b[1]
#override the existing "cmp_lt" module function with your function
heapq.cmp_lt=new_cmp_lt
#Now use everything like normally used
Note: Somebody more qualified should comment if this conflicts with recommended coding practices. But it can still be useful for something "quick & dirty" e.g. in coding interviews with limited time and a lot more to do instead of spending time on subclassing correctly.
Solution 5
I had the same question but none of the above answers hit the spot although some were close but not elaborated enough. Anyway, I did some research and tried this piece of code and hopefully this should be sufficient for someone next who is looking to get an answer:
The problem with using a tuple is it only uses the first item which is not very flexible. I wanted something similar to std::priority_queue in c++ like this:
std::priority_queue<pair<int, int>, vector<pair<int, int>>, comparator> pq;
where I could design my own comparator which is more common in real world applications.
Hopefully the below snippet helps: https://repl.it/@gururajks/EvenAccurateCylinders
import heapq
class PQNode:
def __init__(self, key, value):
self.key = key
self.value = value
# compares the second value
def __lt__(self, other):
return self.value < other.value
def __str__(self):
return str("{} : {}".format(self.key, self.value))
input = [PQNode(1, 4), PQNode(7, 4), PQNode(6, 9), PQNode(2, 5)]
hinput = []
for item in input:
heapq.heappush(hinput, item)
while (hinput):
print (heapq.heappop(hinput))
Related videos on Youtube
coffee
Updated on October 21, 2021Comments
-
coffee over 2 years
I wish to hold a heap of objects, not just numbers. They will have an integer attribute in them that the heap can sort by. The easiest way to use heaps in python is heapq, but how do I tell it to sort by a specific attribute when using heapq?
-
Daniel Stutzbach over 13 years
__cmp__
is gone in 3.x. Use__lt__
instead. -
Daniel Stutzbach over 13 years
__lt__
works in Python 2 also, so it's better to just avoid__cmp__
altogether. -
Glenn Maynard over 13 yearsJust as you can tell any sort to sort based on a criteria other than the object's natural sorting (eg.
cmp
andkey
forsort
), you should be able to tellheapq
to sort based on a different key. In other words, you shouldn't have to redefine the object itself to change a particular data structure holding it; you should be able to just tell the data structure itself. This is a notable fundamental piece missing from theheapq
API. -
Ishbir over 13 yearsI removed my previous comment since my issue with
blist
was probably a PEBCAK (again thanks for your module), so I only duplicate the first part of the previous comment: It's always possible to define a class with an__lt__
either through subclassing or through encapsulation. -
JD Gamboa over 5 years"Note that if the first elements in a pair of tuples are equal, further elements will be compared." You should put that in bold since in the documentation it is not clear. I assumed given the same priority it would return me the first object found (no good reason for that assumption, so it's my fault, I see).
-
Fred Guth about 5 yearsGood point. If you insert a tuple that is (number, dict) it doesn't know how to evaluate dicts.
-
akshaynagpal almost 5 yearsIf you have a tuple like
(some_value, dict)
, you can insert(some_value, counter, dict)
in the heap to break ties with an incrementing counter in casesome_value
is equal for 2 tuples. -
penguin2718 almost 5 yearsI tried your code and it works on my end. I'm using python 3.6.5. I am curious as to how heappush() does the comparision. Is this done intrinsically by the special _lt_() method in the PQNode class? Without it, this program definitely crashes with the compiler message: Traceback (most recent call last): File "heap_example.py", line 18, in <module> heapq.heappush(hinput, item) TypeError: '<' not supported between instances of 'PQNode' and 'PQNode' Fortunately, it seems _lt_() plays a role in it because it's working.
-
Shrikant Shete over 4 yearsis there any reason everyone asks to use
__lt__
and not__gt__
? or it really doesn't matter? -
jallen0927 about 4 yearsWhat if sometimes I want to sort by this attribute and sometimes sort by another attribute?
-
sanjay patel over 3 yearsThis example did not work for me. Any suggestions? lst = [(18, [3, 3]), (26, [5, -1]), (20, [-2, 4])] heapq.heapify(lst)