How to sort a list of objects based on an attribute of the objects?
Solution 1
# To sort the list in place...
ut.sort(key=lambda x: x.count, reverse=True)
# To return a new list, use the sorted() built-in function...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)
More on sorting by keys.
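A minimal, self-contained sketch of both calls. The Tag class below is a hypothetical stand-in for the objects in the question; only its .count attribute matters here.

```python
class Tag:
    """Hypothetical stand-in for the question's objects; only .count is used."""

    def __init__(self, name, count):
        self.name = name
        self.count = count

    def __repr__(self):
        return f"Tag(name={self.name!r}, count={self.count})"


ut = [Tag("toe", 10), Tag("leg", 2), Tag("arm", 7)]

# sorted() builds a new list and leaves `ut` in its original order.
newlist = sorted(ut, key=lambda x: x.count, reverse=True)

# list.sort() reorders `ut` itself and returns None.
ut.sort(key=lambda x: x.count, reverse=True)

print([t.name for t in ut])  # ['toe', 'arm', 'leg']
```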
Solution 2
A way that can be fastest, especially if your list has a lot of records, is to use operator.attrgetter("count"). However, this might run on a pre-operator version of Python, so it would be nice to have a fallback mechanism. You might want to do the following, then:
try:
    import operator
except ImportError:
    keyfun = lambda x: x.count  # use a lambda if no operator module
else:
    keyfun = operator.attrgetter("count")  # use operator since it's faster than lambda
ut.sort(key=keyfun, reverse=True)  # sort in-place
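To check the speed claim on your own machine, something like the following sketch could be used. The Tag class, the list size, and the repeat count are all assumptions chosen for illustration, and no particular timing result is implied.

```python
import timeit
from operator import attrgetter


class Tag:
    """Hypothetical record type; only .count is used."""

    def __init__(self, count):
        self.count = count


# Deterministic, scrambled counts so both sorts see the same input.
ut = [Tag((i * 7919) % 10007) for i in range(10_000)]

# Both key functions read the same attribute, so the resulting order is identical.
by_lambda = sorted(ut, key=lambda x: x.count, reverse=True)
by_getter = sorted(ut, key=attrgetter("count"), reverse=True)
same_order = [t.count for t in by_lambda] == [t.count for t in by_getter]

# Time each variant; absolute numbers depend on the machine and Python version.
t_lambda = timeit.timeit(lambda: sorted(ut, key=lambda x: x.count), number=20)
t_getter = timeit.timeit(lambda: sorted(ut, key=attrgetter("count")), number=20)
print(f"same order: {same_order}, lambda: {t_lambda:.3f}s, attrgetter: {t_getter:.3f}s")
```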
Solution 3
Readers should notice that the key= method:
ut.sort(key=lambda x: x.count, reverse=True)
is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of "Python in a Nutshell"). You can confirm this by running tests on this little program:
#!/usr/bin/env python
# Python 2 only: __cmp__, cmp() and xrange were removed in Python 3.
import random

class C:
    def __init__(self, count):
        self.count = count

    def __cmp__(self, other):
        return cmp(self.count, other.count)

longList = [C(random.random()) for i in xrange(1000000)]  # about 6.1 secs
longList2 = longList[:]

longList.sort()  # about 52 - 6.1 = 46 secs
longList2.sort(key=lambda c: c.count)  # about 9 - 6.1 = 3 secs
My, very minimal, tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason they give is the highly optimized sort algorithm used in Python (timsort).
Still, it's very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.
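The script above relies on __cmp__, cmp() and xrange, none of which exist in Python 3. A rough Python 3 port might look like the sketch below. It assumes __lt__ as the replacement for __cmp__ and uses a smaller list so it finishes quickly; the class and helper names are made up for illustration, and the timings you get will differ from the figures quoted above.

```python
import random
import time


class ByLt:
    """Orders instances through a rich comparison method."""

    def __init__(self, count):
        self.count = count

    def __lt__(self, other):
        return self.count < other.count


def timed(sort_call):
    """Run sort_call() once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    sort_call()
    return time.perf_counter() - start


items = [ByLt(random.random()) for _ in range(100_000)]
items2 = items[:]

# Plain sort: Python-level __lt__ runs for every comparison.
t_rich = timed(lambda: items.sort())
# key= sort: each count is extracted once, then plain floats are compared.
t_key = timed(lambda: items2.sort(key=lambda c: c.count))

print(f"__lt__: {t_rich:.3f}s, key=: {t_key:.3f}s")
```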
Solution 4
Object-oriented approach
It's good practice to make object sorting logic, if applicable, a property of the class rather than repeating it in every place where the ordering is required.
This ensures consistency and removes the need for boilerplate code.
At a minimum, you should specify __eq__ and __lt__ operations for this to work. Then just use sorted(list_of_objects).
class Card(object):
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(10, 'H'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand]  # [10, 2, 12, 13, 14]
hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted]  # [2, 10, 12, 13, 14]
Solution 5
from operator import attrgetter
ut.sort(key = attrgetter('count'), reverse = True)
Comments
-
Nick Sergeant almost 2 years
I have a list of Python objects that I want to sort by a specific attribute of each object:
>>> ut
[Tag(name="toe", count=10), Tag(name="leg", count=2), ...]
How do I sort the list by .count in descending order?
-
Jeyekomon almost 6 years
Sorting HOW TO for those who are looking for more info about sorting in Python.
-
vijay shanker about 5 years
Apart from operator.attrgetter('attribute_name'), you can also use functors as the key, like object_list.sort(key=my_sorting_functor('my_key')). I'm leaving the implementation out intentionally.
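The comment leaves my_sorting_functor unimplemented. One possible shape, purely as an assumption, is a small class whose instances are callable and look up a remembered attribute name; the Tag class is again a hypothetical stand-in.

```python
class my_sorting_functor:
    """Hypothetical callable key: stores an attribute name and fetches it on call."""

    def __init__(self, attr_name):
        self.attr_name = attr_name

    def __call__(self, obj):
        # Called once per list element by sort(); returns the sort key.
        return getattr(obj, self.attr_name)


class Tag:
    """Hypothetical record type used only for this example."""

    def __init__(self, name, count):
        self.name = name
        self.count = count


object_list = [Tag("toe", 10), Tag("leg", 2), Tag("arm", 7)]
object_list.sort(key=my_sorting_functor("count"), reverse=True)
print([t.name for t in object_list])  # ['toe', 'arm', 'leg']
```

For a plain attribute lookup this does the same job as operator.attrgetter; a class like this mainly pays off when the key needs extra state or logic.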
-
-
Kenan Banks over 15 years
No problem. btw, if muhuk is right and it's a list of Django objects, you should consider his solution. However, for the general case of sorting objects, my solution is probably best practice.
-
Nick Sergeant over 15 years
It is, but using django-tagging, so I was using a built-in for grabbing a Tag set by usage for a particular query set, like so: Tag.objects.usage_for_queryset(QuerySet, counts=True)
-
David Eyk over 15 years
On large lists you will get better performance using operator.attrgetter('count') as your key. This is just an optimized (lower level) form of the lambda function in this answer.
-
akaihola over 15 years
Here I would use the variable name "keyfun" instead of "cmpfun" to avoid confusion. The sort() method does accept a comparison function through the cmp= argument as well.
-
Drew over 11 years
This doesn't seem to work if the object has dynamically added attributes (if you've done self.__dict__ = {'some':'dict'} after the __init__ method). I don't know why it should be different, though.
-
Ishbir over 11 years
@tutuca: I've never replaced the instance __dict__. Note that "an object having dynamically added attributes" and "setting an object's __dict__ attribute" are almost orthogonal concepts. I'm saying that because your comment seems to imply that setting the __dict__ attribute is a requirement for dynamically adding attributes.
-
Drew over 11 years
@tzot: I'm looking right at this: github.com/stochastic-technologies/goatfish/blob/master/… and using that iterator here: github.com/TallerTechnologies/dishey/blob/master/app.py#L28 raises attribute error. Maybe because of python3, but still...
-
Ishbir over 11 years
@tutuca: I would do self.__dict__.update(kwargs) instead of self.__dict__ = kwargs. In any case, perhaps it's a Python 3 issue, since 2.7.3 seems to run it ok. I will investigate with Python 3 some time later.
-
Ishbir over 11 years
And then there's this, which could suggest it's the class Model's metaclass that's at fault here.
-
Drew over 11 years
@tzot, it's not django related, the goatfish Meta attribute is just a raw object with no magic whatsoever... I've tested it in a python 2.7 project and seems to work as expected. I'll need to read further on the issue...
-
IAbstract about 8 years
@tzot: if I understand the use of operator.attrgetter, I could supply a function with any property name and return a sorted collection.
-
dganesh2002 over 7 years
Thanks for the great answer. If it is a list of dictionaries and 'count' is one of the keys, then it needs to be changed as below: ut.sort(key=lambda x: x['count'], reverse=True)
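For the dictionary case, operator.itemgetter plays the role that attrgetter plays for attributes. A short sketch, with made-up sample data:

```python
from operator import itemgetter

# Hypothetical sample data: dictionaries instead of objects.
ut = [
    {"name": "toe", "count": 10},
    {"name": "leg", "count": 2},
    {"name": "arm", "count": 7},
]

# itemgetter("count") is equivalent to lambda x: x["count"].
ut.sort(key=itemgetter("count"), reverse=True)
print([d["name"] for d in ut])  # ['toe', 'arm', 'leg']
```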
-
alxs over 7 years
For those looking for more info: wiki.python.org/moin/HowTo/Sorting#Operator_Module_Functions
-
FriendFX almost 5 years
That's what I was looking for! Could you point us to some documentation that elaborates on why __eq__ and __lt__ are the minimum implementation requirements?
-
jpp almost 5 years
@FriendFX, I believe it's implied by this:
• The sort routines are guaranteed to use __lt__() when making comparisons between two objects...
-
Ishbir over 4 years
Defining __cmp__ is equivalent to calling .sort(cmp=lambda), not .sort(key=lambda), so it isn't odd at all.
-
Bryan Roach over 4 years
@tzot is exactly right. The first sort has to compare objects against each other again and again. The second sort accesses each object only once to extract its count value, and then it performs a simple numerical sort which is highly optimized. A more fair comparison would be longList2.sort(cmp = cmp). I tried this out and it performed nearly the same as .sort(). (Also: note that the "cmp" sort parameter was removed in Python 3.)
-
Cornel Masson about 4 years
@FriendFX: See portingguide.readthedocs.io/en/latest/comparisons.html for Comparison and Sorting
-
uuu777 about 4 years
I suppose it deserves the following update: if there is a need to sort by multiple fields, it can be achieved by consecutive calls to sort(), because Python uses a stable sort algorithm.
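A sketch of the multi-field idea, assuming we want count descending and, within equal counts, name ascending. With consecutive stable sorts, the least significant field is sorted first. The Tag class and data are hypothetical.

```python
from operator import attrgetter


class Tag:
    """Hypothetical record type used only for this example."""

    def __init__(self, name, count):
        self.name = name
        self.count = count


ut = [Tag("toe", 10), Tag("leg", 2), Tag("arm", 10), Tag("ear", 2)]

# Pass 1: secondary field (name, ascending).
ut.sort(key=attrgetter("name"))
# Pass 2: primary field (count, descending). Stability keeps the name order
# among elements with equal counts, even with reverse=True.
ut.sort(key=attrgetter("count"), reverse=True)
two_pass = [(t.name, t.count) for t in ut]

# Single-pass alternative for numeric fields: negate the descending field.
one_pass = [(t.name, t.count) for t in sorted(ut, key=lambda t: (-t.count, t.name))]

print(two_pass)  # [('arm', 10), ('toe', 10), ('ear', 2), ('leg', 2)]
```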
-
Naypa over 3 years
cmp was removed in Python 3: docs.python.org/3/howto/…
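If you still have an old-style comparison function, functools.cmp_to_key can wrap it so that it works with key= in Python 3. A minimal sketch; compare_counts and the Tag class are hypothetical names for illustration.

```python
from functools import cmp_to_key


class Tag:
    """Hypothetical record type used only for this example."""

    def __init__(self, name, count):
        self.name = name
        self.count = count


def compare_counts(a, b):
    """Old-style comparison: negative, zero, or positive, like cmp() used to return."""
    return (a.count > b.count) - (a.count < b.count)


ut = [Tag("toe", 10), Tag("leg", 2), Tag("arm", 7)]

# cmp_to_key turns the two-argument comparison into a key function.
ut.sort(key=cmp_to_key(compare_counts), reverse=True)
print([t.name for t in ut])  # ['toe', 'arm', 'leg']
```

This goes back to one Python-level call per comparison, so a plain key function is usually the better choice when one is available.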
-
mattsmith5 about 3 years
I am receiving this error, can someone add in answer how to resolve it? ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
-
peetysmith about 2 years
I'm trying to use this technique for a dt object but getting an error. Code: events_2.sort(key=lambda x: x['DTSTART'].dt, reverse=True). Error: TypeError: can't compare offset-naive and offset-aware datetimes. Any ideas?
-
Kenan Banks about 2 years
@peetysmith it looks like your 'DTSTART' column contains more than one date type (offset-naive and offset-aware). You need to first convert that column to use the same type before the sort function can use it.
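One way to do that conversion is inside the key function itself. The sketch below assumes that naive values are meant as UTC, which may not be true for your data; the as_aware helper and the sample events are hypothetical and use plain dictionaries rather than the calendar objects from the comment above.

```python
from datetime import datetime, timedelta, timezone


def as_aware(dt, assumed_tz=timezone.utc):
    """Return dt unchanged if it carries an offset, else attach assumed_tz.

    ASSUMPTION: naive datetimes are interpreted as being in assumed_tz.
    """
    if dt.tzinfo is None or dt.utcoffset() is None:
        return dt.replace(tzinfo=assumed_tz)
    return dt


events = [
    {"name": "naive", "start": datetime(2023, 5, 1, 12, 0)},
    {"name": "aware", "start": datetime(2023, 5, 1, 9, 0, tzinfo=timezone(timedelta(hours=-5)))},
    {"name": "early", "start": datetime(2023, 4, 30, 8, 0, tzinfo=timezone.utc)},
]

# Every key is now offset-aware, so the comparison no longer raises TypeError.
events.sort(key=lambda e: as_aware(e["start"]), reverse=True)
print([e["name"] for e in events])  # ['aware', 'naive', 'early']
```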
-
peetysmith about 2 years
Thanks @KenanBanks, you were right. Annoyingly outlook was doing some weird things with calendar timezones so that some came through without the timezone details... no idea why!
-
j-hap almost 2 years
Is there any way to automatically forward all special binary comparison methods to one attribute of the class, instead of implementing __eq__, __lt__, __le__, __gt__, __ge__ and __ne__ and inside each just forwarding to the attribute's special function?
-
j-hap almost 2 years
I just wrote my own decorator to accomplish what I wanted in the previous comment. It's really ugly. Better to implement __eq__ and __lt__ and then use @functools.total_ordering to get the rest.
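A sketch of that suggestion, reusing the Card idea from Solution 4. As an assumption made here for consistency, both __eq__ and __lt__ compare the same (rank, suit) tuple, so the derived operators agree with each other; the _key helper is a made-up name.

```python
from functools import total_ordering


@total_ordering
class Card:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def _key(self):
        # Hypothetical helper: the single value all comparisons forward to.
        return (self.rank, self.suit)

    def __eq__(self, other):
        if not isinstance(other, Card):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Card):
            return NotImplemented
        return self._key() < other._key()


hand = [Card(10, "h"), Card(2, "h"), Card(12, "h")]

# __le__, __gt__ and __ge__ are filled in by total_ordering.
print(Card(2, "h") <= Card(10, "h"))    # True
print(Card(12, "h") > Card(10, "h"))    # True
print([c.rank for c in sorted(hand)])   # [2, 10, 12]
```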