How to sort a list of objects based on an attribute of the objects?

763,315

Solution 1

# To sort the list in place...
ut.sort(key=lambda x: x.count, reverse=True)

# To return a new list, use the sorted() built-in function...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)

More on sorting by keys.
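For a self-contained illustration, here is a minimal sketch of both calls. The Tag class is a hypothetical stand-in for the objects in the question (only its name and count attributes are assumed), and the third tag is made up for the example.

```python
class Tag:
    """Hypothetical stand-in for the objects in the question."""

    def __init__(self, name, count):
        self.name = name
        self.count = count

    def __repr__(self):
        return f"Tag(name={self.name!r}, count={self.count})"


ut = [Tag("toe", 10), Tag("leg", 2), Tag("arm", 7)]

# sorted() builds a new list and leaves ut in its original order
newlist = sorted(ut, key=lambda x: x.count, reverse=True)
original_order = [t.count for t in ut]

# list.sort() reorders ut itself and returns None
result = ut.sort(key=lambda x: x.count, reverse=True)

print(newlist)
print(ut)
```

Because list.sort() returns None, writing ut = ut.sort(...) would discard the list; use sorted() when you need the result as a value.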

Solution 2

An approach that can be fastest, especially if your list has many records, is to use operator.attrgetter("count"). However, this code might run on a version of Python that predates the operator module, so it would be nice to have a fallback mechanism. You might then want to do the following:

try:
    import operator
except ImportError:
    keyfun = lambda x: x.count  # use a lambda if no operator module
else:
    keyfun = operator.attrgetter("count")  # use operator since it's faster than lambda

ut.sort(key=keyfun, reverse=True) # sort in-place

Solution 3

Readers should notice that the key= method:

ut.sort(key=lambda x: x.count, reverse=True)

is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of "Python in a Nutshell"). You can confirm this by running tests on this little program:

#!/usr/bin/env python
# Python 2 only: __cmp__, cmp() and xrange were removed in Python 3
import random

class C:
    def __init__(self,count):
        self.count = count

    def __cmp__(self,other):
        return cmp(self.count,other.count)

longList = [C(random.random()) for i in xrange(1000000)] #about 6.1 secs
longList2 = longList[:]

longList.sort() #about 52 - 6.1 = 46 secs
longList2.sort(key = lambda c: c.count) #about 9 - 6.1 = 3 secs

My very minimal tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason they give is the highly optimized sort algorithm used in Python (timsort).

Still, it's very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.
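The program above relies on __cmp__, cmp() and xrange, which exist only in Python 2. A rough Python 3 sketch of the same comparison might look like the following. It is an adaptation, not the original benchmark: the class defines __lt__ in place of __cmp__, the list size N is an arbitrary assumption, and timings vary by machine, so no figures are claimed here.

```python
import random
import timeit


class C:
    def __init__(self, count):
        self.count = count

    # Python 3 has no __cmp__; list.sort() and sorted() call __lt__ instead
    def __lt__(self, other):
        return self.count < other.count


N = 20_000  # assumed size, kept small so the script finishes quickly
data = [C(random.random()) for _ in range(N)]

# sorted() copies the list, so every run starts from the same unsorted input
t_rich = timeit.timeit(lambda: sorted(data), number=3)
t_key = timeit.timeit(lambda: sorted(data, key=lambda c: c.count), number=3)

print(f"rich comparison: {t_rich:.3f}s, key function: {t_key:.3f}s")

# Both approaches should produce the same ordering
by_rich = sorted(data)
by_key = sorted(data, key=lambda c: c.count)
```

The key= version extracts each count once and then compares plain floats, whereas the rich-comparison version calls the Python-level __lt__ method for every comparison.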

Solution 4

Object-oriented approach

It's good practice to make object sorting logic, if applicable, a property of the class rather than repeating it at each place where the ordering is required.

This ensures consistency and removes the need for boilerplate code.

At a minimum, you should specify __eq__ and __lt__ operations for this to work. Then just use sorted(list_of_objects).

class Card(object):

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(10, 'h'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand]  # [10, 2, 12, 13, 14]

hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted]  # [2, 10, 12, 13, 14]

Solution 5

from operator import attrgetter
ut.sort(key = attrgetter('count'), reverse = True)
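
One of the comments below points out that Python's sort is stable, so sorting by several fields can be done with consecutive sort() calls. A minimal sketch combining that with attrgetter, assuming a hypothetical Tag class with name and count attributes and made-up sample data:

```python
from operator import attrgetter


class Tag:
    """Hypothetical stand-in for the objects in the question."""

    def __init__(self, name, count):
        self.name = name
        self.count = count


ut = [Tag("toe", 10), Tag("leg", 2), Tag("arm", 10), Tag("ear", 2)]

# Pass 1: secondary key, name ascending
ut.sort(key=attrgetter("name"))
# Pass 2: primary key, count descending; ties keep the name order from pass 1
ut.sort(key=attrgetter("count"), reverse=True)

ordered = [(t.name, t.count) for t in ut]
print(ordered)  # [('arm', 10), ('toe', 10), ('ear', 2), ('leg', 2)]
```

The least significant key is sorted first and the most significant key last; reverse=True still preserves the relative order of equal elements.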
Author by

Nick Sergeant

Web developer, designer, what have you.

Updated on July 08, 2022

Comments

  • Nick Sergeant
    Nick Sergeant almost 2 years

    I have a list of Python objects that I want to sort by a specific attribute of each object:

    >>> ut
    [Tag(name="toe", count=10), Tag(name="leg", count=2), ...]
    

    How do I sort the list by .count in descending order?

  • Kenan Banks
    Kenan Banks over 15 years
    No problem. btw, if muhuk is right and it's a list of Django objects, you should consider his solution. However, for the general case of sorting objects, my solution is probably best practice.
  • Nick Sergeant
    Nick Sergeant over 15 years
    It is, but using django-tagging, so I was using a built-in for grabbing a Tag set by usage for a particular query set, like so: Tag.objects.usage_for_queryset(QuerySet, counts=True)
  • David Eyk
    David Eyk over 15 years
    On large lists you will get better performance using operator.attrgetter('count') as your key. This is just an optimized (lower level) form of the lambda function in this answer.
  • akaihola
    akaihola over 15 years
    Here I would use the variable name "keyfun" instead of "cmpfun" to avoid confusion. The sort() method does accept a comparison function through the cmp= argument as well.
  • Drew
    Drew over 11 years
    This doesn't seem to work if the object has dynamically added attributes (if you've done self.__dict__ = {'some':'dict'} after the __init__ method). I don't know why it should be different, though.
  • Ishbir
    Ishbir over 11 years
    @tutuca: I've never replaced the instance __dict__. Note that "an object having dynamically added attributes" and "setting an object's __dict__ attribute" are almost orthogonal concepts. I'm saying that because your comment seems to imply that setting the __dict__ attribute is a requirement for dynamically adding attributes.
  • Drew
    Drew over 11 years
    @tzot: I'm looking right at this: github.com/stochastic-technologies/goatfish/blob/master/… and using that iterator here: github.com/TallerTechnologies/dishey/blob/master/app.py#L28 raises attribute error. Maybe because of python3, but still...
  • Ishbir
    Ishbir over 11 years
    @tutuca: I would do self.__dict__.update(kwargs) instead of self.__dict__= kwargs. In any case, perhaps it's a Python 3 issue, since 2.7.3 seems to run it ok. I will investigate with Python 3 some time later.
  • Ishbir
    Ishbir over 11 years
    And then there's this, which could suggest it's the class Model's metaclass that's at fault here.
  • Drew
    Drew over 11 years
    @tzot, it's not django related, the goatfish Meta attribute is just a raw object with no magic whatsoever... I've tested it in a python 2.7 project and seems to work as expected. I'll need to read further on the issue...
  • IAbstract
    IAbstract about 8 years
    @tzot: if I understand the use of operator.attrgetter, I could supply a function with any property name and return a sorted collection.
  • dganesh2002
    dganesh2002 over 7 years
    Thanks for the great answer. If it is a list of dictionaries and 'count' is one of the keys, then it needs to be changed as below: ut.sort(key=lambda x: x['count'], reverse=True)
  • FriendFX
    FriendFX almost 5 years
    That's what I was looking for! Could you point us to some documentation that elaborates on why __eq__ and __lt__ are the minimum implementation requirements?
  • jpp
    jpp almost 5 years
    @FriendFX, I believe it's implied by this: "The sort routines are guaranteed to use __lt__() when making comparisons between two objects..."
  • Ishbir
    Ishbir over 4 years
    Defining __cmp__ is equivalent to calling .sort(cmp=lambda), not .sort(key=lambda), so it isn't odd at all.
  • Bryan Roach
    Bryan Roach over 4 years
    @tzot is exactly right. The first sort has to compare objects against each other again and again. The second sort accesses each object only once to extract its count value, and then it performs a simple numerical sort which is highly optimized. A more fair comparison would be longList2.sort(cmp = cmp). I tried this out and it performed nearly the same as .sort(). (Also: note that the "cmp" sort parameter was removed in Python 3.)
  • Cornel Masson
    Cornel Masson about 4 years
    @FriendFX: See portingguide.readthedocs.io/en/latest/comparisons.html for Comparison and Sorting
  • uuu777
    uuu777 about 4 years
    I suppose it deserves the following update: if there is a need to sort by multiple fields, it can be achieved by consecutive calls to sort(), because Python uses a stable sort algorithm.
  • Naypa
    Naypa over 3 years
    The cmp argument was removed in Python 3: docs.python.org/3/howto/…
  • mattsmith5
    mattsmith5 about 3 years
    I am receiving this error, can someone add in answer how to resolve it? ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
  • peetysmith
    peetysmith about 2 years
    I'm trying to use this technique for a dt object but getting an error. Code: events_2.sort(key=lambda x: x['DTSTART'].dt, reverse=True). Error: TypeError: can't compare offset-naive and offset-aware datetimes. Any ideas?
  • Kenan Banks
    Kenan Banks about 2 years
    @peetysmith it looks like your 'DTSTART' column contains more than one date type (offset-naive and offset-aware). You need to first convert that column to use the same type, before the sort function can use it.
  • peetysmith
    peetysmith about 2 years
    Thanks @KenanBanks, you were right. Annoyingly outlook was doing some weird things with calendar timezones so that some came through without the timezone details... no idea why!
  • j-hap
    j-hap almost 2 years
    Is there any way to automatically forward all special binary comparison methods to one attribute of the class, instead of implementing __eq__, __lt__, __le__, __gt__, __ge__ and __ne__ and having each one just forward to the attribute's special method?
  • j-hap
    j-hap almost 2 years
    I just wrote my own decorator to accomplish what I wanted in the previous comment. It's really ugly. Better to implement __eq__ and __lt__ and then use @functools.total_ordering to get the rest.
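
The functools.total_ordering suggestion in the last comment can be sketched as follows. This reuses the Card class from Solution 4 with one assumed change: __eq__ compares rank only, so that it agrees with __lt__ (the derived operators combine the two, so they should be consistent). The isinstance checks are an addition for safety, not part of the original answer.

```python
from functools import total_ordering


@total_ordering
class Card:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        if not isinstance(other, Card):
            return NotImplemented
        # Assumption: equality is by rank only, to stay consistent with __lt__
        return self.rank == other.rank

    def __lt__(self, other):
        if not isinstance(other, Card):
            return NotImplemented
        return self.rank < other.rank


low, high = Card(2, "h"), Card(14, "h")

# __le__, __gt__ and __ge__ are filled in by the decorator
print(low <= high, high >= low, low > high)

hand_sorted = sorted([Card(10, "h"), Card(2, "h"), Card(12, "h")])
ranks = [c.rank for c in hand_sorted]
```

Note that defining __eq__ without __hash__ makes instances unhashable; add a matching __hash__ if the cards need to go into sets or dict keys.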