What's the proper way to write comparator as key in Python 3 for sorting?

10,655

This "comparison" function that you came up with is inconsistent: it should provide a definite (deterministic) order, meaning, if you change the order of the elements in the list and run sorted - you should get the same result!

In your case, the order of the elements effects the sorting:

import functools

def my_cmp(x, y):
    return x*5-y*2


l = [50, 2, 1, 9]
print(sorted(l, key=functools.cmp_to_key(my_cmp))) # [2, 1, 9, 50]

l = [50, 1, 2, 9]
print(sorted(l, key=functools.cmp_to_key(my_cmp))) # [1, 2, 9, 50]

which means that your "comparison" function is inconsistent. First provide good ordering function, then it should not be very difficult to convert it to a key function.


Regards the question that you raised in the comments, key accepts a function that takes only a single argument - and returns a "measurement" of "how big is it". The easiest example would be to compare numbers, in that case your key function can simply be: lambda x: x. For any number the lambda expression will returns itself and the comparison is now trivial!

Modifying your example:

def my_key(x):
    return x    

l = [50, 2, 1, 9]
print(sorted(l, key=my_key)) # [1, 2, 9, 50]

A shorter version of the above would be:

l = [50, 2, 1, 9]
print(sorted(l, key=lambda x: x)) # [1, 2, 9, 50]
Share:
10,655
Niner
Author by

Niner

Updated on July 26, 2022

Comments

  • Niner
    Niner almost 2 years

    I'm not sure how to write a comparator in Python 3 as the cmp parameter is removed. Considering the following code in Python 3, how do I rewrite the comparator using only key?

    import functools
    
    def my_cmp(x, y):
        return x*5-y*2
    
    l = [50, 2, 1, 9]
    print(sorted(l, key=functools.cmp_to_key(my_cmp)))
    

    thanks.

  • Niner
    Niner almost 9 years
    The "sorting" function is irrelevant here. it's just for demo purposes, it could very well be (x*3.14+y*2.71) . I just want to know what the proper way is to write a "cmp" function in Python 3 that can compare 2 values.
  • Nir Alfasi
    Nir Alfasi almost 9 years
    @Niner see the last paragraph.
  • Niner
    Niner almost 9 years
    So in your given example of my_kye(x), how do I factor in my custom "fake" formula of (x*5-y*2)? Where is the "Y" is this case?
  • Nir Alfasi
    Nir Alfasi almost 9 years
    @Niner as I said - your example cannot be used since it doesn't include deterministic order. I know it can get a bit confusing but please read the last paragraph again to see why you don't need that second argument (y).
  • Luke Dupin
    Luke Dupin about 8 years
    Its not confusing, python sorting is just limited, thus crappy. There is nothing wrong with sorting values that can't be "measured", that only have an order when compared to another. To say the only type of sorting that anyone could ever want to do is done by objects with a deterministic fitness value is silly. A much better answer to this question is, python doesn't support what you're trying to do.
  • Nir Alfasi
    Nir Alfasi about 8 years
    @LukeDupin I humbly disagree. There is a relation called "partial ordering" and using cmp "supports" it while the newer key approach forces the developer to think it carefully through and provide linear ordering. Forcing the user (developer) to do so prevents nasty bugs. Python language developers made a good choice by moving to linear ordering.
  • Luke Dupin
    Luke Dupin about 8 years
    Simple use case; you have an object, with 4 variables. 3 are integers, call them (scope, type, status), the last variable is a name, a string. There is nothing wrong with wanting to sort by groups of scopes then types then status and then alphabetical. This can be clearly written and maintained with a cmp. A key however requires hacks, or multiple levels of arrays/hashes to accomplish it. I've got nothing wrong with key, I have a huge issue with removing cmp. I appreciate your humble opinion, but key is limited, not in a healthy way.
  • Nir Alfasi
    Nir Alfasi about 8 years
    @LukeDupin key is not limited at all! it's true that key doesn't make your life easier - I never said so, what I did write is that it forces you to write better code because it forces you to think the ordering through. The example in the question shows it best - it might look right, but the OP ran into nondeterministic behavior because it does not define linear ordering well. This kind of behavior would be revealed with cmp only during run-time when the damage might have already been done...
  • Nir Alfasi
    Nir Alfasi about 8 years
    And btw, your example can be easily and clearly be implemented with key as well :)
  • Luke Dupin
    Luke Dupin about 8 years
    Only if you make assumptions on the maximum value for each category, which is sure to break later since a search algo shouldn't have that kind of deep knowledge about what its sorting. We are going to have to disagree on this subject. I will add, your arrogance and that of the python's team in this area of "I know every use case anyone should have" is what turns frameworks and languages into junk. In the mean time, I'm going to find a more flexible search algo or write my own qsort.
  • Nir Alfasi
    Nir Alfasi about 8 years
    @LukeDupin no you don't have to assume anything on the maximum value for each category. I'm sorry that you feel so emotional about it and that you're taking it to a personal level; a good discussion should remain professional and to the point. I truly wish you all the best!
  • Luke Dupin
    Luke Dupin about 8 years
    Post an example solution to the problem I outlined then.
  • Nir Alfasi
    Nir Alfasi about 8 years
    @LukeDupin key=lambda x: x.scope, x.type, x.status, x.name It returns a tuple with the fields by the order you want them sorted and does alphabeitically (dictionary) comparison between the tuples. Nice right ?
  • Luke Dupin
    Luke Dupin about 8 years
    @alfasin whether out of stubbornness or on principle, I still wont concede cmp isn't needed, that is however a very nice solution. I didn't realize you could send tuples of elements into the key like that. Thank you.
  • Nir Alfasi
    Nir Alfasi about 8 years
    @LukeDupin you're most welcome and you're certainly entitled to have your opinion! if you can come up with an example/use-case where cmp can do something that key doesn't (or, that cmp can do easily something that key will require much more work), please post it here.
  • Mad Physicist
    Mad Physicist over 7 years
    @LukeDupin. Take a look at the natsort library for a prime example on how to construct keys that might more easily work with cmp.
  • Luke Dupin
    Luke Dupin over 7 years
    @MadPhysicist Thanks!
  • Mad Physicist
    Mad Physicist over 7 years
    @LukeDupin. No problem. Turns out natsort does exactly the same thing that @alfasin suggests: splits the string into a tuple. I was originally under the impression that the library did something much more magical. Now I am 100% with you in that I would like to have cmp instead of, or at least as an alternative to key.
  • Nir Alfasi
    Nir Alfasi over 7 years
    @MadPhysicist your logic is not clear to me: you're saying that key is doing easily something that you'll have to work harder implementing with cmp and you still prefer cmp ? :)
  • Mad Physicist
    Mad Physicist over 7 years
    @alfasin. Not at all. For a case like natsort, using key is the right thing to do. I am just saying that even with tuple comparison, key is not always going to be an adequate way to express ordering criteria.
  • Mad Physicist
    Mad Physicist over 7 years
    @alfasin. Not sure if this is an example, but what if you wanted to sort ascending on the first element of the tuple and descending on the second?
  • Nir Alfasi
    Nir Alfasi over 7 years
    @MadPhysicist then I would do exactly what I did in the comments above but add a minus sign before the relevant element
  • Mad Physicist
    Mad Physicist over 7 years
    @alfasin. How would you handle non-numerical elements?
  • Nir Alfasi
    Nir Alfasi over 7 years
    @MadPhysicist by providing a function that rates these items.