What's the proper way to write comparator as key in Python 3 for sorting?
This "comparison" function that you came up with is inconsistent: it should provide a definite (deterministic) order, meaning, if you change the order of the elements in the list and run sorted
- you should get the same result!
In your case, the order of the elements effects the sorting:
import functools
def my_cmp(x, y):
return x*5-y*2
l = [50, 2, 1, 9]
print(sorted(l, key=functools.cmp_to_key(my_cmp))) # [2, 1, 9, 50]
l = [50, 1, 2, 9]
print(sorted(l, key=functools.cmp_to_key(my_cmp))) # [1, 2, 9, 50]
which means that your "comparison" function is inconsistent. First provide good ordering function, then it should not be very difficult to convert it to a key
function.
Regards the question that you raised in the comments, key
accepts a function that takes only a single argument - and returns a "measurement" of "how big is it". The easiest example would be to compare numbers, in that case your key function can simply be: lambda x: x
. For any number the lambda expression will returns itself and the comparison is now trivial!
Modifying your example:
def my_key(x):
return x
l = [50, 2, 1, 9]
print(sorted(l, key=my_key)) # [1, 2, 9, 50]
A shorter version of the above would be:
l = [50, 2, 1, 9]
print(sorted(l, key=lambda x: x)) # [1, 2, 9, 50]
Niner
Updated on July 26, 2022Comments
-
Niner almost 2 years
I'm not sure how to write a comparator in Python 3 as the cmp parameter is removed. Considering the following code in Python 3, how do I rewrite the comparator using only key?
import functools def my_cmp(x, y): return x*5-y*2 l = [50, 2, 1, 9] print(sorted(l, key=functools.cmp_to_key(my_cmp)))
thanks.
-
Niner almost 9 yearsThe "sorting" function is irrelevant here. it's just for demo purposes, it could very well be (x*3.14+y*2.71) . I just want to know what the proper way is to write a "cmp" function in Python 3 that can compare 2 values.
-
Nir Alfasi almost 9 years@Niner see the last paragraph.
-
Niner almost 9 yearsSo in your given example of my_kye(x), how do I factor in my custom "fake" formula of (x*5-y*2)? Where is the "Y" is this case?
-
Nir Alfasi almost 9 years@Niner as I said - your example cannot be used since it doesn't include deterministic order. I know it can get a bit confusing but please read the last paragraph again to see why you don't need that second argument (y).
-
Luke Dupin about 8 yearsIts not confusing, python sorting is just limited, thus crappy. There is nothing wrong with sorting values that can't be "measured", that only have an order when compared to another. To say the only type of sorting that anyone could ever want to do is done by objects with a deterministic fitness value is silly. A much better answer to this question is, python doesn't support what you're trying to do.
-
Nir Alfasi about 8 years@LukeDupin I humbly disagree. There is a relation called "partial ordering" and using
cmp
"supports" it while the newerkey
approach forces the developer to think it carefully through and provide linear ordering. Forcing the user (developer) to do so prevents nasty bugs. Python language developers made a good choice by moving to linear ordering. -
Luke Dupin about 8 yearsSimple use case; you have an object, with 4 variables. 3 are integers, call them (scope, type, status), the last variable is a name, a string. There is nothing wrong with wanting to sort by groups of scopes then types then status and then alphabetical. This can be clearly written and maintained with a cmp. A key however requires hacks, or multiple levels of arrays/hashes to accomplish it. I've got nothing wrong with key, I have a huge issue with removing cmp. I appreciate your humble opinion, but key is limited, not in a healthy way.
-
Nir Alfasi about 8 years@LukeDupin key is not limited at all! it's true that key doesn't make your life easier - I never said so, what I did write is that it forces you to write better code because it forces you to think the ordering through. The example in the question shows it best - it might look right, but the OP ran into nondeterministic behavior because it does not define linear ordering well. This kind of behavior would be revealed with cmp only during run-time when the damage might have already been done...
-
Nir Alfasi about 8 yearsAnd btw, your example can be easily and clearly be implemented with key as well :)
-
Luke Dupin about 8 yearsOnly if you make assumptions on the maximum value for each category, which is sure to break later since a search algo shouldn't have that kind of deep knowledge about what its sorting. We are going to have to disagree on this subject. I will add, your arrogance and that of the python's team in this area of "I know every use case anyone should have" is what turns frameworks and languages into junk. In the mean time, I'm going to find a more flexible search algo or write my own qsort.
-
Nir Alfasi about 8 years@LukeDupin no you don't have to assume anything on the maximum value for each category. I'm sorry that you feel so emotional about it and that you're taking it to a personal level; a good discussion should remain professional and to the point. I truly wish you all the best!
-
Luke Dupin about 8 yearsPost an example solution to the problem I outlined then.
-
Nir Alfasi about 8 years@LukeDupin
key=lambda x: x.scope, x.type, x.status, x.name
It returns a tuple with the fields by the order you want them sorted and does alphabeitically (dictionary) comparison between the tuples. Nice right ? -
Luke Dupin about 8 years@alfasin whether out of stubbornness or on principle, I still wont concede cmp isn't needed, that is however a very nice solution. I didn't realize you could send tuples of elements into the key like that. Thank you.
-
Nir Alfasi about 8 years@LukeDupin you're most welcome and you're certainly entitled to have your opinion! if you can come up with an example/use-case where cmp can do something that key doesn't (or, that cmp can do easily something that key will require much more work), please post it here.
-
Mad Physicist over 7 years@LukeDupin. Take a look at the
natsort
library for a prime example on how to construct keys that might more easily work withcmp
. -
Luke Dupin over 7 years@MadPhysicist Thanks!
-
Mad Physicist over 7 years@LukeDupin. No problem. Turns out natsort does exactly the same thing that @alfasin suggests: splits the string into a tuple. I was originally under the impression that the library did something much more magical. Now I am 100% with you in that I would like to have
cmp
instead of, or at least as an alternative tokey
. -
Nir Alfasi over 7 years@MadPhysicist your logic is not clear to me: you're saying that
key
is doing easily something that you'll have to work harder implementing withcmp
and you still prefercmp
? :) -
Mad Physicist over 7 years@alfasin. Not at all. For a case like
natsort
, usingkey
is the right thing to do. I am just saying that even withtuple
comparison,key
is not always going to be an adequate way to express ordering criteria. -
Mad Physicist over 7 years@alfasin. Not sure if this is an example, but what if you wanted to sort ascending on the first element of the tuple and descending on the second?
-
Nir Alfasi over 7 years@MadPhysicist then I would do exactly what I did in the comments above but add a minus sign before the relevant element
-
Mad Physicist over 7 years@alfasin. How would you handle non-numerical elements?
-
Nir Alfasi over 7 years@MadPhysicist by providing a function that rates these items.