Sort the top ten results


Solution 1

Sort the list first and then slice it:

>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> print sorted(lis, key = lambda x : x[1], reverse = True)[:10] #[:10] returns first ten items
[['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]

To get data in list form from that file use this:

with open('abc') as f:
    next(f)  # skip header
    lis = [[city, float(val)] for city, val in (line.split() for line in f)]
    print lis
    # [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
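For Python 3, the same parsing step might look like the sketch below. It assumes a whitespace-separated file with a one-line header and city names without spaces, as in the question. The function name parse_records is hypothetical, and io.StringIO only stands in for the real file so the example runs on its own.

```python
import io

def parse_records(lines):
    """Turn 'City 98.30' style lines into [city, percentage] pairs."""
    rows = iter(lines)
    next(rows, None)  # skip the 'City Percentage' header line
    records = []
    for line in rows:
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        city, value = parts
        records.append([city, float(value)])
    return records

# io.StringIO stands in for open('abc') so the example is self-contained.
sample = io.StringIO("City Percentage\nMumbai  98.30\nLondon 23.23\nAgra    12.22\n")
lis = parse_records(sample)
print(lis)
```

Skipping lines that do not split into exactly two fields is a design choice for this sketch; stricter code might raise an error instead.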

Update:

new_lis = sorted(sc_percentage, key = lambda x : x[1], reverse = True)[:10]
for item in new_lis:
   print item

sorted() returns a new sorted list and leaves the original untouched. Because the list must be ordered by the second item of each element, the key parameter is used.

key = lambda x : x[1] means: use the value at index 1 of each item (i.e. 100.0, 75.0, etc.) for comparison.

reverse = True sorts in descending order, so the largest values come first.
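Put together, the sort-and-slice approach could be wrapped in a small helper, sketched here for Python 3. The name top_n is hypothetical, and plain strings stand in for the ServiceCenter objects from the question.

```python
def top_n(records, n=10):
    """Return the n records with the highest value at index 1."""
    # key picks the percentage; reverse=True puts the largest first.
    return sorted(records, key=lambda rec: rec[1], reverse=True)[:n]

# Plain strings are an assumption; the question's list holds ServiceCenter objects.
sc_percentage = [
    ("DELHI-DLC", 100.0),
    ("DELHI-DLE", 75.0),
    ("DELHI-DLN", 90.909090909090907),
    ("DELHI-DLS", 83.333333333333343),
    ("DELHI-DLW", 92.307692307692307),
]

for name, pct in top_n(sc_percentage, 3):
    print(name, round(pct, 2))
```

Because sorted() builds a new list, sc_percentage keeps its original order after the call.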

Solution 2

If the list is fairly short then, as others have suggested, you can sort it and slice it. If the list is very large, you may be better off using heapq.nlargest():

>>> import heapq
>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> heapq.nlargest(2, lis, key=lambda x:x[1])
[['Mumbai', 98.3], ['London', 23.23]]

The difference is that nlargest makes only a single pass through the data. In fact, if you are reading from a file or another generated source, the records need not all be in memory at the same time.
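As a rough Python 3 illustration of that streaming use, a generator can feed nlargest one record at a time. The generator name read_pairs is hypothetical, the file layout is assumed to match the question, and io.StringIO stands in for a real file.

```python
import heapq
import io

def read_pairs(lines):
    """Yield (city, percentage) tuples one line at a time."""
    rows = iter(lines)
    next(rows, None)  # skip the header line
    for line in rows:
        parts = line.split()
        if len(parts) == 2:
            yield parts[0], float(parts[1])

# io.StringIO stands in for a real file object such as open('abc').
source = io.StringIO("City Percentage\nMumbai 98.30\nLondon 23.23\nAgra 12.22\n")

# nlargest consumes the generator lazily, so only the current top
# candidates and the record being read need to be held at once.
top_two = heapq.nlargest(2, read_pairs(source), key=lambda pair: pair[1])
print(top_two)
```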

You might also be interested in the source for nlargest(), as it works in much the same way that you were trying to solve the problem. It keeps only the desired number of elements in a data structure known as a heap; each new value is pushed onto the heap, and then the smallest value is popped from it.
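The push-then-pop idea can be sketched by hand with the heapq primitives. This is not the actual nlargest source, which is more heavily optimised; the function name top_n_with_heap and the tie-breaking scheme are assumptions made for this illustration.

```python
import heapq

def top_n_with_heap(records, n):
    """Keep the n largest records seen so far in a min-heap."""
    heap = []
    for position, record in enumerate(records):
        # -position makes ties favour earlier records and means the
        # records themselves are never compared with each other.
        entry = (record[1], -position, record)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry, then pop the smallest of the n + 1.
            heapq.heappushpop(heap, entry)
    # A heap is only partially ordered, so sort the survivors.
    return [record for _, _, record in sorted(heap, reverse=True)]

lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
print(top_n_with_heap(lis, 2))
```

The heap never grows beyond n entries, which is why this approach suits very long inputs where only a handful of results are wanted.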

Edit to show comparative timing:

>>> import random
>>> records = []
>>> for i in range(100000):
...     value = random.random() * 100
...     records.append(('city {:2.4f}'.format(value), value))
...
>>> import heapq
>>> import timeit
>>> heapq.nlargest(10, records, key=lambda x:x[1])
[('city 99.9995', 99.99948904248298), ('city 99.9974', 99.99738898315216), ('city 99.9964', 99.99642759230214), ('city 99.9935', 99.99345173704319), ('city 99.9916', 99.99162694442714), ('city 99.9908', 99.99075084123544), ('city 99.9887', 99.98865134685201), ('city 99.9879', 99.98792632193258), ('city 99.9872', 99.98724339718686), ('city 99.9854', 99.98540548350132)]
>>> timeit.timeit('sorted(records, key=lambda x:x[1])[:10]', setup='from __main__ import records', number=10)
1.388942152229788
>>> timeit.timeit('heapq.nlargest(10, records, key=lambda x:x[1])', setup='import heapq;from __main__ import records', number=10)
0.5476185073315492

On my system getting the top 10 from 100 records is fastest by sorting and slicing, but with 1,000 or more records it is faster to use nlargest.
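To repeat the comparison at several sizes on your own machine, a script along these lines might help. It is a sketch for Python 3: the helper names make_records, by_sort and by_heap are hypothetical, and the timings it prints will vary with hardware and Python version.

```python
import heapq
import random
import timeit

def make_records(count, seed=0):
    """Build (name, value) tuples with reproducible random values."""
    rng = random.Random(seed)
    return [("city {}".format(i), rng.random() * 100) for i in range(count)]

def by_sort(records, n=10):
    return sorted(records, key=lambda rec: rec[1], reverse=True)[:n]

def by_heap(records, n=10):
    return heapq.nlargest(n, records, key=lambda rec: rec[1])

for count in (100, 1000, 100000):
    records = make_records(count)
    assert by_sort(records) == by_heap(records)  # same answer either way
    t_sort = timeit.timeit(lambda: by_sort(records), number=10)
    t_heap = timeit.timeit(lambda: by_heap(records), number=10)
    print("{:>7} records: sort {:.4f}s, nlargest {:.4f}s".format(count, t_sort, t_heap))
```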

Solution 3

You have to convert your input into something Python can handle easily:

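with open('input.txt') as inputFile:
    lines = inputFile.readlines()[1:]  # drop the "City Percentage" header line
records = [line.split() for line in lines]
# put the percentage first so that a plain sort orders by it
records = [[float(percentage), city] for city, percentage in records]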

Now records is a list of entries like this:

[[98.3, 'Mumbai'], [23.23, 'London'], [12.22, 'Agra']]

You can sort that list in place, largest percentage first:

records.sort(reverse=True)

You can print the top ten by slicing:

print records[0:10]

If you have a huge list (e.g. millions of entries) and only want the top ten in sorted order, sorting the whole list is a waste of time; a partial selection such as heapq.nlargest() does less work.

Solution 4

To print only the names of the top 10 cities, sort the list, slice it, and keep the first element of each item:

>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> [k[0] for k in sorted(lis, key = lambda x : x[1], reverse = True)[:10]]
['Mumbai', 'London', 'Agra']

For the list given in the question:

>>> lis=[("<ServiceCenter: DELHI-DLC>", 100.0),("<ServiceCenter: DELHI-DLW>", 92.307692307692307),("<ServiceCenter: DELHI-DLE>", 75.0),("<ServiceCenter: DELHI-DLN>", 90.909090909090907),("<ServiceCenter: DELHI-DLS>", 83.333333333333343)]

>>> t=[k[0] for k in sorted(lis, key = lambda x : x[1], reverse = True)[:10]]
>>> print t
['<ServiceCenter: DELHI-DLC>',
'<ServiceCenter: DELHI-DLW>',
'<ServiceCenter: DELHI-DLN>',
'<ServiceCenter: DELHI-DLS>',
'<ServiceCenter: DELHI-DLE>']

The sorted() function returns a new sorted list; the key function extracts the value from each item that is used for comparison.

Author by

onkar

"Do what you like,like what you do"

Updated on June 22, 2022

Comments

  • onkar
    onkar almost 2 years

    I am getting a list in which I am saving the results in the following way

    City Percentage
    Mumbai  98.30
    London 23.23
    Agra    12.22
    .....
    

    List structure is [["Mumbai",98.30],["London",23.23]..]

    I am saving these records in the form of a list. I need the list sorted so that I get the top ten records. Even if I only get the cities, that would be fine.

    I am trying to use the following logic, but it fails to provide accurate data

    if (condition):
        if b not in top_ten:
            top_ten.append(b)   
            top_ten.remove(tmp)
    

    Any other solution,approach is also welcome.

    EDIT 1

    for a in sc_percentage:
        print a
    

    List I am getting

    (<ServiceCenter: DELHI-DLC>, 100.0)
    (<ServiceCenter: DELHI-DLE>, 75.0)
    (<ServiceCenter: DELHI-DLN>, 90.909090909090907)
    (<ServiceCenter: DELHI-DLS>, 83.333333333333343)
    (<ServiceCenter: DELHI-DLW>, 92.307692307692307)
    
  • onkar
    onkar almost 11 years
    The list is calculated locally itself. Can you please simplify the code a bit?
  • onkar
    onkar almost 11 years
    I am using eclipse. The code is not working . sorted(sc_percentage, key = lambda x : x[1], reverse = True)[:10]
  • Ashwini Chaudhary
    Ashwini Chaudhary almost 11 years
    @onkar Please post sc_percentage in question body.
  • onkar
    onkar almost 11 years
    I am unable to get revised list
  • onkar
    onkar almost 11 years
    Hey, I am using the Eclipse environment and am a newbie in Python development. Can you please modify the code and make it simpler? Thank you for understanding
  • tusharmakkar08
    tusharmakkar08 almost 11 years
    Can you please show me the output of the list instead of printing the individual elements ? That is use print sc_percentage instead of printing using for loop .
  • Ashwini Chaudhary
    Ashwini Chaudhary almost 11 years
    @onkar and what does type(sc_percentage) print?
  • onkar
    onkar almost 11 years
    @AshwiniChaudhary I did not get any error, I am unable to see the list. Also please check the edit
  • Ashwini Chaudhary
    Ashwini Chaudhary almost 11 years
    @onkar I've updated my answer, let me know if it works then I'll add some explanation.
  • tusharmakkar08
    tusharmakkar08 almost 11 years
    @onkar You need to add print "new list" explicitly to get the output
  • tusharmakkar08
    tusharmakkar08 almost 11 years
    @AshwiniChaudhary : I was able to visualize the .gif image you mentioned in your profile when your answer got downvoted . :D
  • Ashwini Chaudhary
    Ashwini Chaudhary almost 11 years
    @TusharMakkar Yes it sucks when someone downvotes for no valid reason.
  • onkar
    onkar almost 11 years
    @AshwiniChaudhary dont worry you have no downvotes for dis answer anymore ;)