Pandas: get first 10 elements of a series

25,377

If I understand correctly, you can use:

from itertools import chain

# flatten the nested per-row lists into one list of (term, score) pairs
a = list(chain.from_iterable(df['tfidf_sorted']))
# sort by score, highest first
a.sort(key=lambda x: x[1], reverse=True)
# get the top 10 across all rows
print(a[:10])
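
To make the flattening approach concrete, here is a minimal sketch on toy data. The `rows` list is a hypothetical stand-in for `df['tfidf_sorted']` (a plain list of per-row lists of (term, score) tuples) with made-up scores, and it takes the top 3 rather than the top 10 to keep the output short.

```python
from itertools import chain

# Hypothetical stand-in for df['tfidf_sorted']: one list of (term, score) per row
rows = [
    [("morrell", 45.9), ("football", 25.5), ("club", 3.1)],
    [("melatonin", 48.0), ("lewy", 27.5)],
    [("blues", 36.5), ("harpdog", 20.5)],
]

# Flatten the per-row lists into a single list of pairs
pairs = list(chain.from_iterable(rows))
# Sort all pairs by score, highest first
pairs.sort(key=lambda x: x[1], reverse=True)
# Top 3 across every row combined, not per row
top = pairs[:3]
print(top)
```

Note that the result mixes terms from different rows, because the row boundaries are lost when the lists are flattened.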

Or, if you need the top 10 per row, add [:10]:

df['tfidf_sorted'] = df['tfidf'].apply(lambda y: (sorted(y.items(), key=lambda x: x[1], reverse=True))[:10])
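
Since `tfidf_sorted` in the question is already sorted, slicing each row's list may be enough, without sorting again. The following is a minimal sketch on a toy frame, not the original data: the scores are made up, the `top_n` column name is hypothetical, and it keeps the top 2 per row instead of the top 10.

```python
import pandas as pd

# Toy frame with made-up scores; each cell of 'tfidf' is a dict of term -> score
df = pd.DataFrame({"tfidf": [
    {"football": 25.5, "morrell": 45.9, "club": 3.1},
    {"lewy": 27.5, "melatonin": 48.0, "sleep": 1.2},
]})

# Sort each dict's items by score, highest first (as in the question)
df["tfidf_sorted"] = df["tfidf"].apply(
    lambda y: sorted(y.items(), key=lambda x: x[1], reverse=True)
)

# The lists are already sorted, so slicing each one gives the top n per row
n = 2
df["top_n"] = df["tfidf_sorted"].apply(lambda pairs: pairs[:n])
print(df["top_n"].tolist())
```

Writing the result to a separate column keeps the full sorted list available, whereas the one-liner above overwrites it with the truncated version.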
Author by

chintan s

Updated on October 05, 2020

Comments

  • chintan s
    chintan s over 3 years

    I have a data frame with a column tfidf_sorted as follows:

       tfidf_sorted
    
    0  [(morrell, 45.9736796), (football, 25.58352014...
    1  [(melatonin, 48.0010051405), (lewy, 27.5842077...
    2  [(blues, 36.5746634797), (harpdog, 20.58669641...
    3  [(lem, 35.1570832476), (rottensteiner, 30.8800...
    4  [(genka, 51.4667410433), (legendaarne, 30.8800...
    

    type(df.tfidf_sorted) returns pandas.core.series.Series.

    This column was created as follows:

    df['tfidf_sorted'] = df['tfidf'].apply(lambda y: sorted(y.items(), key=lambda x: x[1], reverse=True))
    

    where each value in the tfidf column is a dictionary.

    How do I get the first 10 key-value pairs from tfidf_sorted?

  • chintan s
    chintan s over 7 years
    Thanks! The second answer worked. For the first one, do I need to import a library?
  • jezrael
    jezrael over 7 years
    Yes, I added it to the answer. But the first answer returns the top 10 of all values across all rows.
  • chintan s
    chintan s over 7 years
    Thanks. The second answer is what I was looking for.