Selecting a particular row from a groupby object in Python
I cobbled this together from the answers to "Python : Getting the Row which has the max value in groups using groupby".
Group by the 'id' column, then call transform on the 'year' column. This returns each group's maximum year aligned to the original rows.
Comparing that result with the 'year' column gives a boolean index that is True wherever a row's year equals the max year for its 'id':
In [103]:
df[df.groupby('id')['year'].transform('max') == df['year']]
Out[103]:
   id  marks  year
0   1     18  2013
2   3     16  2014
4   1     19  2013
6   2     18  2014
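To make the above reproducible, here is a minimal sketch that rebuilds the frame from the data in the question and applies the same transform-and-mask idea. The variable names `latest_year` and `latest_rows` are my own, chosen for illustration. Note that it keeps every row tied for the latest year, which is why id 1 appears twice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":    [1, 1, 3, 2, 1, 3, 2],
    "marks": [18, 25, 16, 16, 19, 25, 18],
    "year":  [2013, 2012, 2014, 2013, 2013, 2013, 2014],
})

# Broadcast each group's maximum year back onto every row of that group.
latest_year = df.groupby("id")["year"].transform("max")

# Keep the rows whose year equals their group's maximum (ties are all kept).
latest_rows = df[df["year"] == latest_year]
print(latest_rows)
```

Because transform returns a Series with the same index as `df`, the comparison lines up row by row and the mask can be used directly for selection.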
Author by
Shiva Prakash
Data Science enthusiast with hands on experience in Python, R, SQL and Tableau.
Updated on July 18, 2022
Comments
Shiva Prakash almost 2 years
id  marks  year
1   18     2013
1   25     2012
3   16     2014
2   16     2013
1   19     2013
3   25     2013
2   18     2014
Suppose I now group the above on id with this Python command:
grouped = file.groupby(file.id)
I would like to get a new file containing, for each group, only the row with the most recent year, that is, the highest year in the group.
Please let me know the command. I am trying with apply, but it only gives me a boolean expression. I want the entire row with the latest year.
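If exactly one row per id is wanted even when several rows share the latest year, one possible approach (a sketch, not necessarily what was originally intended) is to look up the index label of each group's maximum year with idxmax and select those rows with loc. The names `idx` and `one_per_id` are illustrative. When there are ties, idxmax picks the first matching row.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":    [1, 1, 3, 2, 1, 3, 2],
    "marks": [18, 25, 16, 16, 19, 25, 18],
    "year":  [2013, 2012, 2014, 2013, 2013, 2013, 2014],
})

# For each id, find the index label of the first row holding the maximum year.
idx = df.groupby("id")["year"].idxmax()

# Select the full rows at those labels: one row per id.
one_per_id = df.loc[idx]
print(one_per_id)
```

The result is ordered by id because groupby sorts the group keys by default.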