Create dataframe with row name, column name, and max column value

12,377

Solution 1

Use numpy.where for indices for matched maxes and create new DataFrame by indexing - better if performance is important in large DataFrame:

c, r = np.where(df.eq(df.max()).T)
df = pd.DataFrame({'idx':df.index[r], 'cols':df.columns[c], 'vals': df.values[r, c]})
print(df)
     idx     cols  vals
0  row_3     dogs    86
1  row_9     dogs    86
2  row_4     cats    80
3  row_8    sheep    89
4  row_6  lizards    88
5  row_7  lizards    88
6  row_1    birds    82

Another only pandas solution with DataFrame.unstack and GroupBy.transform for compare max values per groups by first level:

s = df.unstack()
df = s[s.groupby(level=0).transform('max').eq(s)].reset_index(name='vals')
print(df)
   animals   Rows  vals
0     dogs  row_3    86
1     dogs  row_9    86
2     cats  row_4    80
3    sheep  row_8    89
4  lizards  row_6    88
5  lizards  row_7    88
6    birds  row_1    82

Solution 2

You may check with using .where and eq to mask the no max value as NaN then stack

df.where(df.eq(df.max())).stack().sort_index(level=1).reset_index()
    Rows  animals     0
0  row_3     dogs  86.0
1  row_9     dogs  86.0
2  row_4     cats  80.0
3  row_8    sheep  89.0
4  row_6  lizards  88.0
5  row_7  lizards  88.0
6  row_1    birds  82.0
Share:
12,377

Related videos on Youtube

MarkS
Author by

MarkS

Updated on June 04, 2022

Comments

  • MarkS
    MarkS almost 2 years

    I am going through a video package I bought from PACKT to learn pandas. The author used jijna2 style() to highlight the max value in each column. I quickly discovered that I can't use that technique in PyCharm. So I decided to extract the values.

    What I am trying to do is create a three column dataframe by extracting the row index, column name and max column value from a dataframe with N columns, and then create the new dataframe. The new dataframe would show each row (if there are ties show all appropriate rows), the column and the max value in that column.

    I created a toy dataframe just to work work through the code.

    My code is below, along with the output and, at the very bottom, is what I actually want the new dataframe to look like.

    I know I am using a print statement. That code is the only thing I have used so far that correctly picks up multiple rows if I have a tie.

    I grab the whole row, which I don't want. I also am not sure how to construct the proposed new dataframe from the extracted data.

    import pandas as pd
    
    
    raw_data = {
                'dogs': [42, 39, 86, 15, 23, 57, 68, 81, 86],
                'cats': [52, 41, 79, 80, 34, 47, 19, 22, 59],
                'sheep': [62, 37, 84, 51, 67, 32, 23, 89, 73],
                'lizards': [72, 43, 36, 26, 53, 88, 88, 34, 69],
                'birds': [82, 35, 77, 63, 18, 12, 45, 56, 58],
                }
    
    df = pd.DataFrame(raw_data,
                      index=pd.Index(['row_1', 'row_2', 'row_3', 'row_4', 'row_5', 'row_6', 'row_7', 'row_8', 'row_9'], name='Rows'),
                      columns=pd.Index(['dogs', 'cats', 'sheep', 'lizards', 'birds'], name='animals'))
    
    print(df)
    print()
    
    # Get a list of all columns names
    cols = df.columns
    print(cols)
    print('*****')
    
    for col in cols:
        print((df[df[col] == df[col].max()]))
    
    
    '''
    animals  dogs  cats  sheep  lizards  birds
    Rows                                      
    row_3      86    79     84       36     77
    row_9      86    59     73       69     58
    animals  dogs  cats  sheep  lizards  birds
    Rows                                      
    row_4      15    80     51       26     63
    animals  dogs  cats  sheep  lizards  birds
    Rows                                      
    row_8      81    22     89       34     56
    animals  dogs  cats  sheep  lizards  birds
    Rows                                      
    row_6      57    47     32       88     12
    row_7      68    19     23       88     45
    animals  dogs  cats  sheep  lizards  birds
    Rows                                      
    row_1      42    52     62       72     82
    '''
    
    row_3     dogs        86
    row_9     dogs        86
    row_4     cats        80
    row_8     sheep       89
    row_6     lizards     88
    row_7     lizards     88
    row_1     birds       82
    
  • BENY
    BENY about 5 years
    @jezrael not sure why happen again, I do vote for u