Python (Pandas) error 'the label [Algeria] is not in the [index]'

The problem is how the boolean indexing is combined with loc:

df[(df['Gold']>0) & (df['Gold.1']>0)]

returns a filtered DataFrame that does not contain the index label of the maximum value of the Series you computed from the original, unfiltered df:

((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()

In your data that label is Algeria.

Algeria was removed by the filter, so loc raises a KeyError when asked for it.
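
A minimal sketch that reproduces the situation. The medal counts below are made up for illustration; only the column names and the Algeria label come from the question.

```python
import pandas as pd

# Hypothetical data: Algeria has Gold.1 == 0, so the filter drops it.
df = pd.DataFrame(
    {"Gold": [5, 10, 3], "Gold.1": [0, 8, 1], "Gold.2": [5, 18, 4]},
    index=["Algeria", "Norway", "Spain"],
)

filtered = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]

# The ratio is computed on the unfiltered df, so Algeria is still a candidate.
ratio = ((df["Gold"] - df["Gold.1"]) / (df["Gold"] + df["Gold.1"] + df["Gold.2"])).abs()
label = ratio.idxmax()

try:
    filtered.loc[label]
    lookup_failed = False
except KeyError:
    # The label exists in df but not in the filtered frame.
    lookup_failed = True

print(label, lookup_failed)
```

With this data the label is Algeria and the lookup fails, because the label came from a different (larger) index than the one loc is searching.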

One possible solution is to assign the filtered DataFrame to df1 and then compute idxmax on df1 itself, so that the label of the maximum is guaranteed to be in its index:

df1 = df[(df['Gold']>0) & (df['Gold.1']>0)]
df2 = df1.loc[((df1['Gold']-df1['Gold.1'])/(df1['Gold']+df1['Gold.1']+df1['Gold.2'])).abs().idxmax()]
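
The same fix split into steps, as a sketch on made-up data (the values and the names ratio, best_label and best_row are only illustrative):

```python
import pandas as pd

# Hypothetical data with the column names from the question.
df = pd.DataFrame(
    {"Gold": [5, 10, 3], "Gold.1": [0, 8, 1], "Gold.2": [5, 18, 4]},
    index=["Algeria", "Norway", "Spain"],
)

# 1. Filter first and keep the result.
df1 = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]

# 2. Compute the ratio on the filtered frame only.
ratio = ((df1["Gold"] - df1["Gold.1"]) / (df1["Gold"] + df1["Gold.1"] + df1["Gold.2"])).abs()

# 3. The label now comes from df1's own index, so loc can find it.
best_label = ratio.idxmax()
best_row = df1.loc[best_label]

print(best_label)
```

Because both the ratio and the lookup use df1, the result no longer depends on whether the overall maximum survives the filter.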
Author by

YohanRoth

Updated on June 04, 2022

Comments

  • YohanRoth
    YohanRoth almost 2 years

    I do not understand why this works

    df[(df['Gold']>0) & (df['Gold.1']>0)].loc[((df['Gold'] - df['Gold.1'])/(df['Gold'])).abs().idxmax()]
    

    but when I divide by (df['Gold'] + df['Gold.1'] + df['Gold.2']) instead, it stops working and gives me the error shown below.

    Interestingly, the following line works

    df.loc[((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()]
    

    I do not understand what is happening since I just started to learn Python and Pandas. I need to understand the reason why this happens and how to fix it.

    ERROR

    KeyError: 'the label [Algeria] is not in the [index]'

    [Image: snapshot of the DataFrame]

  • YohanRoth
    YohanRoth over 7 years
    I did not really get this: "return df which not contains index of max value of Series:". So you are saying the max value is not in the data frame that is returned after the boolean operation? I thought we first perform the boolean filter, then find the max value on what's been filtered. Isn't that how it works?
  • jezrael
    jezrael over 7 years
    No, because although you filter the DataFrame, you don't use the filtered values in ((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax(); you use the original, unfiltered df. By the way, this is a very hard error to debug, because sometimes it works fine (if the filtered DataFrame contains the idxmax label) and sometimes it fails once the values change. If ((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax() returns Algeria, you can see that Gold.1 == 0 for that row, so it does not satisfy (df['Gold.1'] > 0).
  • YohanRoth
    YohanRoth over 7 years
    Hmm, thanks. That is so weird. What's even the point of allowing code to be written like this when it causes such subtle errors and does not work the way you'd expect? I expected it to be evaluated from left to right. Instead it works so weirdly :( Anyway, thanks!
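
For the left-to-right reading expected in the last comment, one option is to pass a function to loc: pandas calls it with the frame being indexed, so the ratio is computed on the filtered rows. This is a sketch on made-up data, and max_ratio_label is a hypothetical helper name, not something from the thread.

```python
import pandas as pd

# Hypothetical data with the column names from the question.
df = pd.DataFrame(
    {"Gold": [5, 10, 3], "Gold.1": [0, 8, 1], "Gold.2": [5, 18, 4]},
    index=["Algeria", "Norway", "Spain"],
)

def max_ratio_label(d):
    # d is whatever frame .loc is called on, here the filtered one.
    ratio = (d["Gold"] - d["Gold.1"]) / (d["Gold"] + d["Gold.1"] + d["Gold.2"])
    return ratio.abs().idxmax()

# The callable is evaluated against the filtered frame, not against df.
row = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)].loc[max_ratio_label]

print(row.name)
```

In the original one-liner, the expression inside the brackets refers to df by name, so it is evaluated on the full frame regardless of what stands to the left of .loc.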