Python (Pandas) error 'the label [Algeria] is not in the [index]'
Your problem is the boolean indexing:
df[(df['Gold']>0) & (df['Gold.1']>0)]
returns a filtered DataFrame that does not contain the index label of the Series maximum, which you calculated on the unfiltered df with this:
((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()
In your data that label is Algeria, a row the filter removes.
So loc raises a KeyError.
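The mismatch can be reproduced with a small sketch. The three-row table below is hypothetical toy data (the real medal table is not shown in the question); only the column names Gold, Gold.1 and Gold.2 come from the original code.

```python
import pandas as pd

# Hypothetical toy data, chosen so that Algeria has Gold.1 == 0.
df = pd.DataFrame(
    {"Gold": [5, 10, 8], "Gold.1": [0, 9, 6], "Gold.2": [5, 19, 14]},
    index=["Algeria", "Norway", "Sweden"],
)

# The label is computed on the UNFILTERED frame.
ratio = ((df["Gold"] - df["Gold.1"])
         / (df["Gold"] + df["Gold.1"] + df["Gold.2"])).abs()
label = ratio.idxmax()

# The filter drops every row where Gold or Gold.1 is zero.
filtered = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]
label_survives = label in filtered.index

# Looking up a label that was filtered out raises KeyError.
try:
    filtered.loc[label]
    raised = False
except KeyError:
    raised = True

print(label, label_survives, raised)
```

Whether the lookup fails depends entirely on whether the label returned by idxmax survives the filter, which is why the same expression can work on one dataset and fail on another.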
One possible solution is to assign the filtered DataFrame to df1 and then compute the index label of the Series maximum on df1 with idxmax:
df1 = df[(df['Gold']>0) & (df['Gold.1']>0)]
df2 = df1.loc[((df1['Gold']-df1['Gold.1'])/(df1['Gold']+df1['Gold.1']+df1['Gold.2'])).abs().idxmax()]
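As a rough check of this approach, here is the same idea on hypothetical toy data (the values are invented for illustration; only the column names come from the question). Because both the ratio and the lookup use df1, the label returned by idxmax is guaranteed to exist in df1.

```python
import pandas as pd

# Hypothetical toy data, not the real medal table.
df = pd.DataFrame(
    {"Gold": [5, 10, 8], "Gold.1": [0, 9, 6], "Gold.2": [5, 19, 14]},
    index=["Algeria", "Norway", "Sweden"],
)

# Filter first, then do every later computation on the filtered frame.
df1 = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]
ratio = ((df1["Gold"] - df1["Gold.1"])
         / (df1["Gold"] + df1["Gold.1"] + df1["Gold.2"])).abs()

best = ratio.idxmax()   # label taken from df1's own index
row = df1.loc[best]     # cannot raise KeyError for a missing label

print(best)
print(row)
```

Splitting the expression into named steps (df1, ratio, best) is a stylistic choice here; it makes it easier to see which frame each step operates on.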
YohanRoth
Updated on June 04, 2022
Comments
-
YohanRoth almost 2 years
I do not understand why this works
df[(df['Gold']>0) & (df['Gold.1']>0)].loc[((df['Gold'] - df['Gold.1'])/(df['Gold'])).abs().idxmax()]
but when I divide by
(df['Gold'] + df['Gold.1'] + df['Gold.2'])
it stops working, giving me the error that you can find below. Interestingly, the following line works:
df.loc[((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()]
I do not understand what is happening since I just started to learn Python and Pandas. I need to understand the reason why this happens and how to fix it.
ERROR
KeyError: 'the label [Algeria] is not in the [index]'
-
YohanRoth over 7 years
I did not really get this: "return df which not contains index of max value of Series". So you are saying the max value is not in the DataFrame that is returned after the boolean operation? I thought we first apply the boolean filter and then find the max value in what was filtered. Isn't that how it works?
-
jezrael over 7 years
No, because although you filter the DataFrame, you do not use the filtered values in
((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()
but the original, unfiltered ones. By the way, this error is very hard to debug, because sometimes it works fine (when the filtered DataFrame still contains the idxmax label) and sometimes it fails after the values change. If
((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()
returns Algeria, you can see that Gold.1 == 0 for that row, so it does not satisfy (df['Gold.1']>0).
-
YohanRoth over 7 years
Hmm, thanks. That is so weird. What is even the point of allowing code like this when it causes such subtle errors and does not work the way one expects? I expected it to be evaluated from left to right. Instead it works so weirdly :( Anyway, thanks!