Pandas Fillna of Multiple Columns with Mode of Each Column
Solution 1
If you want to impute missing values with the mode in some columns of a DataFrame df, you can call fillna with a Series, created by selecting the first row of the mode DataFrame by position with iloc:
cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
Or, reusing the mode DataFrame computed in the question:
df[cols]=df[cols].fillna(mode.iloc[0])
Your solution, adapted to assign the result back:
df[cols]=df.filter(cols).fillna(mode.iloc[0])
Sample:
import numpy as np
import pandas as pd

df = pd.DataFrame({'workclass':['Private','Private',np.nan, 'another', np.nan],
                   'native-country':['United-States',np.nan,'Canada',np.nan,'United-States'],
                   'col':[2,3,7,8,9]})
print (df)
   col native-country workclass
0    2  United-States   Private
1    3            NaN   Private
2    7         Canada       NaN
3    8            NaN   another
4    9  United-States       NaN
mode = df.filter(["workclass", "native-country"]).mode()
print (mode)
  workclass native-country
0   Private  United-States
cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
print (df)
   col native-country workclass
0    2  United-States   Private
1    3  United-States   Private
2    7         Canada   Private
3    8  United-States   another
4    9  United-States   Private
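Why the Series matters: the sketch below contrasts passing the mode DataFrame with passing its first row. It assumes the usual pandas alignment behaviour, where a DataFrame passed to fillna is matched on both index and columns, so only row label 0 can receive a value, while a Series is matched on column names only. The names by_frame and by_series are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "workclass": ["Private", "Private", np.nan, "another", np.nan],
    "native-country": ["United-States", np.nan, "Canada", np.nan, "United-States"],
    "col": [2, 3, 7, 8, 9],
})
cols = ["workclass", "native-country"]

# mode() returns a DataFrame with one row per tied mode (here a single row, label 0)
mode = df[cols].mode()

# DataFrame as fill value: aligned on index AND columns,
# so NaNs in rows 1..4 have no matching value and stay NaN
by_frame = df[cols].fillna(mode)

# Series as fill value: its index holds the column names,
# so every NaN in a column gets that column's mode
by_series = df[cols].fillna(mode.iloc[0])

print(by_frame.isna().sum().sum())   # NaNs left when filling with the DataFrame
print(by_series.isna().sum().sum())  # NaNs left when filling with the Series
```

Note that fillna returns a new object in both cases, so the result still has to be assigned back, as in df[cols] = ... above.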
Solution 2
You can do it like this:
df[["workclass", "native-country"]]=df[["workclass", "native-country"]].fillna(value=mode.iloc[0])
For example,
import pandas as pd
d={
'key3': [1,4,4,4,5],
'key2': [6,6,4],
'key1': [6,4,4],
}
df=pd.DataFrame.from_dict(d,orient='index').transpose()
Then df is:
  key3 key2 key1
0    1    6    6
1    4    6    4
2    4    4    4
3    4  NaN  NaN
4    5  NaN  NaN
Then by doing:
l=df.filter(["key1", "key2"]).mode()
df[["key1", "key2"]]=df[["key1", "key2"]].fillna(value=l.iloc[0])
we get that df is:
  key3 key2 key1
0    1    6    6
1    4    6    4
2    4    4    4
3    4    6    4
4    5    6    4
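A variation on the same idea, not part of the answer above: fillna also accepts a dict mapping column names to fill values, which avoids the column-subset assignment. This is a minimal sketch that builds the frame directly with None for the missing entries (so the key columns are float rather than object); fill_values and filled are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "key3": [1, 4, 4, 4, 5],
    "key2": [6, 6, 4, None, None],
    "key1": [6, 4, 4, None, None],
})
cols = ["key1", "key2"]

# One fill value per column: the first mode of that column.
# Caveat: for a column that is entirely NaN, mode() is empty and .iloc[0] raises IndexError.
fill_values = {c: df[c].mode().iloc[0] for c in cols}

# Columns not listed in the dict (key3 here) are left untouched
filled = df.fillna(value=fill_values)
print(filled)
```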
Author: Nick
Updated on July 25, 2022
Comments
-
Nick almost 2 years
Working with census data, I want to replace NaNs in two columns ("workclass" and "native-country") with the respective modes of those two columns. I can get the modes easily:
mode = df.filter(["workclass", "native-country"]).mode()
which returns a dataframe:
  workclass native-country
0   Private  United-States
However,
df.filter(["workclass", "native-country"]).fillna(mode)
does not replace the NaNs in each column with anything, let alone the mode corresponding to that column. Is there a smooth way to do this?
-
Mactilda over 4 years
When I do this I get the following message:
```
/anaconda3/envs/exts-ml/lib/python3.6/site-packages/pandas/core/frame.py:4024: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
```
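The comment does not show how df was built, so the cause here is a guess: this warning commonly appears when df is itself a filtered or sliced view of another DataFrame. A minimal sketch under that assumption, with hypothetical data and names (raw, adults, age), takes an explicit copy before assigning:

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "workclass": ["Private", "Other", np.nan, "Private"],
    "age": [25, 17, 40, 33],
})

# .copy() makes the subset an independent DataFrame,
# so the assignment below does not target a slice of raw
adults = raw[raw["age"] >= 18].copy()

cols = ["workclass"]
adults[cols] = adults[cols].fillna(adults[cols].mode().iloc[0])

print(adults)
```

The original frame raw is left unchanged; only the copy is filled.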