Replace a string value with NaN in pandas data frame - Python

28,083

Solution 1

I think you forgot to assign the result back:

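import numpy as np
import pandas as pd

data = pd.DataFrame([[1,'?',5],['?','?',4],['?',32.1,1]])

# replace() returns a new DataFrame, so assign the result back
data = data.replace('?', np.nan)
# alternative: modify the DataFrame in place
#data.replace('?', np.nan, inplace=True)
print(data)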
     0     1  2
0  1.0   NaN  5
1  NaN   NaN  4
2  NaN  32.1  1

print(data.isnull())
       0      1      2
0  False   True  False
1   True   True  False
2   True  False  False
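
As a minimal sketch of why the assignment matters, the snippet below counts missing cells before and after the replacement. The names `replaced`, `missing_before` and `missing_after` are illustrative only and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame([[1, '?', 5], ['?', '?', 4], ['?', 32.1, 1]])

# replace() returns a new DataFrame; `data` itself is left untouched here.
replaced = data.replace('?', np.nan)

# Count missing cells in the original and in the returned copy.
missing_before = int(data.isnull().sum().sum())     # original still holds '?'
missing_after = int(replaced.isnull().sum().sum())  # '?' cells are now NaN

print(missing_before, missing_after)
```

Without the assignment (or `inplace=True`), the result of `replace` is simply discarded, which would explain an all-False `isnull()` table.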

Solution 2

# a dataframe with string values
dat = pd.DataFrame({'a':[1,'FG', 2, 4], 'b':[2, 5, 'NA', 7]})

Removing non-numerical elements from the DataFrame:

"Method 1 - with regex"
dat2 = dat.replace(r'^([A-Za-z]|[0-9]|_)+$', np.nan, regex=True)
dat2

"Method 2 - with pd.to_numeric"
dat3 = pd.DataFrame()
for col in dat.columns:
    dat3[col] = pd.to_numeric(dat[col], errors='coerce')
dat3
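
The column loop in Method 2 can probably be written more compactly with `DataFrame.apply`, which forwards extra keyword arguments to the function it calls. This is only a sketch of an equivalent approach, not part of the original answer; the name `dat_numeric` is made up for illustration.

```python
import pandas as pd

# Same example frame as above: mostly numbers, with a few stray strings.
dat = pd.DataFrame({'a': [1, 'FG', 2, 4], 'b': [2, 5, 'NA', 7]})

# Apply pd.to_numeric to each column; values that cannot be parsed become NaN.
dat_numeric = dat.apply(pd.to_numeric, errors='coerce')

print(dat_numeric)
print(dat_numeric.dtypes)
```

Because NaN is a float, the coerced columns end up with a float dtype even where the remaining values were integers.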

Solution 3

'?' is not a null value, so you should expect False everywhere from the isnull test:

>>> data = pandas.DataFrame([[1,'?',5],['?','?',4],['?',32.1,1]])
>>> data.isnull()
       0      1      2
0  False  False  False
1  False  False  False
2  False  False  False

After you replace '?' with NaN, the test gives a different result:

>>> data = data.replace('?', np.nan)
>>> data.isnull()
       0      1      2
0  False   True  False
1   True   True  False
2   True  False  False
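
The same point can be checked on single values with `pd.isna`, the scalar-friendly counterpart of `isnull`. This is a small illustrative sketch; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# '?' is an ordinary string, so pandas does not treat it as missing.
question_mark_is_missing = pd.isna('?')

# NaN and None are the values pandas does recognise as missing.
nan_is_missing = pd.isna(np.nan)
none_is_missing = pd.isna(None)

print(question_mark_is_missing, nan_is_missing, none_is_missing)
```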

Solution 4

I believe that data.replace('?', np.nan) is not performed in place, so you must assign the result back:

data = data.replace('?', np.nan)

Author: stefanodv

Updated on October 08, 2021

Comments

  • stefanodv, about 2 years

    I need to replace the value '?' with NaN so that I can call the .isnull() method. I have found several solutions, but they always return some error. Suppose:

    data = pd.DataFrame([[1,'?',5],['?','?',4],['?',32.1,1]])
    

    and if I try:

    data.replace('?', np.nan)
    

    I get:

         0     1  2
    0  1.0   NaN  5
    1  NaN   NaN  4
    2  NaN  32.1  1    
    

    but data.isnull() returns:

           0      1      2
    0  False  False  False
    1  False  False  False
    2  False  False  False
    

    Why?