pandas.to_dict returns None mixed with nan

python-3.x pandas

11,562

I think you can only replace, it is not possible control in to_dict:

L = [{
     "COL1":"VAL1"
     ,"COL2":np.nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":np.nan
}]

df = pd.DataFrame(L).replace({np.nan:None})
print (df)
   COL1  COL2  COL3
0  VAL1  None  VAL3
1  None  VAL2  VAL3
2  VAL1  VAL2  None

print (df.to_dict(orient='records'))
[{'COL3': 'VAL3', 'COL2': None, 'COL1': 'VAL1'}, 
 {'COL3': 'VAL3', 'COL2': 'VAL2', 'COL1': None}, 
 {'COL3': None, 'COL2': 'VAL2', 'COL1': 'VAL1'}]

11,562

Piotr Kamoda

I'm currently working in Warsaw, mostly in various dbms and Informatica, trying to work my way around some ridiculous tasks.

Updated on June 04, 2022

Comments

Piotr Kamoda almost 2 years
I've stumbled upon a minor problem with pandas and it's method to_dict. I have a table that I'm certain have equal number of identical columns in each row, let's say it looks like that:
```
+----|----|----+
|COL1|COL2|COL3|
+----|----|----+
|VAL1|    |VAL3|
|    |VAL2|VAL3|
|VAL1|VAL2|    |
+----|----|----+
```
When I do df.to_dict(orient='records') I get:
```
[{
     "COL1":"VAL1"
     ,"COL2":nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":nan
}]
```
Notice nan's in some columns and None's in other (always the same, there appears to be no nan and None in same column)

And when I do json.loads(df.to_json(orient='records')) i get only None and no nan's (which is desired output).

Like this:
```
[{
     "COL1":"VAL1"
     ,"COL2":None
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":None
}]
```
I would appreciate some explanation as to why it happens and if it can be controlled in some way.

==EDIT==

According to comments it would be better to first replace those nan's with None's, but those nan's are not np.nan:
```
>>> a = df.head().ix[0,60]
>>> a
nan
>>> type(a)
<class 'numpy.float64'>
>>> a is np.nan
False
>>> a == np.nan
False
```
Piotr Kamoda about 7 years

Unfortunately this does not work :P this np.NaN is not equal to what I have in the table (which is strange)
jezrael about 7 years

VALs are numeric or strings ?
Piotr Kamoda about 7 years

string - data is loaded from json
jezrael about 7 years

Hmm, problem is in real data or with sample too? What is your version of python and pandas?
Piotr Kamoda about 7 years

pandas == '0.19.2' python == Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
jezrael about 7 years

It is weird, I use same version of python and pandas nad for me it works. :(
jezrael about 7 years

I see your edted question, need for check NaN something else print (pd.isnull(pd.Series(np.nan)))
Piotr Kamoda about 7 years

It outputs True
Piotr Kamoda about 7 years

Nope, unfortunately no. I even tried: >>> a = df.ix[0,60] >>> a nan >>> df.head().replace({a:None})
jezrael about 7 years

and this a = df.replace({np.nan:None}).ix[0,60] ? What is a ?
Piotr Kamoda about 7 years

>>> a = df.replace({np.nan:None}).ix[0,60] >>> a nan >>> type(a) <class 'numpy.float64'>
jezrael about 7 years

Hmmm, and what about print (df.to_json(orient='records')) ? - to_json
Piotr Kamoda about 7 years

>>> print (df.head().ix[:,60].to_json(orient='records')) [null,null,null,null,null] ¯\(°_o)/¯
jezrael about 7 years

Yes, you are right. I find this - Note NaN‘s, NaT‘s and None will be converted to null and datetime objects will be converted based on the date_format and date_unit parameters.. So if replace dont work, use only your solution :(