pandas read_csv column dtype is set to decimal but converts to string
I think you need converters:
import pandas as pd
import io
import decimal as D
temp = u"""a,b,c,d
1,1,1,1.0"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 dtype={'a': int, 'b': float},
                 converters={'c': D.Decimal, 'd': D.Decimal})
print(df)
   a    b  c    d
0  1  1.0  1  1.0
for i, v in df.iterrows():
    print(type(v.a), type(v.b), type(v.c), type(v.d))
<class 'int'> <class 'float'> <class 'decimal.Decimal'> <class 'decimal.Decimal'>
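If the file has already been read, the same conversion can be done afterwards, column by column. The following is only a sketch, not part of the original answer: it assumes the decimal columns were read as plain strings (`dtype=str`) so that no float rounding happens first, and the extra data row is made up for illustration.

```python
import decimal as D
import io

import pandas as pd

# Illustrative data; the second row is an assumption added for this sketch.
csv_text = "a,b,c,d\n1,1,1,1.0\n2,3,4,5.50"

# Read 'c' and 'd' as strings so the original text reaches Decimal untouched.
df = pd.read_csv(io.StringIO(csv_text),
                 dtype={'a': int, 'b': float, 'c': str, 'd': str})

# Series.map calls D.Decimal on every element, unlike astype(D.Decimal).
for col in ('c', 'd'):
    df[col] = df[col].map(D.Decimal)

print(df.dtypes)
print(type(df.loc[0, 'd']))
```

Passing `converters` to `read_csv` does this in one step; mapping afterwards is mainly useful when the DataFrame comes from somewhere else.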
Author: candleford
Updated on June 05, 2022
Comments
-
candleford almost 2 years
I am using pandas (v0.18.1) to import the following data from a file called 'test.csv':
a,b,c,d
1,1,1,1.0
I have set the dtype to 'decimal.Decimal' for columns 'c' and 'd' but instead they return as type 'str'.
import pandas as pd
import decimal as D
df = pd.read_csv('test.csv', dtype={'a': int, 'b': float, 'c': D.Decimal, 'd': D.Decimal})
for i, v in df.iterrows():
    print(type(v.a), type(v.b), type(v.c), type(v.d))
Results:
`<class 'int'> <class 'float'> <class 'str'> <class 'str'>`
I have also tried converting to decimal explicitly after import with no luck (converting to float works but not decimal).
df.c = df.c.astype(float)
df.d = df.d.astype(D.Decimal)
for i, v in df.iterrows():
    print(type(v.a), type(v.b), type(v.c), type(v.d))
Results:
`<class 'int'> <class 'float'> <class 'float'> <class 'str'>`
The following code converts a 'str' to 'decimal.Decimal' so I don't understand why pandas doesn't behave the same way.
x = D.Decimal('1.0')
print(type(x))
Results:
`<class 'decimal.Decimal'>`
-
Jan Christoph Terasa almost 8 years
The `pandas` documentation is hilariously unspecific about what a `dtype` is, but since I assume the implementation in `pandas` is based on `numpy`, we luckily have the `numpy` docs. Do keep in mind that using generic objects can be less efficient, performance- and memory-wise, than using basic `int` and `float`.
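The point about generic objects can be checked with a small sketch. It shows that NumPy maps `decimal.Decimal` to its generic object dtype and that a column of `Decimal` values takes more memory than the same values as floats. That pandas treats a `Decimal` dtype request the same way is an assumption here, not something the pandas documentation states, and the sample values are made up.

```python
import decimal as D

import numpy as np
import pandas as pd

# NumPy has no native decimal type; an arbitrary Python class maps to object.
decimal_dtype = np.dtype(D.Decimal)
print(decimal_dtype)

# Illustrative values: the same numbers as native floats and as Decimal objects.
floats = pd.Series([0.1, 0.2, 0.3])
decimals = floats.map(lambda x: D.Decimal(str(x)))

print(floats.dtype, decimals.dtype)

# deep=True also counts the Python objects referenced by the object column.
float_bytes = floats.memory_usage(deep=True)
decimal_bytes = decimals.memory_usage(deep=True)
print(float_bytes, decimal_bytes)
```

Exact byte counts depend on the pandas and Python versions, so only the relative difference is meaningful.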