Output different precision by column with pandas.DataFrame.to_csv()?

python csv numpy floating-point pandas

51,798

Solution 1

Change the type of column "vals" prior to exporting the data frame to a CSV file

df_data['vals'] = df_data['vals'].map(lambda x: '%2.1f' % x)

df_data.to_csv(outfile, index=False, header=False, float_format='%11.6f')

Solution 2

The more current version of hknust's first line would be:

df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1}'.format(x))

To print without scientific notation:

df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1f}'.format(x))

Solution 3

This question is a bit old, but I'd like to contribute with a better answer, I think so:

formats = {'lats': '{:10.5f}', 'lons': '{:.3E}', 'vals': '{:2.1f}'}

for col, f in formats.items():
    df_data[col] = df_data[col].map(lambda x: f.format(x))

I tried with the solution here, but it didn't work for me, I decided to experiment with previus solutions given here combined with that from the link above.

Solution 4

You can use round method for dataframe before saving the dataframe to the file.

df_data = df_data.round(6)
df_data.to_csv('myfile.dat')

Solution 5

You can do this with to_string. There is a formatters argument where you can provide a dict of columns names to formatters. Then you can use some regexp to replace the default column separators with your delimiter of choice.

View more solutions

51,798

ryanjdillon

Updated on July 09, 2022

Comments

ryanjdillon almost 2 years

Question

Is it possible to specify a float precision specifically for each column to be printed by the Python pandas package method pandas.DataFrame.to_csv?

Background

If I have a pandas dataframe that is arranged like this:

In [53]: df_data[:5]
Out[53]: 
    year  month  day       lats       lons  vals
0   2012      6   16  81.862745 -29.834254   0.0
1   2012      6   16  81.862745 -29.502762   0.1
2   2012      6   16  81.862745 -29.171271   0.0
3   2012      6   16  81.862745 -28.839779   0.2
4   2012      6   16  81.862745 -28.508287   0.0

There is the float_format option that can be used to specify a precision, but this applys that precision to all columns of the dataframe when printed.

When I use that like so:

df_data.to_csv(outfile, index=False,
                   header=False, float_format='%11.6f')

I get the following, where vals is given an inaccurate precision:

2012,6,16,  81.862745, -29.834254,   0.000000
2012,6,16,  81.862745, -29.502762,   0.100000
2012,6,16,  81.862745, -29.171270,   0.000000
2012,6,16,  81.862745, -28.839779,   0.200000
2012,6,16,  81.862745, -28.508287,   0.000000

ryanjdillon over 10 years

This also seems like a good method. I wonder what is more efficient. Thanks!
mattexx over 10 years

I haven't tested to see which is more efficient, but I would have to guess this one since it does not modify the dataframe...
smci over 8 years

Note you're not just changing the type of vals, you're also rounding it. If it's not acceptable to modify the column, then could save it to a temporary column 'vals.numeric' while doing the to_csv() write.
Géraud over 6 years

I love this idea, and agree with your argumentation, but formatters argument is not available for df.to_csv function...
nealmcb over 6 years

@Géraud Thanks. That's too bad - maybe you can file an issue? But maybe they figure that rounding the data for purposes of a file export like csv makes less sense, and there are a bunch of ways to display csv files in tabular format with their own customizable ways of determining precision.
Nikhil VJ almost 6 years

hi @matlexx if would be great if you could elaborate on this. The .to_string() method merely converts the df to one concatenated string at my end. I can't see how one can send the output of this to .to_csv()
Nikhil VJ almost 6 years

this sets the numbers ok but it turns all the blanks in my column into 'nan' which makes its way to the csv also via to_csv and I'm not able to get rid of it.
Milind R over 5 years

@NikhilVJ I think .to_string() should be able to do all that .to_csv() does.
Matthijs Kramer over 5 years

@MilindR I just tried using formatters={"Cloud Cover": "%g"} in .to_csv() but got a TypeError: to_csv() got an unexpected keyword argument 'formatters'
Milind R over 5 years

@MatthijsKramer I did not imply that .to_csv() can do all that .to_string() does. Anyway, I tried out .to_string() and results in a single string with multi-space separators. I'm probably going the convert-to-string-type-columns-before-to_csv route for my work.
Matthijs Kramer over 5 years

@MilindR Reading your comment again I see I interpreted your statement the wrong way around... Either way, it would be nice if the to_csv method could use formatters.
daryl over 5 years

To avoid the nan issue, my approach is lambda x: '%2.1f % x if not pd.isna(x) else ''
ryanjdillon almost 4 years

The link you posted is for generated CSS for display in a Jupyter notebook, so if that won't work if you'd like your resulting CSV formatted. The chosen answer her that you've adapted converts all columns to (formatted) string data-types. for the format string, the pefered Python3.6+ way would be now be f"{x:2.1f}" rather than "%2.1f" % x.
Nacho almost 4 years

Now I fixed the problem. Thanks @ryanjdillon
esantix about 2 years

The problem with this is that values are changed to strings