Output different precision by column with pandas.DataFrame.to_csv()?
Solution 1
Change the type of column "vals" prior to exporting the data frame to a CSV file
df_data['vals'] = df_data['vals'].map(lambda x: '%2.1f' % x)
df_data.to_csv(outfile, index=False, header=False, float_format='%11.6f')
Solution 2
The more current version of hknust's first line would be:
df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1}'.format(x))
To print without scientific notation:
df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1f}'.format(x))
Solution 3
This question is a bit old, but I'd like to contribute with a better answer, I think so:
formats = {'lats': '{:10.5f}', 'lons': '{:.3E}', 'vals': '{:2.1f}'}
for col, f in formats.items():
df_data[col] = df_data[col].map(lambda x: f.format(x))
I tried with the solution here, but it didn't work for me, I decided to experiment with previus solutions given here combined with that from the link above.
Solution 4
You can use round method for dataframe before saving the dataframe to the file.
df_data = df_data.round(6)
df_data.to_csv('myfile.dat')
Solution 5
You can do this with to_string
. There is a formatters
argument where you can provide a dict of columns names to formatters. Then you can use some regexp to replace the default column separators with your delimiter of choice.
Related videos on Youtube
ryanjdillon
Updated on July 09, 2022Comments
-
ryanjdillon almost 2 years
Question
Is it possible to specify a float precision specifically for each column to be printed by the Python
pandas
package method pandas.DataFrame.to_csv?Background
If I have a
pandas
dataframe that is arranged like this:In [53]: df_data[:5] Out[53]: year month day lats lons vals 0 2012 6 16 81.862745 -29.834254 0.0 1 2012 6 16 81.862745 -29.502762 0.1 2 2012 6 16 81.862745 -29.171271 0.0 3 2012 6 16 81.862745 -28.839779 0.2 4 2012 6 16 81.862745 -28.508287 0.0
There is the
float_format
option that can be used to specify a precision, but this applys that precision to all columns of the dataframe when printed.When I use that like so:
df_data.to_csv(outfile, index=False, header=False, float_format='%11.6f')
I get the following, where
vals
is given an inaccurate precision:2012,6,16, 81.862745, -29.834254, 0.000000 2012,6,16, 81.862745, -29.502762, 0.100000 2012,6,16, 81.862745, -29.171270, 0.000000 2012,6,16, 81.862745, -28.839779, 0.200000 2012,6,16, 81.862745, -28.508287, 0.000000
-
ryanjdillon over 10 yearsThis also seems like a good method. I wonder what is more efficient. Thanks!
-
mattexx over 10 yearsI haven't tested to see which is more efficient, but I would have to guess this one since it does not modify the dataframe...
-
smci over 8 yearsNote you're not just changing the type of vals, you're also rounding it. If it's not acceptable to modify the column, then could save it to a temporary column 'vals.numeric' while doing the to_csv() write.
-
Géraud over 6 yearsI love this idea, and agree with your argumentation, but
formatters
argument is not available fordf.to_csv
function... -
nealmcb over 6 years@Géraud Thanks. That's too bad - maybe you can file an issue? But maybe they figure that rounding the data for purposes of a file export like csv makes less sense, and there are a bunch of ways to display csv files in tabular format with their own customizable ways of determining precision.
-
Nikhil VJ almost 6 yearshi @matlexx if would be great if you could elaborate on this. The
.to_string()
method merely converts the df to one concatenated string at my end. I can't see how one can send the output of this to.to_csv()
-
Nikhil VJ almost 6 yearsthis sets the numbers ok but it turns all the blanks in my column into 'nan' which makes its way to the csv also via to_csv and I'm not able to get rid of it.
-
Milind R over 5 years@NikhilVJ I think
.to_string()
should be able to do all that.to_csv()
does. -
Matthijs Kramer over 5 years@MilindR I just tried using formatters={"Cloud Cover": "%g"} in .to_csv() but got a
TypeError: to_csv() got an unexpected keyword argument 'formatters'
-
Milind R over 5 years@MatthijsKramer I did not imply that
.to_csv()
can do all that.to_string()
does. Anyway, I tried out.to_string()
and results in a single string with multi-space separators. I'm probably going the convert-to-string-type-columns-before-to_csv route for my work. -
Matthijs Kramer over 5 years@MilindR Reading your comment again I see I interpreted your statement the wrong way around... Either way, it would be nice if the to_csv method could use formatters.
-
daryl over 5 yearsTo avoid the nan issue, my approach is
lambda x: '%2.1f % x if not pd.isna(x) else ''
-
ryanjdillon almost 4 yearsThe link you posted is for generated CSS for display in a Jupyter notebook, so if that won't work if you'd like your resulting CSV formatted. The chosen answer her that you've adapted converts all columns to (formatted) string data-types. for the format string, the pefered Python3.6+ way would be now be
f"{x:2.1f}"
rather than"%2.1f" % x
. -
Nacho almost 4 yearsNow I fixed the problem. Thanks @ryanjdillon
-
esantix about 2 yearsThe problem with this is that values are changed to strings