Output different precision by column with pandas.DataFrame.to_csv()?

51,798

Solution 1

Change the type of column "vals" prior to exporting the data frame to a CSV file

df_data['vals'] = df_data['vals'].map(lambda x: '%2.1f' % x)

df_data.to_csv(outfile, index=False, header=False, float_format='%11.6f')

Solution 2

The more current version of hknust's first line would be:

df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1}'.format(x))

To print without scientific notation:

df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1f}'.format(x)) 

Solution 3

This question is a bit old, but I'd like to contribute with a better answer, I think so:

formats = {'lats': '{:10.5f}', 'lons': '{:.3E}', 'vals': '{:2.1f}'}

for col, f in formats.items():
    df_data[col] = df_data[col].map(lambda x: f.format(x))

I tried with the solution here, but it didn't work for me, I decided to experiment with previus solutions given here combined with that from the link above.

Solution 4

You can use round method for dataframe before saving the dataframe to the file.

df_data = df_data.round(6)
df_data.to_csv('myfile.dat')

Solution 5

You can do this with to_string. There is a formatters argument where you can provide a dict of columns names to formatters. Then you can use some regexp to replace the default column separators with your delimiter of choice.

Share:
51,798

Related videos on Youtube

ryanjdillon
Author by

ryanjdillon

Updated on July 09, 2022

Comments

  • ryanjdillon
    ryanjdillon almost 2 years

    Question

    Is it possible to specify a float precision specifically for each column to be printed by the Python pandas package method pandas.DataFrame.to_csv?

    Background

    If I have a pandas dataframe that is arranged like this:

    In [53]: df_data[:5]
    Out[53]: 
        year  month  day       lats       lons  vals
    0   2012      6   16  81.862745 -29.834254   0.0
    1   2012      6   16  81.862745 -29.502762   0.1
    2   2012      6   16  81.862745 -29.171271   0.0
    3   2012      6   16  81.862745 -28.839779   0.2
    4   2012      6   16  81.862745 -28.508287   0.0
    

    There is the float_format option that can be used to specify a precision, but this applys that precision to all columns of the dataframe when printed.

    When I use that like so:

    df_data.to_csv(outfile, index=False,
                       header=False, float_format='%11.6f')
    

    I get the following, where vals is given an inaccurate precision:

    2012,6,16,  81.862745, -29.834254,   0.000000
    2012,6,16,  81.862745, -29.502762,   0.100000
    2012,6,16,  81.862745, -29.171270,   0.000000
    2012,6,16,  81.862745, -28.839779,   0.200000
    2012,6,16,  81.862745, -28.508287,   0.000000
    
  • ryanjdillon
    ryanjdillon over 10 years
    This also seems like a good method. I wonder what is more efficient. Thanks!
  • mattexx
    mattexx over 10 years
    I haven't tested to see which is more efficient, but I would have to guess this one since it does not modify the dataframe...
  • smci
    smci over 8 years
    Note you're not just changing the type of vals, you're also rounding it. If it's not acceptable to modify the column, then could save it to a temporary column 'vals.numeric' while doing the to_csv() write.
  • Géraud
    Géraud over 6 years
    I love this idea, and agree with your argumentation, but formatters argument is not available for df.to_csv function...
  • nealmcb
    nealmcb over 6 years
    @Géraud Thanks. That's too bad - maybe you can file an issue? But maybe they figure that rounding the data for purposes of a file export like csv makes less sense, and there are a bunch of ways to display csv files in tabular format with their own customizable ways of determining precision.
  • Nikhil VJ
    Nikhil VJ almost 6 years
    hi @matlexx if would be great if you could elaborate on this. The .to_string() method merely converts the df to one concatenated string at my end. I can't see how one can send the output of this to .to_csv()
  • Nikhil VJ
    Nikhil VJ almost 6 years
    this sets the numbers ok but it turns all the blanks in my column into 'nan' which makes its way to the csv also via to_csv and I'm not able to get rid of it.
  • Milind R
    Milind R over 5 years
    @NikhilVJ I think .to_string() should be able to do all that .to_csv() does.
  • Matthijs Kramer
    Matthijs Kramer over 5 years
    @MilindR I just tried using formatters={"Cloud Cover": "%g"} in .to_csv() but got a TypeError: to_csv() got an unexpected keyword argument 'formatters'
  • Milind R
    Milind R over 5 years
    @MatthijsKramer I did not imply that .to_csv() can do all that .to_string() does. Anyway, I tried out .to_string() and results in a single string with multi-space separators. I'm probably going the convert-to-string-type-columns-before-to_csv route for my work.
  • Matthijs Kramer
    Matthijs Kramer over 5 years
    @MilindR Reading your comment again I see I interpreted your statement the wrong way around... Either way, it would be nice if the to_csv method could use formatters.
  • daryl
    daryl over 5 years
    To avoid the nan issue, my approach is lambda x: '%2.1f % x if not pd.isna(x) else ''
  • ryanjdillon
    ryanjdillon almost 4 years
    The link you posted is for generated CSS for display in a Jupyter notebook, so if that won't work if you'd like your resulting CSV formatted. The chosen answer her that you've adapted converts all columns to (formatted) string data-types. for the format string, the pefered Python3.6+ way would be now be f"{x:2.1f}" rather than "%2.1f" % x.
  • Nacho
    Nacho almost 4 years
    Now I fixed the problem. Thanks @ryanjdillon
  • esantix
    esantix about 2 years
    The problem with this is that values are changed to strings