pandas to_csv arguments float_format and decimal not working for index column
I would rewrite your two bottom lines:
with open('foo.csv', 'w') as f:
data.to_csv(f, index=True, header=True, decimal=',', sep=' ', float_format='%.3f')
Into
data.reset_index().to_csv('foo.csv', index=False, header=True, decimal=',', sep=' ', float_format='%.3f')
This is a bit of a workaround, but as you have noticed, the keyword arguments decimal=
and float_format=
only work on data columns, not on the index.
What I do instead is to put the index into the dataframe with reset_index
and then I tell to_csv(index=False
not to save the index to the file (since it is now in the data).
Also, opening a file stream yourself (with open('foo.csv', 'w') as f:
) is better left to pandas, which does this by itself when you just give it a string 'foo.csv'
as first argument.
Related videos on Youtube
albert
Chemical Engineer / Process Engineer - Research Assistant - Doctoral Candidate - Digital Transformation Enthusiast
Updated on January 16, 2020Comments
-
albert over 4 years
Background
I am doing some simulations resp. a system analysis by variing parameters (in this case
rpm
only) and append every last line of a results dataframeresults_df
to a summarizing dataframedf
containing giving the baviour of my system in depencence of the variedrpm
.In order to get an appropriate index for plotting and data analysis I converted the varied values (here
rpm
) from the list into a pandas seriesser
and concat this series with the summarizing dataframedf
containing the results I am interested in.Since the results of each calculation I am interested in is only last line of each calculation I am extracting this data from the results dataframe
results_df
by using.tail(1)
.What I have done so far is shown in the following snippet:
rpm = [0.25, 0.3, 0.5, 0.75, 1.0, 1.5, 2.0] ser = pd.Series(rpm, name='rpm') df = pd.DataFrame() df_list = list() for i, val in enumerate(rpm): results_df = get_some_data_from_somwhere() df_list.append(results_df.tail(1)) df = df.append(df_list, ignore_index=True) df = pd.concat([df, ser], axis=1) df.set_index('rpm', inplace=True) with open('foo.csv', 'w') as f: data.to_csv(f, index=True, header=True, decimal=',', sep=' ', float_format='%.3f')
Problem
This csv-file what I get has the follwing format:
rpm cooling_inner heating_inner cooling_outlet heating_outlet 0.25 303,317 323,372 302,384 324,332
However, I expected having three decimal digits and a comma as decimal sign on my index column, like shown here:
rpm cooling_inner heating_inner cooling_outlet heating_outlet 0,250 303,317 323,372 302,384 324,332
So it seems that the
index
anddecimal
sign options are not applied to the index column when exporting dataframes to csv-files using the.to_csv
command.How could I achieve this behaviour since the
index
option is setTrue
and all values (with exception to the index column) have the right format and decimal sign?Do I have to handle the index column somehow seperate?
-
firelynx almost 9 yearsA workaround is to do
data.reset_index().to_csv(index=False
... -
firelynx almost 9 yearsAlso, you can do
data.to_csv('foo.csv'
... you don't need to give it a stream. -
albert almost 9 years@firelynx: Is the file closed automatically after exporting data? This is somehow relevant since I exceeded the maximum number of opened files a few months ago...
-
firelynx almost 9 yearsYes, it closes the file after itself. I think it is generally safer to let pandas deal with the file handling, since then the logic is kept in one place, not in all places you do
.to_csv
-
firelynx almost 9 yearsWrote my two points as a proper answer instead with a bit more elaboration.
-