Create empty csv file with pandas
Solution 1
I would do it this way: first read up all your CSV files (but only the columns that you really need) into one DF, then make groupby(['Year','Month','Day']).mean()
and save resulting DF into CSV file:
import glob
import pandas as pd
fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Year','Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Year','Month','Day']).mean().to_csv('my_csv.csv')
and if want to ignore the year:
import glob
import pandas as pd
fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Month','Day']).mean().to_csv('my_csv.csv')
Some details:
(pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob('*.csv'))
will generate tuple of data frames from all your CSV files
pd.concat(...)
will concatenate them into resulting single DF
df.groupby(['Year','Month','Day']).mean()
will produce wanted report as a data frame, which might be saved into new CSV file:
.to_csv('my_csv.csv')
Solution 2
Just open the file in write mode to create it.
with open('my_csv.csv', 'w'):
pass
Anyway I do not think you should be opening and closing the file so many times. You'd better open the file once, write several times.
with open('my_csv.csv', 'w') as f:
for EachMonth in MonthsInAnalysis:
TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
df.to_csv(f, header=False)
Solution 3
Creating a blank csv file is as simple as this one
import pandas as pd
pd.DataFrame({}).to_csv("filename.csv")
PaulBarr
Updated on June 15, 2022Comments
-
PaulBarr almost 2 years
I am interacting through a number of csv files and want to append the mean temperatures to a blank csv file. How do you create an empty csv file with pandas?
for EachMonth in MonthsInAnalysis: TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth) MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature') with open('my_csv.csv', 'a') as f: df.to_csv(f, header=False)
So in the above code how do I create the
my_csv.csv
prior to thefor
loop?Just a note I know you can create a data frame then save the data frame to csv but I am interested in whether you can skip this step.
In terms of context I have the following csv files:
Each of which have the following structure:
The Day column reads up to 30 days for each file.
I would like to output a csv file that looks like this:
But obviously includes all the days for all the months.
My issue is that I don't know which months are included in each analysis hence I wanted to use a for loop that used a list that has that information in it to access the relevant csvs, calculate the mean temperature then save it all into one csv.
Input as text:
Unnamed: 0 AirTemperature AirHumidity SoilTemperature SoilMoisture LightIntensity WindSpeed Year Month Day Hour Minute Second TimeStamp MonthCategorical TimeOfDay 6 6 18 84 17 41 40 4 2016 1 1 6 1 1 10106 January Day 7 7 20 88 22 92 31 0 2016 1 1 7 1 1 10107 January Day 8 8 23 1 22 59 3 0 2016 1 1 8 1 1 10108 January Day 9 9 23 3 22 72 41 4 2016 1 1 9 1 1 10109 January Day 10 10 24 63 23 83 85 0 2016 1 1 10 1 1 10110 January Day 11 11 29 73 27 50 1 4 2016 1 1 11 1 1 10111 January Day