Aggregate time series in Python

16,191

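Use the DataFrame.resample method (http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.resample.html). It lets you choose how each time bucket is aggregated; in your case, sum. In current pandas the aggregation is a method call on the resampler (the older how="sum" keyword has been removed), and resample needs either a DatetimeIndex or the timestamp column named via on=:

data_frame.resample("1min", on="ts").sum()

For hourly buckets, use "1h" instead of "1min".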

http://pandas.pydata.org/pandas-docs/dev/timeseries.html#up-and-downsampling
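As a rough illustration of the resample approach, here is a minimal sketch that rebuilds the sample rows from the question and sums them per hour and per minute. The column names `timestamp` and `value` follow the sample data; the variable names (`raw`, `df`, `hourly`, `per_minute`) are only placeholders, and the frame in the question uses `ts` and `val` instead.

```python
import io

import pandas as pd

# Sample rows adapted from the question (spaces after commas removed).
raw = """timestamp,value
2012-04-30T22:25:31+00:00,1
2012-04-30T22:25:43+00:00,1
2012-04-30T22:29:04+00:00,2
2012-04-30T22:35:09+00:00,4
2012-04-30T22:39:28+00:00,1
2012-04-30T22:47:54+00:00,8
2012-04-30T22:50:49+00:00,9
2012-04-30T22:51:57+00:00,1
2012-04-30T22:54:50+00:00,1
2012-04-30T22:57:22+00:00,0
2012-04-30T22:58:38+00:00,7
2012-04-30T23:05:21+00:00,1
2012-04-30T23:08:56+00:00,1
"""

df = pd.read_csv(io.StringIO(raw))
# Parse the ISO 8601 strings into timezone-aware timestamps.
df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)

# on= tells resample which column holds the timestamps,
# so the frame does not need a DatetimeIndex.
hourly = df.resample("1h", on="timestamp")["value"].sum()
per_minute = df.resample("1min", on="timestamp")["value"].sum()

print(hourly)
```

Minutes with no rows still appear in `per_minute`, with a sum of 0, because resample emits every bucket between the first and last timestamp.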

Author by

Admin

Updated on July 20, 2022

Comments

  • Admin
    Admin almost 2 years

    How do we aggregate a time series at hourly or per-minute granularity? If I have a time series like the following, I want the values to be aggregated by hour. Does pandas support this, or is there a nifty way to do it in Python?

    timestamp, value
    2012-04-30T22:25:31+00:00, 1
    2012-04-30T22:25:43+00:00, 1
    2012-04-30T22:29:04+00:00, 2
    2012-04-30T22:35:09+00:00, 4
    2012-04-30T22:39:28+00:00, 1
    2012-04-30T22:47:54+00:00, 8
    2012-04-30T22:50:49+00:00, 9
    2012-04-30T22:51:57+00:00, 1
    2012-04-30T22:54:50+00:00, 1
    2012-04-30T22:57:22+00:00, 0
    2012-04-30T22:58:38+00:00, 7
    2012-04-30T23:05:21+00:00, 1
    2012-04-30T23:08:56+00:00, 1
    

    I also tried to make sure I have the correct data types in my data frame by calling:

      print(data_frame.dtypes)
    

    and I get the following output:

    ts     datetime64[ns]
    val             int64
    

    When I call groupby on the data frame:

    grouped = data_frame.groupby(lambda x: x.minute)
    

    I get the following error:

    grouped = data_frame.groupby(lambda x: x.minute)
    AttributeError: 'int' object has no attribute 'minute'
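
    A likely cause of this error: when groupby is given a callable, pandas calls it on each index label, and this frame has the default integer index (the timestamps live in the ts column), so x is an int. A minimal sketch of grouping on the column instead, assuming the ts and val column names from the dtypes output above and a few made-up rows in the same shape:

```python
import pandas as pd

# Small illustrative frame with the same columns as in the question.
data_frame = pd.DataFrame({
    "ts": pd.to_datetime([
        "2012-04-30 22:25:31",
        "2012-04-30 22:29:04",
        "2012-04-30 22:47:54",
        "2012-04-30 23:05:21",
    ]),
    "val": [1, 2, 8, 1],
})

# Option 1: group by the timestamp column rounded down to the hour.
by_hour = data_frame.groupby(data_frame["ts"].dt.floor("h"))["val"].sum()

# Option 2: let pd.Grouper bucket the ts column at a fixed frequency.
by_grouper = data_frame.groupby(pd.Grouper(key="ts", freq="1h"))["val"].sum()

print(by_hour)
```

    Both options bucket on the values in ts rather than on the index, which is what the lambda was unintentionally operating on.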