How do I resample and interpolate timeseries data in R?

10,737

Solution 1

You can use approx or the related approxfun. If t is the vector of time points at which your data were sampled and y is the vector of measurements, then f <- approxfun(t, y) creates a function f that linearly interpolates between the data points. Outside the sampled range, f returns NA by default.

Example:

# irregular time points at which data was sampled
t <- c(5,10,15,25,30,40,50)
# measurements 
y <- c(4.3,1.2,5.4,7.6,3.2,1.2,3.7)

f <- approxfun(t,y)

# get interpolated values for time points 5, 20, 35, 50
f(seq(from=5,to=50,by=15))
# [1] 4.3 6.5 2.2 3.7
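
The same idea carries over to timestamped data like that in the question. What follows is only a sketch in base R, not a definitive implementation: the timestamps and values are made up (modelled on the question's first data column), the variable names are illustrative, and the one-hour threshold is taken from the question. Since approx works on numbers, the POSIXct times are converted with as.numeric first. Because approx interpolates across gaps of any length, the values that fall inside a long gap are blanked out afterwards.

```r
# Hypothetical samples: roughly every 5 minutes, then a gap of more
# than one hour before the last two readings.
times <- as.POSIXct(c("2012-07-09 05:30:01", "2012-07-09 05:35:02",
                      "2012-07-09 05:40:02", "2012-07-09 05:45:02",
                      "2012-07-09 07:00:02", "2012-07-09 07:05:01"),
                    tz = "UTC")
values <- c(1906.1, 1905.7, 1906.1, 1905.7, 1905.3, 1905.7)

# Regular 15-minute grid spanning the samples
grid <- seq(from = as.POSIXct("2012-07-09 05:30:00", tz = "UTC"),
            to   = as.POSIXct("2012-07-09 07:15:00", tz = "UTC"),
            by   = "15 min")

tn <- as.numeric(times)  # seconds since the epoch
gn <- as.numeric(grid)

# Linear interpolation; grid points outside the sampled range become NA
resampled <- approx(x = tn, y = values, xout = gn)$y

# Width (in seconds) of the sampling interval containing each grid point
idx <- findInterval(gn, tn)            # last sample at or before the grid point
inside <- idx >= 1 & idx < length(tn)  # grid point lies between two samples
width <- rep(NA_real_, length(gn))
width[inside] <- tn[idx[inside] + 1] - tn[idx[inside]]

# Blank out values interpolated across a gap longer than one hour
resampled[inside & width > 3600] <- NA

result <- data.frame(time = grid, value = resampled)
result
```

For several data columns, the approx call can be repeated per column (for example with sapply); the gap mask depends only on the timestamps and can be reused.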

Solution 2

If you only need downsampling, the xts package has it built in: the to.period family (to.monthly, to.yearly, and so on) aggregates a series to a coarser period. These functions do not upsample or interpolate.

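library(xts)
data(sample_matrix)   # daily OHLC demo data shipped with xts
samplexts <- as.xts(sample_matrix)
# aggregate the daily series to monthly and yearly bars
to.monthly(samplexts)
to.yearly(samplexts)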
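
For the 15-minute case in the question, the same family can be pointed at a sub-hourly period through to.period. A minimal sketch, assuming the xts package is installed; the timestamps and values are made up and the names x and bars are illustrative. Note that this aggregates each 15-minute window into open/high/low/close columns rather than interpolating onto exact quarter-hour timestamps.

```r
library(xts)

# Hypothetical readings taken roughly every 5 minutes
times <- as.POSIXct(c("2012-07-09 05:30:01", "2012-07-09 05:35:02",
                      "2012-07-09 05:40:02", "2012-07-09 05:45:02",
                      "2012-07-09 05:50:02"), tz = "UTC")
values <- c(1906.1, 1905.7, 1906.1, 1905.7, 1905.7)

# Build a single-column xts series indexed by the timestamps
x <- xts(values, order.by = times)

# Aggregate into 15-minute windows (one row per window)
bars <- to.period(x, period = "minutes", k = 15)
bars
```

If interpolated values at exact grid times are needed rather than per-window summaries, the approx-based approach from Solution 1 is the closer fit.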
Author by

lindelof

David's dayjob consists of working as a consultant for an IT company in Geneva. After dark he works on his main interests, which include home and building automation.

Updated on June 04, 2022

Comments

  • lindelof
    lindelof almost 2 years

    I have measurements that have been recorded approximately every 5 minutes:

    2012-07-09T05:30:01+02:00   1906.1  1069.2  1093.2  3   1071.0  1905.7  
    2012-07-09T05:35:02+02:00   1905.7  1069.2  1093.0  0   1071.5  1905.7  
    2012-07-09T05:40:02+02:00   1906.1  1068.7  1093.2  0   1069.4  1905.7  
    2012-07-09T05:45:02+02:00   1905.7  1068.4  1093.0  1   1069.6  1905.7  
    2012-07-09T05:50:02+02:00   1905.7  1068.2  1093.0  4   1073.3  1905.7  
    

    The first column is the data's timestamp. The remaining columns are the recorded data.

    I need to resample my data so that I have one row every 15 minutes, e.g. something like:

    2012-07-09T05:15:00 XX XX XX XX XX XX
    2012-07-09T05:30:00 XX XX XX XX XX XX
    ....
    

    (In addition, there may be gaps in the recorded data and I would like gaps of more than, say, one hour to be replaced with a row of NA values.)

    I can think of several ways to program this by hand, but is there built-in support for doing that kind of stuff in R? I've looked at the different libraries for dealing with timeseries data (zoo, chron etc) but couldn't find anything satisfactory.