Sliding window in R


Try this:

# form input data
library(zoo)
B <- c(0, 0, 0, 1, 0, 1, 1, 1, 0)

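# calculate: a centered window of width 2*k-1 around each position spans
# every width-k window that contains it; keep the largest of those means
k <- 3
rollapply(B, 2*k-1, function(x) max(rollmean(x, k)), partial = TRUE)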

The last line returns:

[1] 0.0000000 0.3333333 0.3333333 0.6666667 0.6666667 1.0000000 1.0000000
[8] 1.0000000 0.6666667
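
As a cross-check of why a width of 2*k-1 works, the same numbers can be reproduced in base R without zoo. This is only a sketch of the idea, not how zoo computes it internally; the names win_means and out are illustrative and do not appear in the original answer.

```r
# Base-R cross-check (illustrative sketch, not the zoo implementation).
B <- c(0, 0, 0, 1, 0, 1, 1, 1, 0)
k <- 3
n <- length(B)

# Mean of every full window of width k; window j covers B[j:(j + k - 1)].
win_means <- sapply(seq_len(n - k + 1), function(j) mean(B[j:(j + k - 1)]))

# Position i is contained in windows max(1, i - k + 1) .. min(i, n - k + 1);
# keep the largest mean among them.
out <- sapply(seq_len(n), function(i) {
  max(win_means[max(1, i - k + 1):min(i, n - k + 1)])
})
out
```

Each position ends up with the maximum over at most k window means, which is what the nested rollapply/rollmean call computes with partial = TRUE.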

If there are NA values you might want to try this:

k <- 3
B <- c(1, 0, 1, 0, NA, 1)
rollapply(B, 2*k-1, function(x) max(rollapply(x, k, mean, na.rm = TRUE)), partial = TRUE)

where the last line gives this:

[1] 0.6666667 0.6666667 0.6666667 0.5000000 0.5000000 0.5000000

Written out in full, these values are computed as:

c(mean(B[1:3], na.rm = TRUE), ##
max(mean(B[1:3], na.rm = TRUE), mean(B[2:4], na.rm = TRUE)), ##
max(mean(B[1:3], na.rm = TRUE), mean(B[2:4], na.rm = TRUE), mean(B[3:5], na.rm = TRUE)),
max(mean(B[2:4], na.rm = TRUE), mean(B[3:5], na.rm = TRUE), mean(B[4:6], na.rm = TRUE)),
max(mean(B[3:5], na.rm = TRUE), mean(B[4:6], na.rm = TRUE)), ##
mean(B[4:6], na.rm = TRUE)) ##

If you don't want the k-1 partial components at each end (marked with ## above), drop partial = TRUE.
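
A rough illustration of that last point, using the B vector from the first example. The helper name f and the use of fill = NA are additions for this sketch and are not part of the original answer; fill = NA is just one way to keep the result the same length as the input.

```r
library(zoo)

B <- c(0, 0, 0, 1, 0, 1, 1, 1, 0)
k <- 3
f <- function(x) max(rollmean(x, k))  # hypothetical helper name

# Without partial = TRUE only full windows of width 2*k-1 are evaluated,
# so the result is 2*(k-1) elements shorter than B.
trimmed <- rollapply(B, 2 * k - 1, f)
length(trimmed)

# Assumption: pad the dropped ends with NA so the result lines up with B.
padded <- rollapply(B, 2 * k - 1, f, fill = NA)
padded
```

The padded form is convenient when the result has to be stored as a column next to the original data.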

Author: chas

Updated on June 25, 2022

Comments

  • chas
    chas about 2 years

    I have a dataframe DF, with two columns A and B shown below:

    A                    B                  
    1                    0             
    3                    0               
    4                    0                   
    2                    1                    
    6                    0                    
    4                    1                     
    7                    1                 
    8                    1                     
    1                    0   
    

    A sliding window approach is applied as shown below. The mean of column B is calculated in a sliding window of size 3, sliding by 1, using: rollapply(DF$B, width = 3, FUN = mean, by = 1). The mean value for each window is shown on the right side.

        A:         1    3    4    2    6    4    7    8    1
        B:         0    0    0    1    0    1    1    1    0
                  [0    0    0]                                     0
                       [0    0    1]                                0.33
                            [0    1    0]                           0.33
                                 [1    0    1]                      0.66
                                      [0    1    1]                 0.66
                                           [1    1    1]            1
                                                [1    1    0]       0.66
    output:        0    0.33 0.33 0.66 0.66 1    1    1    0.66
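
The seven window means in the diagram can be reproduced with a short base-R sketch. Using embed() here is an illustrative choice and not something the question itself uses; the variable name window_means is hypothetical.

```r
# Column B from the question.
B <- c(0, 0, 0, 1, 0, 1, 1, 1, 0)

# embed(B, 3) returns one row per window of 3 consecutive values
# (each row holds the window in reverse order, which does not affect the mean).
window_means <- rowMeans(embed(B, 3))
window_means
```

There are length(B) - 3 + 1 = 7 windows, one per bracketed row in the diagram.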
    

    Now, for each row (coordinate) in column A, all windows containing that coordinate are considered, and the highest mean value among them is retained. This gives the values shown in the 'output' row above.

    I need to obtain the output shown above. The result should look like:

    A                   B                  Output
    1                   0                  0
    3                   0                  0.33
    4                   0                  0.33
    2                   1                  0.66
    6                   0                  0.66
    4                   1                  1
    7                   1                  1
    8                   1                  1
    1                   0                  0.66
    

    Any help in R?
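
Putting the answer's one-liner together with the data frame from the question, a minimal sketch might look like the following. The column name Output is taken from the desired result in the question; rounding to two decimals is only for display and is an addition of this sketch. Note that rounding gives 0.67 where the question shows the truncated value 0.66.

```r
library(zoo)

# Data frame from the question.
DF <- data.frame(A = c(1, 3, 4, 2, 6, 4, 7, 8, 1),
                 B = c(0, 0, 0, 1, 0, 1, 1, 1, 0))

k <- 3  # window size used in the question

# For each row, the largest mean over all width-k windows that contain it.
DF$Output <- rollapply(DF$B, 2 * k - 1,
                       function(x) max(rollmean(x, k)),
                       partial = TRUE)

# Display only: round the new column to two decimals.
transform(DF, Output = round(Output, 2))
```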