Repeat vector to fill down column in data frame

15,379

Solution 1

If the vector can be evenly recycled into the data.frame, you do not get an error or a warning:

df <- data.frame(x = 1:10)
df$z <- 1:5

This may be what you were experiencing before.

You can get your vector to fit as you mention with rep_len:

df$y <- rep_len(1:3, length.out=10)

This results in

df
    x z y
1   1 1 1
2   2 2 2
3   3 3 3
4   4 4 1
5   5 5 2
6   6 1 3
7   7 2 1
8   8 3 2
9   9 4 3
10 10 5 1
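The hard-coded length of 10 above could be replaced with the row count of the data frame, so the fill still fits if the data changes. This is only a minimal sketch: `pattern` is a hypothetical name chosen for illustration, and the data frame is the same toy example as above.

```r
# Toy data frame, as in the example above.
df <- data.frame(x = 1:10)

# `pattern` is an illustrative name for the vector to repeat.
pattern <- 1:3

# Use nrow(df) instead of a literal so the fill always matches the row count.
df$y <- rep_len(pattern, nrow(df))

print(df$y)
```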

Note that in place of rep_len, you could use the more common rep function:

df$y <- rep(1:3, length.out = 10)

From the help file for rep:

rep.int and rep_len are faster simplified versions for two common cases. They are not generic.
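For a plain vector like the one used here, the two calls should give the same result. A quick check, assuming a simple integer vector with no names or other attributes:

```r
# Build the same fill with both functions.
a <- rep(1:3, length.out = 10)
b <- rep_len(1:3, 10)

# Compare the two results element by element and by type.
print(identical(a, b))
```

Which one to use is mostly a matter of taste for small vectors like this.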

Solution 2

If the number of rows is a multiple of the length of your new vector, recycling works fine. When it is not, recycling does not work everywhere: data.frame raises an error, whereas cbind recycles with a warning. You have probably seen this kind of recycling with matrices:

data.frame(1:6, 1:3, 1:4) # not a multiple
# Error in data.frame(1:6, 1:3, 1:4) : 
#   arguments imply differing number of rows: 6, 3, 4
data.frame(1:6, 1:3) # a multiple
#   X1.6 X1.3
# 1    1    1
# 2    2    2
# 3    3    3
# 4    4    1
# 5    5    2
# 6    6    3
cbind(1:6, 1:3, 1:4) # works even with not a multiple
#      [,1] [,2] [,3]
# [1,]    1    1    1
# [2,]    2    2    2
# [3,]    3    3    3
# [4,]    4    1    4
# [5,]    5    2    1
# [6,]    6    3    2
# Warning message:
# In cbind(1:6, 1:3, 1:4) :
#   number of rows of result is not a multiple of vector length (arg 3)
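Applied to the 5-row data frame from the question, the same distinction can be sketched as follows. The `tryCatch` wrapper is only there to show the error message without stopping the script, and `msg` is an illustrative name.

```r
# Reproduce the 5-row data frame from the question.
df <- data.frame(x = 1:5)

# Direct assignment fails because 5 is not a multiple of 4;
# capture the error message instead of halting.
msg <- tryCatch(
  {
    df$z <- 1:4
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Recycling explicitly to nrow(df) yields the column the question expected.
df$z <- rep_len(1:4, nrow(df))
print(df)
```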
Author by

Morten Nielsen

Updated on June 26, 2022

Comments

  • Morten Nielsen, almost 2 years ago

    Seems like this very simple maneuver used to work for me, and now it simply doesn't. A dummy version of the problem:

    df <- data.frame(x = 1:5) # create simple dataframe
    df
      x
    1 1
    2 2
    3 3
    4 4
    5 5
    
    df$y <- c(1:5) # adding a new column with a vector of the exact same length. Works out like it should
    df
      x y
    1 1 1
    2 2 2
    3 3 3
    4 4 4
    5 5 5
    
    df$z <- c(1:4) # trying to add a new column, this time with a vector with fewer elements than there are rows in the dataframe.
    
    Error in `$<-.data.frame`(`*tmp*`, "z", value = 1:4) : 
      replacement has 4 rows, data has 5
    

    I was expecting this to work with the following result:

      x y z
    1 1 1 1
    2 2 2 2
    3 3 3 3
    4 4 4 4
    5 5 5 1
    

    I.e. the shorter vector should just start repeating itself automatically. I'm pretty certain this used to work for me (it's in a script that I've been running a hundred times before without problems). Now I can't even get the above dummy example to work like I want to. What am I missing?