Repeat rows of a data.frame
Solution 1
df <- data.frame(a = 1:2, b = letters[1:2])
df[rep(seq_len(nrow(df)), each = 2), ]
Solution 2
A clean dplyr solution (taken from a linked answer):
library(dplyr)
df <- tibble(x = 1:2, y = c("a", "b"))
df %>% slice(rep(1:n(), each = 2))
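If each row needs its own repeat count, the same slice() pattern should extend by passing a vector to times. The sketch below assumes tidyr is installed alongside dplyr; the counts c(3, 1) and the column name n_rep are made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- tibble(x = 1:2, y = c("a", "b"))

# Hypothetical per-row counts: row 1 three times, row 2 once
by_slice <- df %>% slice(rep(seq_len(n()), times = c(3, 1)))

# Alternative: keep the counts in a column (n_rep is an illustrative name)
# and let tidyr::uncount() expand the rows; it drops n_rep by default
by_uncount <- df %>%
  mutate(n_rep = c(3, 1)) %>%
  uncount(n_rep)
```

Both by_slice and by_uncount should hold four rows with the original column types intact.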
Solution 3
There is a lovely vectorized solution that repeats each row its own number of times, for example by adding an ntimes column to your data frame:
  A B   C ntimes
1 j i 100      2
2 K P 101      4
3 Z Z 102      1
Method:
df <- data.frame(A=c("j","K","Z"), B=c("i","P","Z"), C=c(100,101,102), ntimes=c(2,4,1))
df <- as.data.frame(lapply(df, rep, df$ntimes))
Result:
  A B   C ntimes
1 j i 100      2
2 j i 100      2
3 K P 101      4
4 K P 101      4
5 K P 101      4
6 K P 101      4
7 Z Z 102      1
This is very similar to Josh O'Brien and Mark Miller's method:
df[rep(seq_len(nrow(df)), df$ntimes),]
However, that method appears quite a bit slower:
df <- data.frame(A=c("j","K","Z"), B=c("i","P","Z"), C=c(100,101,102), ntimes=c(2000,3000,4000))
microbenchmark::microbenchmark(
df[rep(seq_len(nrow(df)), df$ntimes),],
as.data.frame(lapply(df, rep, df$ntimes)),
times = 10
)
Result:
Unit: microseconds
                                      expr      min       lq      mean   median       uq      max neval
   df[rep(seq_len(nrow(df)), df$ntimes), ] 3563.113 3586.873 3683.7790 3613.702 3657.063 4326.757    10
 as.data.frame(lapply(df, rep, df$ntimes))  625.552  654.638  676.4067  668.094  681.929  799.893    10
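Besides speed, the two approaches differ in one detail worth knowing: row indexing keeps (and de-duplicates) the original row names, while the lapply() route renumbers them. A small check, rebuilding the example data and using the illustrative names by_index and by_lapply:

```r
df <- data.frame(A = c("j", "K", "Z"), B = c("i", "P", "Z"),
                 C = c(100, 101, 102), ntimes = c(2, 4, 1))

by_index  <- df[rep(seq_len(nrow(df)), df$ntimes), ]
by_lapply <- as.data.frame(lapply(df, rep, df$ntimes))

rownames(by_index)   # "1" "1.1" "2" "2.1" "2.2" "2.3" "3"
rownames(by_lapply)  # "1" "2" "3" "4" "5" "6" "7"

# After resetting the row names, the two results should compare equal
rownames(by_index) <- NULL
all.equal(by_index, by_lapply)
```

If downstream code relies on row names, resetting them as above keeps the two methods interchangeable.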
Solution 4
If you can repeat the whole thing, or subset it first then repeat that, then this similar question may be helpful. Once again:
library(mefa)
rep(mtcars,10)
or simply
mefa:::rep.data.frame(mtcars, 10)
Solution 5
Adding to what @dardisco mentioned about mefa::rep.data.frame(): it's very flexible.
You can either repeat each row N times:
rep(df, each=N)
or repeat the entire data frame N times (think: like when you recycle a vectorized argument):
rep(df, times=N)
Two thumbs up for mefa! I had never heard of it until now and I had to write manual code to do this.
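For readers who would rather not add a dependency, the same each/times distinction can be sketched in base R by repeating row indices. The helper name rep_rows below is made up for this illustration; it is not part of mefa or base R.

```r
# Hypothetical helper: repeat the rows of a data frame via rep() on row indices.
# Extra arguments (times =, each =) are passed straight through to rep().
rep_rows <- function(x, ...) {
  out <- x[rep(seq_len(nrow(x)), ...), , drop = FALSE]
  rownames(out) <- NULL  # drop the "1.1"-style names created by duplication
  out
}

df <- data.frame(a = 1:2, b = letters[1:2])

rep_rows(df, each = 2)   # rows in order 1, 1, 2, 2
rep_rows(df, times = 2)  # rows in order 1, 2, 1, 2
```

Indexing by row (rather than rebuilding columns) keeps the sketch short; the lapply() variant shown earlier may be faster on large data.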
Stefan
Updated on July 17, 2022
Comments
-
Stefan almost 2 years
I want to repeat the rows of a data.frame, each N times. The result should be a new data.frame (with nrow(new.df) == nrow(old.df) * N) keeping the data types of the columns.
Example for N = 2:
   A B   C                A B   C
1  j i 100             1  j i 100
                 -->   2  j i 100
2  K P 101             3  K P 101
                       4  K P 101
So, each row is repeated 2 times and characters remain characters, factors remain factors, numerics remain numerics, ...
My first attempt used apply:
apply(old.df, 2, function(co) rep(co, each = N))
but this one transforms my values to characters and I get:
     A   B   C
[1,] "j" "i" "100"
[2,] "j" "i" "100"
[3,] "K" "P" "101"
[4,] "K" "P" "101"
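The coercion arises because apply() first converts the data frame to a matrix, and a matrix can hold only one type. A short sketch of the difference, with old.df rebuilt from the example above and N set to 2 as an assumption:

```r
old.df <- data.frame(A = c("j", "K"), B = c("i", "P"), C = c(100, 101))
N <- 2

# apply() works on as.matrix(old.df), which is a character matrix here
m <- as.matrix(old.df)
typeof(m)

# lapply() visits each column separately, so column types survive
new.df <- as.data.frame(lapply(old.df, rep, each = N))
sapply(new.df, class)
```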
-
Mark Miller about 10 years
You can use
n.times <- c(2,4) ; df[rep(seq_len(nrow(df)), n.times),]
if you want to vary the number of times each line is repeated.
-
smci about 10 years
Aha! Another brilliant R function hidden deep inside an obscure specialist package with a totally unrelated name. I love this language!
-
Dan Villarreal about 4 years
This is the preferable solution imo because it works cleanly in a pipe.
-
TCS almost 3 years
I think that this is the most versatile solution, as it allows you to assign a different number of replications per line! I am curious, is there a way to do this in the tidyverse?