append rows to dataframe using foreach package
Solution 1
I think you need to read the docs for foreach. Your code block should compute a single part of the result, and you should then use the .combine option to say how to join the parts together. Look at the examples in help(foreach) for more guidance. It's not a straight replacement for a for loop.
For example:
> resultdf = foreach(i=1:10, .combine=rbind) %dopar% {data.frame(x=runif(4), i=i)}
> resultdf
            x i
1  0.23794248 1
2  0.15536320 1
3  0.58609635 1
4  0.98780497 1
5  0.97806482 2
6  0.92440741 2
7  0.13416121 2
8  0.81598340 2
9  0.13834423 3
[etc]
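To make the role of .combine concrete, here is a minimal sketch comparing the default return value with the combined one. It uses %do% so that it runs without a registered parallel backend; the .combine argument is handled the same way for %dopar%. The variable names parts and combined are made up for illustration.

```r
library(foreach)

# Without .combine, foreach returns a list with one element per iteration.
parts <- foreach(i = 1:3) %do% {
  data.frame(x = runif(2), i = i)
}
class(parts)    # "list"
length(parts)   # 3

# With .combine = rbind, the per-iteration data frames are stacked into one.
combined <- foreach(i = 1:3, .combine = rbind) %do% {
  data.frame(x = runif(2), i = i)
}
nrow(combined)  # 6
```

The value of the last expression in the loop body is what each iteration contributes, so the body should end with the piece to be combined rather than with an assignment to an outer variable.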
Solution 2
You need to modify your foreach loop as follows:
newdf = foreach(ind=1:1000, .combine=rbind) %dopar% {
  testdf$X = sample(testdf$X, nrow(testdf), replace=FALSE)
  fit = lm(X ~ Y, testdf)
  data.frame(pc=ind, err=sum(residuals(fit)^2))
}
Hope it helps!
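To get the combined data frame back from a function instead of having it printed, the foreach call can itself be the function's return value, which the caller then assigns. A minimal sketch, assuming the doParallel backend with a two-worker cluster (one of several possible backends, not named in the original), and using a made-up mydf and a hypothetical n argument for illustration:

```r
library(foreach)
library(doParallel)  # assumption: any registered foreach backend would do

randomizex <- function(testdf, n = 1000) {
  # The value of the foreach expression is the value of the function.
  foreach(ind = seq_len(n), .combine = rbind) %dopar% {
    testdf$X <- sample(testdf$X, nrow(testdf), replace = FALSE)
    fit <- lm(X ~ Y, testdf)
    # Last expression: the one-row piece contributed by this iteration.
    data.frame(pc = ind, err = sum(residuals(fit)^2))
  }
}

# Register a backend; without one, %dopar% warns and runs sequentially.
cl <- makeCluster(2)
registerDoParallel(cl)

mydf <- data.frame(X = rnorm(50), Y = rnorm(50))  # hypothetical data
resdf <- randomizex(mydf, n = 20)                 # assign, do not just print

stopCluster(cl)
```

After this, resdf is an ordinary data frame with columns pc and err that can be used elsewhere in the script.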
Comments
-
ifreak over 1 year
I have a problem with appending values to a data frame using parallel processing.
I have a function that does some calculations, including random sampling, and returns a data frame.
So what I did is:
randomizex <- function(testdf) {
  foreach(ind=1:1000) %dopar% {
    testdf$X = sample(testdf$X, nrow(testdf), replace=FALSE)
    fit = lm(X ~ Y, testdf)
    newdf <- rbind(newdf, data.frame(pc=ind, err=sum(residuals(fit)^2)))
  }
  return(newdf)
}
resdf = randomizex(mydf)
When I view the result of resdf, it's empty. If I replace %dopar% with %do%, the result is calculated correctly, but it's too slow. Is there any way to speed this up?
-
ifreak about 11 years
OK, thank you for your answer, but how can I return the resulting df and use it somewhere else? It's just being printed to stdout.