"un-register" a doParallel cluster

Solution 1

The only official way to "unregister" a foreach backend is to register the sequential backend:

registerDoSEQ()

This makes sense to me because you're supposed to declare which backend to use, so I didn't see any point in providing a way to "undeclare" which backend to use. Instead, you declare that you want to use the sequential backend, which is the default.
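Applied to the scenario from the question, this might look like the following minimal sketch. It assumes a two-worker local cluster; the variable names `parallel_result`, `seq_result`, `backend_name`, and `backend_workers` are illustrative only.

```r
library(doParallel)  # also attaches foreach and parallel

# Start a cluster, register it, and run a loop in parallel
cl <- makeCluster(2)
registerDoParallel(cl)
parallel_result <- foreach(i = 1:3) %dopar% sqrt(i)

# Stop the cluster, then replace the now-dead backend with the sequential one
stopCluster(cl)
registerDoSEQ()

# %dopar% no longer tries to use the stopped cluster; it runs sequentially
seq_result <- foreach(i = 1:3) %dopar% sqrt(i)

# Ask foreach which backend is currently registered
backend_name <- getDoParName()
backend_workers <- getDoParWorkers()
```

The order matters: `registerDoSEQ()` has to be called before the next `%dopar%`, otherwise foreach still points at the stopped cluster.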

I originally considered including an "unregister" function, but I couldn't convince myself that it was useful, so I left it out: it's much easier to add a function later than to remove one.

That said, if you really do want to unregister, I think all you need to do is remove all of the variables from foreach:::.foreachGlobals, which is where foreach keeps all of its state:

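unregister <- function() {
  # .foreachGlobals is the unexported environment that holds the registered backend
  env <- foreach:::.foreachGlobals
  # Remove everything in it so that foreach forgets the backend
  rm(list=ls(name=env), pos=env)
}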

After calling this function, any parallel backend will be deregistered and the warning will be issued again if %dopar% is called.
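A minimal usage sketch, assuming the `unregister()` function defined above is called once the cluster has been stopped and before the next `foreach` loop. Because it reaches into an unexported object via `:::`, it may break if the internals of foreach change. The variable names `registered_after` and `res` are illustrative only.

```r
library(doParallel)  # also attaches foreach and parallel

# Same helper as above: clear foreach's internal state
unregister <- function() {
  env <- foreach:::.foreachGlobals
  rm(list = ls(name = env), pos = env)
}

cl <- makeCluster(2)
registerDoParallel(cl)
stopCluster(cl)

# Call it on its own line, after stopping the cluster and before the next loop
unregister()

# foreach should now report that no parallel backend is registered
registered_after <- getDoParRegistered()

# Falls back to sequential execution (with the usual warning) instead of failing
res <- foreach(i = 1:3) %dopar% sqrt(i)
```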

Solution 2

    cl <- makeCluster(2)
    registerDoParallel(cl)
    on.exit(stopCluster(cl))

This worked fine for me.
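Since `on.exit()` is meant to be used inside a function, this pattern is easiest to see wrapped in one. The sketch below is an assumption about how it might be used, not the answerer's actual code: `par_sqrt` and its `workers` argument are hypothetical, and the `registerDoSEQ()` call is added so that later `%dopar%` calls do not hit the stopped cluster.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical helper: owns the cluster for the duration of one call
par_sqrt <- function(x, workers = 2) {
  cl <- makeCluster(workers)
  registerDoParallel(cl)
  # Runs when the function exits, whether normally or because of an error
  on.exit({
    stopCluster(cl)
    registerDoSEQ()  # leave a working (sequential) backend behind
  }, add = TRUE)
  foreach(i = x, .combine = c) %dopar% sqrt(i)
}

inside <- par_sqrt(1:3)

# The cluster is gone by now, but %dopar% still works sequentially
outside <- foreach(i = 1:3, .combine = c) %dopar% sqrt(i)
```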

Author: Zach

Updated on January 20, 2020

Comments

  • Zach
    Zach over 4 years

    If I run foreach... %dopar% without registering a cluster, foreach raises a warning, and executes the code sequentially:

    library("doParallel")
    foreach(i=1:3) %dopar%
      sqrt(i)
    

    Yields:

    Warning message:
    executing %dopar% sequentially: no parallel backend registered 
    

    However, if I run this same code after starting, registering, and stopping a cluster, it fails:

    cl <- makeCluster(2)
    registerDoParallel(cl)
    stopCluster(cl)
    rm(cl)
    foreach(i=1:3) %dopar%
      sqrt(i)
    

    Yields:

    Error in summary.connection(connection) : invalid connection
    

    Is there an opposite of registerDoParallel() that cleans up the cluster registration? Or am I stuck with the ghost of the old cluster until I re-start my R session?

    /edit: some googling reveals the bumphunter:::foreachCleanup() function in the bumphunter Bioconductor package:

    function () 
    {
        if (exists(".revoDoParCluster", where = doParallel:::.options)) {
            if (!is.null(doParallel:::.options$.revoDoParCluster)) 
                stopCluster(doParallel:::.options$.revoDoParCluster)
            remove(".revoDoParCluster", envir = doParallel:::.options)
        }
    }
    <environment: namespace:bumphunter>
    

    However, this function doesn't seem to fix the problem.

    library(bumphunter)
    cl <- makeCluster(2)
    registerDoParallel(cl)
    stopCluster(cl)
    rm(cl)
    bumphunter:::foreachCleanup()
    foreach(i=1:3) %dopar%
      sqrt(i)
    

    Where does foreach keep the information on the registered cluster?

  • Hong Ooi
    Hong Ooi almost 10 years
    Maybe adding an alias for registerDoSEQ -> unregister would be the way to go?
  • Zach
    Zach almost 10 years
    Perfect, that's what I was looking for. Thank you.
  • Zach
    Zach over 8 years
    If you run stopCluster(cl) and then try to run %dopar%, it will fail. You need to run registerDoSEQ() first.
  • Smit
    Smit over 8 years
    I guess on.exit() takes care of it.
  • Smit
    Smit over 8 years
    on.exit() executes stopCluster(cl) only after %dopar% is done using the parallel backend registered by registerDoParallel(cl).
  • Zach
    Zach over 8 years
    on.exit() basically means stopCluster never gets called during your session. I needed a way to stop the cluster and then keep running %dopar% code.
  • CooperBuckeye05
    CooperBuckeye05 over 2 years
    This is a great solution; I have been dealing with this issue for nearly two years without finding one. I do have a question, however: once I have defined this function, do I run unregister() on a line by itself before the foreach loop, or do I put it inside or even after the loop? Thanks!