multiprocessing.Pool in Jupyter notebook works on Linux but not Windows


I would post this as a comment since I don't have a full answer, but I'll amend as I figure out what is going on.

from multiprocessing import Pool

def f(x):
    return x**2

# The guard keeps child processes from re-running the pool setup
# when they import this module.
if __name__ == '__main__':
    pool = Pool(4)
    for res in pool.map(f, range(20)):
        print(res)

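This works. I believe the answer to this question is here. In short, Windows has no fork, so each child process starts a fresh interpreter and re-imports the main module. Without the if __name__ == '__main__' guard, the subprocesses do not know they are subprocesses and try to run the main script recursively.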

This is the error I get, which points to the same solution:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
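For the notebook case, the worker function also has to be importable by the child processes, and a function defined in a cell lives only in the notebook's __main__. A commonly suggested workaround is to put the function in a regular .py file and import it. Below is a minimal sketch of that idea, assuming Python 3. The module name pool_workers_sketch and the helpers load_worker_module and squares are made up for this illustration, and the temporary directory stands in for wherever you would normally save the file (for example, next to the notebook). I have not verified it on the Windows setup from the question.

```python
import importlib
import os
import sys
import tempfile
from multiprocessing import Pool

# Hypothetical module name, chosen for this sketch only.
MODULE_NAME = "pool_workers_sketch"

# Source of the worker module: the same f(x) as in the example above.
MODULE_SOURCE = "def f(x):\n    return x**2\n"


def load_worker_module(directory):
    """Write the worker module into `directory` and import it."""
    path = os.path.join(directory, MODULE_NAME + ".py")
    with open(path, "w") as handle:
        handle.write(MODULE_SOURCE)
    # Make the directory importable for this process; multiprocessing
    # hands the parent's sys.path to child processes it starts fresh.
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module(MODULE_NAME)


def squares(n, processes=2):
    """Map the imported worker function over range(n) with a Pool."""
    workers = load_worker_module(tempfile.mkdtemp())
    with Pool(processes) as pool:
        # workers.f is pickled by reference (module name + function name),
        # so the children look it up in the module file, not in __main__.
        return pool.map(workers.f, range(n))


if __name__ == "__main__":
    print(squares(5))
```

In a notebook you would normally write the module file once by hand and just do the import plus the Pool call in a cell. The point of the sketch is only that f ends up in a module other than __main__.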
Author: user1999728

Updated on July 22, 2022

Comments

  • user1999728 almost 2 years

    I'm trying to run a few independent computations (all reading from the same data). My code works when I run it on Ubuntu, but not on Windows (Windows Server 2012 R2), where I get the error:

    'module' object has no attribute ...

    when I try to use multiprocessing.Pool. (The error appears in the kernel console, not as output in the notebook itself.)

    (I had already made the mistake of defining the function after creating the pool, and I have since corrected it, so that is not the problem.)

    This happens even on the simplest of examples:

    from multiprocessing import Pool
    def f(x):
        return x**2
    pool = Pool(4)
    for res in pool.map(f,range(20)):
        print res
    

    I know that the worker processes need to be able to import the module that defines the function (and I have no idea how this works in a notebook). I've also heard of IPython.parallel, but I have been unable to find any documentation or examples.

    Any solutions/alternatives would be most welcome.