Python multiprocessing.Pool: AttributeError

Solution 1

Error 1:

AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'

You solved this error yourself by moving the nested target function single() out to the top level of the module.

Background:

Pool needs to pickle (serialize) everything it sends to its worker processes (IPC). Pickling a function only saves its name, and unpickling requires re-importing the function by that name. For that to work, the function must be defined at the top level of a module. Nested functions are not importable by the child, and even trying to pickle them raises an exception.
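A minimal sketch of that behaviour using the pickle module directly, without a Pool. The names top_level and make_nested are made up for illustration, and because the exception type raised for the nested case may differ between Python versions, the sketch catches both candidates.

```python
import pickle


def top_level(x):
    return x * 2


def make_nested():
    # `nested` only exists inside this call, so it cannot be found by name
    def nested(x):
        return x * 2
    return nested


payload = pickle.dumps(top_level)
# The payload stores a reference (module and name), not the function body
name_in_payload = b"top_level" in payload

# Unpickling looks the name up again and returns the very same object
restored = pickle.loads(payload)
same_object = restored is top_level

try:
    pickle.dumps(make_nested())
    nested_picklable = True
except (AttributeError, pickle.PicklingError):
    nested_picklable = False

print(name_in_payload, same_object, nested_picklable)
```

Run as a script, the top-level function round-trips by name while the nested one is rejected before anything is sent anywhere.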


Error 2:

AttributeError: Can't get attribute 'single' on module '__main__' from '.../test.py'

You are starting the pool before you define your function and classes, so the child processes cannot inherit that code. Move the pool creation down to the bottom of the module and protect it with if __name__ == '__main__':

import multiprocessing


class OtherClass:
    def run(self, sentence, graph):
        return False


def single(params):
    other = OtherClass()
    sentences, graph = params
    return [other.run(sentence, graph) for sentence in sentences]


class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        # `pool` is the module-level name bound by the with-block below
        return list(pool.map(single, zip(self.sentences, self.graphs)))


if __name__ == '__main__':  # <- prevents RuntimeError for the 'spawn'
    # and 'forkserver' start methods
    with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
        print(SomeClass().some_method())

Appendix

...I would like to spread the work over all of my cores.

Potentially helpful background on how multiprocessing.Pool splits work into chunks:

Python multiprocessing: understanding logic behind chunksize
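
As a rough illustration of that chunking: when no chunksize is passed, CPython's Pool.map derives one from the input length and the number of workers. The helper default_chunksize below is a hypothetical name that mirrors that calculation as I read it in the CPython source. It is an implementation detail rather than a documented guarantee, so treat it as a sketch.

```python
def default_chunksize(n_items, n_workers):
    """Approximate the chunksize Pool.map picks when none is given."""
    if n_items == 0:
        return 0
    # Aim for roughly four chunks per worker, rounding up
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


print(default_chunksize(100, 4))
print(default_chunksize(10, 3))
```

To override the default, pass an explicit value, for example pool.map(single, items, chunksize=10).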

Solution 2

I accidentally discovered a very nasty workaround. It works as long as you define the function with a def statement: declare the name of the function you want to pass to Pool.map with the global keyword at the beginning of the enclosing method, before the def. But I would not rely on this in serious applications 😉

import multiprocessing
pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)

class OtherClass:
  def run(self, sentence, graph):
    return False

class SomeClass:
  def __init__(self):
    self.sentences = [["Some string"]]
    self.graphs = ["string"]

  def some_method(self):
      global single  # This is ugly, but does the trick XD

      other = OtherClass()

      def single(params):
          sentences, graph = params
          return [other.run(sentence, graph) for sentence in sentences]

      return list(pool.map(single, zip(self.sentences, self.graphs)))


SomeClass().some_method()
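
One way to see why the global declaration changes anything: it makes the nested def bind a module-level name, and the function's __qualname__ no longer contains <locals>, so pickle can refer to it by name. A small sketch without a Pool; outer_plain, outer_global and single_g are made-up names.

```python
def outer_plain():
    def single(x):
        return x
    return single


def outer_global():
    global single_g  # the def below now binds a module-level name

    def single_g(x):
        return x
    return single_g


plain_name = outer_plain().__qualname__
global_name = outer_global().__qualname__

print(plain_name)
print(global_name)
```

Note that every call to outer_global rebinds the module-level name single_g to a fresh function object, which is part of why the trick is fragile.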

Author: Amit

Updated on December 12, 2020

Comments

  • Amit, over 3 years ago

    I have a method inside a class that needs to do a lot of work in a loop, and I would like to spread the work over all of my cores.

    I wrote the following code, which works if I use the normal map(), but with pool.map() it raises an error.

    import multiprocessing
    pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)
    
    class OtherClass:
      def run(sentence, graph):
        return False
    
    class SomeClass:
      def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]
    
      def some_method(self):
          other = OtherClass()
    
          def single(params):
              sentences, graph = params
              return [other.run(sentence, graph) for sentence in sentences]
    
          return list(pool.map(single, zip(self.sentences, self.graphs)))
    
    
    SomeClass().some_method()
    

    Error 1:

    AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'

    Why can't it pickle single()? I even tried to move single() to the global module scope (not inside the class - makes it independent of the context):

    import multiprocessing
    pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)
    
    class OtherClass:
      def run(sentence, graph):
        return False
    
    
    def single(params):
        other = OtherClass()
        sentences, graph = params
        return [other.run(sentence, graph) for sentence in sentences]
    
    class SomeClass:
      def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]
    
      def some_method(self):
          return list(pool.map(single, zip(self.sentences, self.graphs)))
    
    
    SomeClass().some_method()
    

    and I get the following ...

    Error 2:

    AttributeError: Can't get attribute 'single' on module '__main__' from '.../test.py'

  • Amit, over 5 years ago
    Perfect. Thank you, I wasn't aware Pool is placement dependent
  • Nimrod Morag, over 2 years ago
    dear god... is it safe? or will I get some race condition on single across different instances of the class?
  • Marcell Pigniczki, over 2 years ago
    Of course not. Don't even think about using it in production code... xD
  • Jack Westmore, about 2 years ago
    I believe that some of this also applies to multiprocessing.Process.