How to use multiprocessing pool.map with multiple arguments
Solution 1
The answer is version- and situation-dependent. The most general answer for recent versions of Python (3.3 and later) was first described below by J.F. Sebastian.¹ It uses the Pool.starmap method, which accepts a sequence of argument tuples, automatically unpacks the arguments from each tuple, and passes them to the given function:
import multiprocessing
from itertools import product

def merge_names(a, b):
    return '{} & {}'.format(a, b)

if __name__ == '__main__':
    names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.starmap(merge_names, product(names, repeat=2))
    print(results)

# Output: ['Brown & Brown', 'Brown & Wilson', 'Brown & Bartlett', ...
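If you want the arguments paired element-wise from two sequences, rather than the all-against-all pairs that product() builds, zip() produces the argument tuples just as well. A minimal sketch under the same setup (the first/second lists here are illustrative, not from the original):

```python
import multiprocessing

def merge_names(a, b):
    return '{} & {}'.format(a, b)

if __name__ == '__main__':
    first = ['Brown', 'Wilson']
    second = ['Rivera', 'Molloy']
    with multiprocessing.Pool(processes=2) as pool:
        # zip pairs the sequences element-wise, one tuple per call
        results = pool.starmap(merge_names, zip(first, second))
    print(results)  # ['Brown & Rivera', 'Wilson & Molloy']
```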
For earlier versions of Python, you'll need to write a helper function to unpack the arguments explicitly. If you want to use with, you'll also need to write a wrapper to turn Pool into a context manager. (Thanks to muon for pointing this out.)
import multiprocessing
from itertools import product
from contextlib import contextmanager

def merge_names(a, b):
    return '{} & {}'.format(a, b)

def merge_names_unpack(args):
    return merge_names(*args)

@contextmanager
def poolcontext(*args, **kwargs):
    pool = multiprocessing.Pool(*args, **kwargs)
    yield pool
    pool.terminate()

if __name__ == '__main__':
    names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
    with poolcontext(processes=3) as pool:
        results = pool.map(merge_names_unpack, product(names, repeat=2))
    print(results)

# Output: ['Brown & Brown', 'Brown & Wilson', 'Brown & Bartlett', ...
In simpler cases, with a fixed second argument, you can also use partial, but only in Python 2.7+.
import multiprocessing
from functools import partial
from contextlib import contextmanager

@contextmanager
def poolcontext(*args, **kwargs):
    pool = multiprocessing.Pool(*args, **kwargs)
    yield pool
    pool.terminate()

def merge_names(a, b):
    return '{} & {}'.format(a, b)

if __name__ == '__main__':
    names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
    with poolcontext(processes=3) as pool:
        results = pool.map(partial(merge_names, b='Sons'), names)
    print(results)

# Output: ['Brown & Sons', 'Wilson & Sons', 'Bartlett & Sons', ...
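If you need keyword arguments rather than a fixed positional one, the same unpacking trick works with dicts. A hypothetical sketch: merge_names_kwargs is not from any library, just the wrapper pattern above applied with ** instead of *, and the sep parameter is invented for illustration:

```python
import multiprocessing

def merge_names(a, b, sep=' & '):
    return '{}{}{}'.format(a, sep, b)

def merge_names_kwargs(kwargs):
    # pool.map passes one dict per task; ** unpacks it into keyword arguments
    return merge_names(**kwargs)

if __name__ == '__main__':
    jobs = [
        {'a': 'Brown', 'b': 'Wilson'},
        {'a': 'Rivera', 'b': 'Molloy', 'sep': ' / '},
    ]
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(merge_names_kwargs, jobs)
    print(results)  # ['Brown & Wilson', 'Rivera / Molloy']
```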
1. Much of this was inspired by his answer, which should probably have been accepted instead. But since this one is stuck at the top, it seemed best to improve it for future readers.
Solution 2
Is there a variant of pool.map which supports multiple arguments?
Python 3.3 includes the pool.starmap() method:
#!/usr/bin/env python3
from functools import partial
from itertools import repeat
from multiprocessing import Pool, freeze_support

def func(a, b):
    return a + b

def main():
    a_args = [1, 2, 3]
    second_arg = 1
    with Pool() as pool:
        L = pool.starmap(func, [(1, 1), (2, 1), (3, 1)])
        M = pool.starmap(func, zip(a_args, repeat(second_arg)))
        N = pool.map(partial(func, b=second_arg), a_args)
        assert L == M == N

if __name__ == "__main__":
    freeze_support()
    main()
For older versions:
#!/usr/bin/env python2
import itertools
from multiprocessing import Pool, freeze_support

def func(a, b):
    print a, b

def func_star(a_b):
    """Convert `f([1,2])` to `f(1,2)` call."""
    return func(*a_b)

def main():
    pool = Pool()
    a_args = [1, 2, 3]
    second_arg = 1
    pool.map(func_star, itertools.izip(a_args, itertools.repeat(second_arg)))

if __name__ == "__main__":
    freeze_support()
    main()
Output
1 1
2 1
3 1
Notice how itertools.izip() and itertools.repeat() are used here.
Due to the bug mentioned by @unutbu, you can't use functools.partial() or similar capabilities on Python 2.6, so the simple wrapper function func_star() should be defined explicitly. See also the workaround suggested by uptimebox.
Solution 3
I think the below will be better:
def multi_run_wrapper(args):
    return add(*args)

def add(x, y):
    return x + y

if __name__ == "__main__":
    from multiprocessing import Pool
    pool = Pool(4)
    results = pool.map(multi_run_wrapper, [(1, 2), (2, 3), (3, 4)])
    print(results)
Output
[3, 5, 7]
Solution 4
Using Python 3.3+ with pool.starmap():
from multiprocessing.dummy import Pool as ThreadPool

def write(i, x):
    print(i, "---", x)

a = ["1", "2", "3"]
b = ["4", "5", "6"]

pool = ThreadPool(2)
pool.starmap(write, zip(a, b))
pool.close()
pool.join()
Result:
1 --- 4
2 --- 5
3 --- 6
You can also zip() more arguments if you like: zip(a,b,c,d,e)
In case you want to have a constant value passed as an argument:
import itertools
zip(itertools.repeat(constant), a)
In case your function should return something:
results = pool.starmap(write, zip(a, b))
This gives a list with the returned values.
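As a sketch of that, collecting the sums of two zipped lists with the same thread-backed pool (the add function here is illustrative, not from the original answer):

```python
from multiprocessing.dummy import Pool as ThreadPool

def add(x, y):
    return x + y

pool = ThreadPool(2)
# starmap unpacks each zipped tuple into (x, y) and collects the returns
results = pool.starmap(add, zip([1, 2, 3], [4, 5, 6]))
pool.close()
pool.join()
print(results)  # [5, 7, 9]
```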
Solution 5
How to take multiple arguments:
def f1(args):
    a, b, c = args[0], args[1], args[2]
    return a + b + c

if __name__ == "__main__":
    import multiprocessing
    pool = multiprocessing.Pool(4)
    result1 = pool.map(f1, [[1, 2, 3]])
    print(result1)
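On Python 3.3+, a sketch of the same call rewritten with starmap, which lets the function keep separate parameters instead of indexing into a single args list:

```python
import multiprocessing

def f1(a, b, c):
    return a + b + c

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        # each tuple in the list is unpacked into (a, b, c)
        result1 = pool.starmap(f1, [(1, 2, 3)])
    print(result1)  # [6]
```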
user642897
Updated on December 16, 2021

Comments
- user642897 (over 2 years): In the Python multiprocessing library, is there a variant of pool.map which supports multiple arguments?

    import multiprocessing

    text = "test"

    def harvester(text, case):
        X = case[0]
        text + str(X)

    if __name__ == '__main__':
        pool = multiprocessing.Pool(processes=6)
        case = RAW_DATASET
        pool.map(harvester(text, case), case, 1)
        pool.close()
        pool.join()
- senderle (about 13 years): To my surprise, I could make neither partial nor lambda do this. I think it has to do with the strange way that functions are passed to the subprocesses (via pickle).
- unutbu (about 13 years): @senderle: This is a bug in Python 2.6, but it has been fixed as of 2.7: bugs.python.org/issue5228
- Tung Nguyen (almost 8 years): Just simply replace pool.map(harvester(text, case), case, 1) by: pool.apply_async(harvester(text, case), case, 1)
- Ricalsin (about 7 years): @Syrtis_Major, please don't edit OP questions in ways that effectively skew answers that have been previously given. Adding return to harvester() turned @senderle's response into being inaccurate. That does not help future readers.
- H S Rathore (over 4 years): I would say an easy solution would be to pack all the args in a tuple and unpack it in the executing func. I did this when I needed to send complicated multiple args to a func being executed by a pool of processes.
- John Curry (over 3 years): Maybe there is some complexity I am missing for this particular use case, but partial works for my similar use case and is very succinct and easy to use. python.omics.wiki/multiprocessing_map/…
- Björn Pollex (about 13 years): F.: You can unpack the argument tuple in the signature of func_star like this: def func_star((a, b)). Of course, this only works for a fixed number of arguments, but if that is the only case he has, it is more readable.
- jfs (about 13 years): @Space_C0wb0y: the f((a, b)) syntax is deprecated and removed in py3k. And it is unnecessary here.
- xgdgsc (over 10 years): It seems to me that RAW_DATASET in this case should be a global variable? While I want the partial_harvester to change the value of case in every call of harvester(). How to achieve that?
- Mike McKerns (about 9 years): This is a near exact duplicate of the answer from @J.F.Sebastian in 2011 (with 60+ votes).
- user136036 (about 9 years): No. First of all, it removed lots of unnecessary stuff and clearly states it's for Python 3.3+ and is intended for beginners that look for a simple and clean answer. As a beginner myself, it took some time to figure it out that way (yes, with J.F. Sebastian's posts), and this is why I wrote my post to help other beginners, because his post simply said "there is starmap" but did not explain it - this is what my post intends. So there is absolutely no reason to bash me with two downvotes.
- dylam (almost 9 years): Perhaps more pythonic: func = lambda x: func(*x) instead of defining a wrapper function.
- jfs (almost 9 years): @dylam: read the last paragraph in the answer, or try your suggestion on Python 2.6 (it fails).
- WeizhongTu (over 8 years): This is an easy way, but you need to change your original functions. What's more, you may sometimes need to call others' functions, which can't be modified.
- nehem (over 8 years): I will say this sticks to the Python zen. There should be one and only one obvious way to do it. If by chance you are the author of the calling function, then you should use this method; for other cases we can use imotai's method.
- nehem (over 8 years): My choice is to use a tuple, and then immediately unwrap it as the first thing in the first line.
- Emerson Xu (almost 8 years): The most important thing here is assigning the =RAW_DATASET default value to case. Otherwise pool.map will be confused about the multiple arguments.
- Dave (over 7 years): I'm confused, what happened to the text variable in your example? Why is RAW_DATASET seemingly passed twice? I think you might have a typo.
- zthomas.nc (over 7 years): So... the above doesn't work if you are calling a class function within a class (wants self passed as an argument?)
- jfs (over 7 years): @zthomas.nc this question is about how to support multiple arguments for multiprocessing pool.map. If you want to know how to call a method instead of a function in a different Python process via multiprocessing, then ask a separate question (if all else fails, you could always create a global function that wraps the method call, similar to func_star() above).
- Ahmed (over 7 years): Easiest solution. There is a small optimization; remove the wrapper function and unpack args directly in add, so it works for any number of arguments: def add(args): (x, y) = args
- Andre Holzner (about 7 years): You could also use a lambda function instead of defining multi_run_wrapper(..).
- Andre Holzner (about 7 years): Hm... in fact, using a lambda does not work, because pool.map(..) tries to pickle the given function.
- muon (over 6 years): Not sure why using with .. as .. gives me AttributeError: __exit__, but it works fine if I just call pool = Pool(); and then close manually with pool.close() (python2.7).
- senderle (over 6 years): @muon, good catch. It appears Pool objects don't become context managers until Python 3.3. I've added a simple wrapper function that returns a Pool context manager.
- machen (over 6 years): @muon How to use call_back in pool.starmap?
- machen (over 6 years): Does this starmap support a generator function which yields an infinite sequence?
- senderle (over 6 years): @machen, it depends on what you mean. But I wouldn't recommend using infinite generators with multiprocessing unless they are paired with finite generators. For example, you could probably do something like pool.starmap(twoarg_func, zip(finite, infinite)). It's possible that pool.imap and pool.imap_unordered could tolerate infinite generators, but that still sounds like a pretty bad idea to me.
- jfs (over 6 years): @machen starmap supports generators (such as zip() above). It returns a list, and therefore you shouldn't pass it an infinite generator (it will just consume all memory).
- Fábio Dias (about 6 years): Indeed it does, still looking for a better way :(
- Amir (about 6 years): @jfs This is a bit unrelated, but I want to run a function that does not take any arguments in the background, and I have some resource limitations and cannot run the function as many times as I want, and want to queue the extra executions of the function. Do you have any idea how I should do that? I have my question here. Could you please take a look at my question and see if you can give me some hints (or even better, an answer) on how I should do that?
- Tedo Vrbanec (over 5 years): Results are not as expected: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]. I would expect: [0,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,10,2,3,4,5,6,7,8,9,10,11, ...
- Syrtis Major (over 5 years): @TedoVrbanec Results just should be [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]. If you want the latter one, you may use itertools.product instead of zip.
- Константин Ван (about 5 years): I wish there were a starstarmap.
- jfs (about 5 years): @КонстантинВан starstar is to accept an iterable of dicts with parameters?
- Константин Ван (about 5 years): @jfs Right. Keyword arguments.
- ScipioAfricanus (about 5 years): Does the order of arguments in the function call matter?
- Prav001 (over 4 years): Neat and elegant.
- as - if (over 4 years): Why b=233? It defeats the purpose of the question.
- Vivek Subramanian (over 4 years): How do you use this if you want to store the result of add in a list?
- Michael Dorner (over 4 years): @Ahmed I like it how it is, because IMHO the method call should fail whenever the number of parameters is not correct.
- Scott (about 4 years): I want to note that this doesn't address the structure in the original question. [[1,2,3], [4,5,6]] would unpack with starmap to [pow(1,2,3), pow(4,5,6)], not [pow(1,4), pow(2,5), pow(3,6)]. If you don't have good control over the inputs being passed to your function, you may need to restructure them first.
- Mike McKerns (about 4 years): @Scott: ah, I didn't notice that... over 5 years ago. I'll make a small update. Thanks.
- toti (almost 4 years): I don't understand why I have to scroll all the way over here to find the best answer.
- pauljohn32 (almost 4 years): Should zip input vectors. More understandable than transposing an array, don't you think?
- Mike McKerns (almost 4 years): The array transpose, while possibly less clear, should be less expensive.
- Hammad (over 2 years): This answer should literally have been at the top.
- Michael Silverstein (over 2 years): You mean c instead of case here, right?: res = pool.apply_async(harvester, (text, case, q = None))
- Peter Mortensen (over 2 years): An explanation would be in order. E.g., what is the idea/gist? Please respond by editing (changing) your answer, not here in comments (without "Edit:", "Update:", or similar - the answer should appear as if it was written today).
- Peter Mortensen (over 2 years): Still, an explanation would be in order. E.g., what is the idea/gist? What language features does it use, and why? Please respond by editing (changing) your answer, not here in comments (without "Edit:", "Update:", or similar - the answer should appear as if it was written today).
- Peter Mortensen (over 2 years): What do you mean by "a list lists of arguments" (seems incomprehensible)? Preferably, please respond by editing (changing) your answer, not here in comments (without "Edit:", "Update:", or similar - the answer should appear as if it was written today).
- Sean William (about 2 years): Please add pool.close() and pool.join() after getting results = pool.map(...), else this might possibly run forever.
- root-11 (almost 2 years): starmap was the answer I was looking for.