Multiprocessing Python program inside Docker


Solution 1

From https://docs.docker.com/get-started - "Fundamentally, a container is nothing but a running process, with some added encapsulation features applied to it in order to keep it isolated from the host and from other containers."

Docker runs on a host machine. That host machine (or virtual machine) has a certain number of physical (or virtual) CPUs. The reason multiprocessing.cpu_count() displays 8 in your case is that 8 is the number of CPUs your system has. Docker options like --cpus or --cpuset-cpus don't change your machine's hardware, and the hardware is what cpu_count() reports.

On my current system:

# native
$ python -c 'import multiprocessing as mp; print(mp.cpu_count())'
12
# docker
$ docker run -it --rm --cpus 1 --cpuset-cpus 0 python python -c 'import multiprocessing as mp; print(mp.cpu_count())'
12
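As a quick check from inside a container (this snippet is my addition, not part of the original answer): on Linux, os.sched_getaffinity(0) returns the set of CPUs the process is allowed to run on, so unlike multiprocessing.cpu_count() it does respect --cpuset-cpus:

import multiprocessing
import os

# Reports the host's CPU count, regardless of docker limits
print(multiprocessing.cpu_count())

# Reports the CPUs this process may actually run on (Linux-only);
# under `docker run --cpuset-cpus 0-1 ...` this prints 2
print(len(os.sched_getaffinity(0)))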

From https://docs.docker.com/config/containers/resource_constraints/#cpu - "By default, each container’s access to the host machine’s CPU cycles is unlimited." But you can limit containers with options like --cpus or --cpuset-cpus.

--cpus can be a floating-point number up to the number of physical CPUs available. You can think of this number as the numerator in the fraction <--cpus arg>/<physical CPUs>. If you have 8 physical CPUs and you specify --cpus 4, you're telling Docker to use no more than 50% (4/8) of your total CPU time; --cpus 1.5 would use 18.75% (1.5/8).
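The arithmetic is easy to sanity-check with a throwaway snippet (my addition; host_cpus is hard-coded to the 8 CPUs from the example):

# Quick check of the --cpus fractions described above; 8 host CPUs assumed
host_cpus = 8
for cpus_arg in (4, 1.5):
    print(f"--cpus {cpus_arg} -> {cpus_arg / host_cpus:.2%} of total CPU time")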

--cpuset-cpus actually does limit which specific physical/virtual CPUs the container may use.

(And there are many other CPU-related options that are covered in Docker's documentation.)

Here is a smaller code sample:

import logging
import multiprocessing
import sys

import psutil
from joblib.parallel import Parallel, delayed

def get_logger():
    # Configure a stdout handler once per process; each loky worker
    # sets up its own handler on first call, since handlers are not inherited
    logger = logging.getLogger()
    if not logger.hasHandlers():
        handler = logging.StreamHandler(sys.stdout)
        formatter = logging.Formatter("[%(process)d/%(processName)s] %(message)s")
        handler.setFormatter(formatter)
        handler.setLevel(logging.DEBUG)
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    return logger

def fn1(n):
    # psutil.Process().cpu_num() reports which CPU the process is currently running on
    get_logger().debug("fn1(%d); cpu# %d", n, psutil.Process().cpu_num())

if __name__ == "__main__":
    get_logger().debug("main")
    Parallel(n_jobs=multiprocessing.cpu_count())(delayed(fn1)(n) for n in range(1, 101))

Running this both natively and within docker will log lines such as:

[21/LokyProcess-2] fn1(81); cpu# 11
[28/LokyProcess-9] fn1(82); cpu# 6
[29/LokyProcess-10] fn1(83); cpu# 2
[31/LokyProcess-12] fn1(84); cpu# 0
[22/LokyProcess-3] fn1(85); cpu# 3
[23/LokyProcess-4] fn1(86); cpu# 1
[20/LokyProcess-1] fn1(87); cpu# 7
[25/LokyProcess-6] fn1(88); cpu# 3
[27/LokyProcess-8] fn1(89); cpu# 4
[21/LokyProcess-2] fn1(90); cpu# 9
[28/LokyProcess-9] fn1(91); cpu# 10
[26/LokyProcess-7] fn1(92); cpu# 11
[22/LokyProcess-3] fn1(95); cpu# 9
[29/LokyProcess-10] fn1(93); cpu# 2
[24/LokyProcess-5] fn1(94); cpu# 10
[23/LokyProcess-4] fn1(96); cpu# 1
[20/LokyProcess-1] fn1(97); cpu# 9
[23/LokyProcess-4] fn1(98); cpu# 1
[27/LokyProcess-8] fn1(99); cpu# 4
[21/LokyProcess-2] fn1(100); cpu# 5

Notice that all 12 CPUs are in use on my system, and that:

  • the same physical CPU is used by multiple processes (cpu# 3 by processes 22 & 25)
  • one individual process can use multiple CPUs (process 21 uses CPUs 11 & 9)

Running the same program with docker run --cpus 1 ... will still result in all 12 CPUs being used by all 12 processes started, just as if the --cpus argument weren't present. It just limits the percentage of total CPU time Docker is allowed to use.
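If a program needs to discover that time budget from inside the container, it is exposed through the cgroup filesystem. Here is a minimal sketch (my addition, assuming a Linux container; the cgroup v2 path /sys/fs/cgroup/cpu.max and the v1 fallback paths may differ on your setup):

from pathlib import Path

def effective_cpu_budget():
    """Best-effort read of the --cpus budget from cgroup files (Linux)."""
    v2 = Path("/sys/fs/cgroup/cpu.max")
    if v2.exists():
        quota, period = v2.read_text().split()
        return None if quota == "max" else int(quota) / int(period)
    # cgroup v1 layout
    quota = int(Path("/sys/fs/cgroup/cpu/cpu.cfs_quota_us").read_text())
    period = int(Path("/sys/fs/cgroup/cpu/cpu.cfs_period_us").read_text())
    return None if quota < 0 else quota / period

print(effective_cpu_budget())  # e.g. 1.0 under `docker run --cpus 1`; None if unlimited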

Running the same program with docker run --cpuset-cpus 0-1 ... will result in only 2 physical CPUs being used by all 12 processes started:

[11/LokyProcess-2] fn1(35); cpu# 0
[11/LokyProcess-2] fn1(36); cpu# 0
[12/LokyProcess-3] fn1(37); cpu# 1
[11/LokyProcess-2] fn1(38); cpu# 0
[15/LokyProcess-6] fn1(39); cpu# 1
[17/LokyProcess-8] fn1(40); cpu# 0
[11/LokyProcess-2] fn1(41); cpu# 0
[10/LokyProcess-1] fn1(42); cpu# 1
[11/LokyProcess-2] fn1(43); cpu# 1
[13/LokyProcess-4] fn1(44); cpu# 1
[12/LokyProcess-3] fn1(45); cpu# 0
[12/LokyProcess-3] fn1(46); cpu# 1

To answer the statement "they always take only one physical CPU": this is only true if the --cpuset-cpus argument specifies exactly one CPU.


(As a side note: the reason logging is set up the way it is in the example is because of an open bug in joblib.)

Solution 2

multiprocessing.cpu_count() gives 2 on my machine without passing the --cpus option.

Head over to https://docs.docker.com/engine/admin/resource_constraints/#cpu for more information about Docker container resources.

Solution 3

Try creating the machine from scratch (replace the numerical values with your desired ones):

docker-machine rm default
docker-machine create -d virtualbox --virtualbox-cpu-count=8 --virtualbox-memory=8192 --virtualbox-disk-size=10000 default

This is just to be on the safe side. And now the important part:

Specify the number of cores before running your image. The following command will use 8 cores.

docker run -it --cpuset-cpus="0-7" your_image_name

Then check inside the container that you succeeded, not only from Python but also with:

nproc

Good luck, and let us know how it went 😊!

Solution 4

You can test that multiprocessing is working correctly with the following commands:

$ docker run -it --rm ubuntu:20.04
root@somehash:/# apt update && apt install stress
root@somehash:/# stress --cpu 8 # 8 if you have 8 cores

If you have multiple cores, you can run htop or top in another terminal and you should see all the cores busy. If you used htop, you should see something like the following.

[htop screenshot: all cores at 100% while stress runs]
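If you would rather not install stress, a rough Python equivalent of stress --cpu N is one busy loop per core (a sketch of mine, not part of the original answer):

import multiprocessing

def burn():
    # Busy-loop forever to keep one core at 100%
    while True:
        pass

if __name__ == "__main__":
    # One spinning process per CPU visible to Python; watch them in htop/top
    for _ in range(multiprocessing.cpu_count()):
        multiprocessing.Process(target=burn, daemon=True).start()
    input("Press Enter to stop...")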

If you are at this step, then everything is working fine. Furthermore, when I ran the script you provided, I saw my processors being used as they should, as you can see in the image below. (I ran your script inside an ipython terminal to show the processes. I also changed from sklearn.externals.joblib.parallel import Parallel, delayed to from joblib.parallel import Parallel, delayed, because it was not working for me otherwise.)

[htop screenshot: the script's worker processes spread across all cores]

I hope the information provided helps. For other clues, you may want to check your version of Docker.

Solution 5

''' Distributed load among several Docker containers using Python multiprocessing capabilities '''

import random
import time
import subprocess
import queue
from multiprocessing import Pool, Queue, Lock

LOCK = Lock()
TEST_QUEUE = Queue()


class TestWorker(object):
    ''' This class is executed by each worker process, one per container '''

    @staticmethod
    def run_test(container_id, value):
        ''' Operation to be executed for each container '''

        # With shell=True the command should be a single string, not a list
        cmd = 'docker exec -it {0} echo "I am container {0}!, this is message: {1}"' \
                .format(container_id, value)
        process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
        for line in process.stdout:
            print(line.decode('utf-8')[:-2])
        process.wait()

    @staticmethod
    def container(container_id):
        ''' Here we get a value from the shared queue '''

        while not TEST_QUEUE.empty():
            LOCK.acquire()
            try:
                value = TEST_QUEUE.get(block=False)
                time.sleep(0.5)
            except queue.Empty:
                print("Queue empty ):")
                return
            finally:
                # Always release the lock, even when the queue turns out to be
                # empty -- otherwise the remaining workers would deadlock
                LOCK.release()
            print("\nProcessing: {0}\n".format(value))
            TestWorker.run_test(container_id, value)


def master():
    ''' Main controller to set up containers and test values '''

    qty = input("How many containers you want to deploy: ")
    msg_val = input("How many random values you want to send among these containers: ")

    print("\nGenerating test messages...\n")
    for _ in range(int(msg_val)):
        item = random.randint(1000, 9999)
        TEST_QUEUE.put(item)
    ids = []
    for _ in range(int(qty)):
        # Start a detached container; `docker run -d` prints its id on stdout
        container_id = subprocess.run(["docker", "run", "-it", "-d", "centos:7"],
                                      stdout=subprocess.PIPE)
        container_id = container_id.stdout.decode('utf-8')[:-1]
        ids.append(container_id)
    # One worker process per container, all pulling from the shared queue
    pool = Pool(int(qty))
    pool.map(TestWorker.container, ids)
    pool.close()
    pool.join()


if __name__ == '__main__':
    master()
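Note that the script leaves the centos:7 containers running after it finishes. A hypothetical cleanup helper (my addition; the function name is made up) could force-remove them by id:

import subprocess

def cleanup(container_ids):
    # Hypothetical helper, not part of the original script:
    # force-remove every container started by master()
    for cid in container_ids:
        subprocess.run(["docker", "rm", "-f", cid], stdout=subprocess.DEVNULL)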

Comments

  • hanego over 1 year

    I am trying to test multiprocessing for Python inside a Docker container, but even if the processes are created successfully (I have 8 CPUs and 8 processes are created), they always take only one physical CPU. Here is my code:

    from sklearn.externals.joblib.parallel import Parallel, delayed
    import multiprocessing
    import pandas
    import numpy
    from scipy.stats import linregress
    import random
    import logging
    
    def applyParallel(dfGrouped, func):
        retLst = Parallel(n_jobs=multiprocessing.cpu_count())(delayed(func)(group) for name, group in dfGrouped)
        return pandas.concat(retLst)
    
    def compute_regression(df):
        result = {}
    
        (slope,intercept,rvalue,pvalue,stderr) = linregress(df.date,df.value)
        result["slope"] = [slope]
        result["intercept"] = [intercept]
    
        return pandas.DataFrame(result)
    
    if __name__ == '__main__':
        logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        logging.info("start")
        random_list = []
        for i in range(1,10000):
            for j in range(1,100):
                random_list.append({"id":i,"date":j,"value":random.random()})
    
        df = pandas.DataFrame(random_list)
    
        df = applyParallel(df.groupby('id'), compute_regression)
    
        logging.info("end")
    

    I tried multiple Docker options at launch, like --cpus or --cpuset, but it always uses only 1 physical CPU. Is it an issue with Docker, Python, or the OS? My Docker version is 1.13.1.

    The result of the cpu_count():

    >>> import multiprocessing
    >>> multiprocessing.cpu_count()
    8
    

    During the run, here is a top: we can see the main process and the 8 child processes, but I find the percentages weird. [top screenshot]

    And then, if I change to 4 processes, the total amount of CPU used is still the same: [top screenshot with 4 processes]

    • Graham Dumpleton over 6 years
      If you are running Docker on a Mac or Windows, it runs inside of a VM. You need to configure Docker as a whole to allocate more CPUs to that VM. Options to docker run don't override that; you can only use up to as many as the VM is allowed to use.
    • hanego over 6 years
      It's actually running inside Linux :(
    • hansaplast over 6 years
      Can you do a print(multiprocessing.cpu_count()) and add the result to your question?
    • hanego over 6 years
      @hansaplast I added the screenshot
    • hansaplast over 6 years
      Is that from within Docker?
    • hanego over 6 years
      @hansaplast No, from the root machine
    • hansaplast over 6 years
      Can you do that from within Docker? Because that's exactly the question: "how many cores does Docker get to use from the host system?"
    • DCS almost 6 years
      @angelwally Have you solved the problem in the end? I'm experiencing the same thing with the Pathos library.
    • hanego almost 6 years
      @DCS No, in the end I changed my architecture to have multiple Docker containers :/
    • rok over 3 years
      @hanego Can you share the Dockerfile, please? And the result of running multiprocessing.cpu_count() inside the container?
    • hanego over 3 years
      Sorry, I don't have it anymore; I opened this issue almost 3 years ago :/