Multicore and multithreading in IPython Notebook


You can use multiprocessing to let Python use multiple cores. One big caveat: all data you pass between Python processes has to be picklable or passed via inheritance. In addition, Windows spawns a fresh Python interpreter for each process, while Unix systems can fork the existing one. This has notable performance implications on Windows.

A basic example using multiprocessing, from "Python Module of the Week":

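import multiprocessing

def worker():
    """Worker function run in each child process."""
    print('Worker')

if __name__ == '__main__':
    # The guard keeps child processes from re-running this block on import.
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
    # Wait for every child to finish before the parent exits.
    for p in jobs:
        p.join()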

When executed, it outputs:

Worker
Worker
Worker
Worker
Worker

Multiprocessing lets you run independent calculations on different cores, so CPU-bound tasks that need little communication between processes can finish much faster than they would in a single process.
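As a minimal sketch of that idea, the following uses multiprocessing.Pool to spread a CPU-bound function over several processes. The function count_primes and the workload sizes are made up for illustration; they are not from the original answer.

```python
import multiprocessing


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == '__main__':
    limits = [20000, 30000, 40000, 50000]  # arbitrary, hypothetical workloads
    # Pool() starts one worker process per available CPU by default.
    with multiprocessing.Pool() as pool:
        results = pool.map(count_primes, limits)
    for limit, result in zip(limits, results):
        print(limit, result)
```

Each call to count_primes is independent and only a small integer travels between processes, which is the situation where the pickling overhead stays negligible.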

You should also realize that threading in Python does not improve the performance of CPU-bound code. It exists for convenience (such as keeping a GUI responsive during long calculations). The reason is Python's Global Interpreter Lock, or GIL: only one thread can execute Python bytecode at a time, so threads cannot run Python code in parallel on several cores.
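Threads are still useful when the work is mostly waiting, because blocking calls such as time.sleep release the GIL. The sketch below is an illustration under that assumption; blocking_task and run_in_threads are hypothetical names, and the sleep stands in for real I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(task_id, delay=0.2):
    """Stand-in for I/O: time.sleep releases the GIL while it waits."""
    time.sleep(delay)
    return task_id


def run_in_threads(n_tasks=4):
    """Run the tasks concurrently and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as executor:
        results = list(executor.map(blocking_task, range(n_tasks)))
    return results, time.perf_counter() - start


if __name__ == '__main__':
    results, elapsed = run_in_threads()
    print(results, round(elapsed, 2))
```

The waits overlap, so the elapsed time should be close to a single delay rather than the sum of all of them. Replacing the sleep with a pure-Python calculation would be expected to remove that benefit in standard CPython.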

Update February 2018

This is still very much applicable, and will be for the foreseeable future. The CPython implementation uses the following definition for reference counting:

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

Notably, updates to ob_refcnt are not thread-safe, so a global interpreter lock is needed to allow only one thread at a time to work with Python objects; otherwise, data races on the reference count could lead to memory issues.

There are numerous tools that try to side-step the global interpreter lock, in addition to multiprocessing (which on Windows requires a complete new copy of the interpreter rather than a fork, making process startup slow and limiting the gains for short tasks).
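The reference count can be observed from Python with sys.getrefcount. This is only a small illustration of the counter changing as names are bound and unbound, assuming CPython; the absolute numbers vary between versions, so only the differences matter.

```python
import sys

data = []                        # a fresh object that nothing else refers to
before = sys.getrefcount(data)   # may include a temporary reference from the call itself
alias = data                     # binding a second name increments the count
after = sys.getrefcount(data)
del alias                        # dropping the name decrements it again
restored = sys.getrefcount(data)

print(before, after, restored)
```

Every such increment and decrement is a plain read-modify-write on ob_refcnt, which is why two threads doing it at once without a lock could lose an update.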

Cython

Your simplest solution is Cython. Simply cdef a function that uses no Python objects internally, and release the GIL with a with nogil block.

A simple example taken from the documentation, which shows how to release the GIL and temporarily re-acquire it:

from cython.parallel import prange

cdef int func(Py_ssize_t n):
    cdef Py_ssize_t i

    for i in prange(n, nogil=True):
        if i == 8:
            with gil:
                raise Exception()
        elif i == 4:
            break
        elif i == 2:
            return i

Using a Different Interpreter

CPython has a GIL, while Jython and IronPython do not. Be careful, as many C libraries for high-performance computing may not work with IronPython or Jython (SciPy flirted with IronPython support but dropped it long ago, and it will not work on a modern Python version).
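If code needs to behave differently depending on the interpreter, the standard library can report which implementation is running. A minimal sketch, where interpreter_name is a hypothetical helper:

```python
import platform
import sys


def interpreter_name():
    """Return the implementation name, e.g. 'CPython', 'Jython' or 'IronPython'."""
    return platform.python_implementation()


if __name__ == '__main__':
    print(interpreter_name(), sys.version_info[:3])
```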

Using MPI4Py

MPI, or Message Passing Interface, is a high-performance message-passing standard with implementations for languages like C and C++. It allows efficient parallel computations, and MPI4Py provides Python bindings for it. For efficiency, you should prefer passing NumPy arrays, which MPI4Py can send as raw buffers without pickling them.

An example from the MPI4Py documentation is:

from mpi4py import MPI
import numpy

def matvec(comm, A, x):
    m = A.shape[0] # local rows
    p = comm.Get_size()
    xg = numpy.zeros(m*p, dtype='d')
    comm.Allgather([x,  MPI.DOUBLE],
                   [xg, MPI.DOUBLE])
    y = numpy.dot(A, xg)
    return y
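To make the data layout in matvec easier to follow, here is a single-process sketch in plain Python that imitates what happens across ranks. It does not call MPI at all: allgather and local_matvec are hypothetical stand-ins, and the matrix and vector values are invented.

```python
def allgather(local_slices):
    """Simulate Allgather: every rank ends up with all slices concatenated."""
    gathered = [value for part in local_slices for value in part]
    return [list(gathered) for _ in local_slices]


def local_matvec(a_local, x_global):
    """Multiply one rank's block of rows by the full vector."""
    return [sum(a * x for a, x in zip(row, x_global)) for row in a_local]


# Hypothetical data: p = 2 ranks with m = 2 local rows each, so the full
# matrix is 4 x 4 and the full vector has m * p = 4 entries.
a_blocks = [
    [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0]],  # rows held by rank 0
    [[3.0, 0.0, 1.0, 0.0], [0.0, 3.0, 0.0, 1.0]],  # rows held by rank 1
]
x_slices = [[1.0, 2.0], [3.0, 4.0]]                 # slice of x held by each rank

x_on_each_rank = allgather(x_slices)
y_slices = [local_matvec(a, xg) for a, xg in zip(a_blocks, x_on_each_rank)]
print(y_slices)
```

Each rank owns m rows of A and m entries of x, gathers the full x, and produces its own m entries of the result, which mirrors the roles of m, p, xg and y in the function above.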
Author by

Jack_The_Ripper

Updated on July 25, 2022

Comments

  • Jack_The_Ripper
    Jack_The_Ripper almost 2 years

    I am currently using the threading function in python and got the following:

    In [1]:
    import threading
    threading.activeCount()
    
    Out[1]:
    4
    

    Now on my terminal, I use lscpu and learned there are 2 threads per core and I have access to 4 cores:

    kitty@FelineFortress:~$ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                8
    On-line CPU(s) list:   0-7
    Thread(s) per core:    2
    Core(s) per socket:    4
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 60
    Stepping:              3
    CPU MHz:               800.000
    BogoMIPS:              5786.45
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              8192K
    NUMA node0 CPU(s):     0-7
    

    Hence, I should have a lot more than 4 threads to access. Is there a python function I can use to increase the number of cores I am using (with example) to get more than 4 threads? Or even something to type on the terminal when launching ipython notebook like below:

    ipython notebook n_cores=3
    
  • Jack_The_Ripper
    Jack_The_Ripper about 8 years
    Is there a way for me to access multiple threads for each core?
  • Alex Huszagh
    Alex Huszagh about 8 years
    @Jack_The_Ripper, each multiprocessing instance can start its own threads, so yes, technically. But remember, because of the GIL these threads cannot run Python code in parallel, so the threading guidelines aren't "fixed", they're guidelines. Since extra threads do not improve performance, unless you have a very good reason, the practical answer is no. Use multiprocessing to boost performance by doing independent tasks separately, and use multithreading to handle long, non-responsive tasks.
  • Admin
    Admin about 6 years
    @AlexanderHuszagh as this is almost 2 years old, can I kindly ask you to check if this is still applicable. Thank you stackoverflow.com/questions/48932854/…
  • Alex Huszagh
    Alex Huszagh about 6 years
    @Victor, it still is. Python's threading and the GIL have seen numerous improvements, but fundamentally, only one thread can execute at a time since reference counting is still not thread-safe to my knowledge. Your best bet, if you are willing to use Cython, is the nogil statement. There are also many other tools. I'll update my answer.
  • Alex Huszagh
    Alex Huszagh about 6 years
    @Victor Added numerous other tools to side-step the GIL, and an explanation of why the GIL is needed, without writing custom C extensions. Hopefully that helps.
  • Gavriel
    Gavriel over 4 years
    @AlexanderHuszagh how can I pass parameters to the worker? Let's say in your example I'd like to pass i and print: "Worker {i}"
  • Alex Huszagh
    Alex Huszagh over 4 years
    @Gavriel The docs still have an example for that here docs.python.org/2/library/…, and it uses the args keyword, which accepts a tuple of arguments to pass to the function. Please note that if you pass a mutable type, the child receives a copy, so any changes it makes will not be reflected in the original object. If you need to share data, use SyncManager's collections: docs.python.org/3.5/library/…
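The last reply can be sketched as follows: the worker from the answer is extended to take arguments through args, and a list is passed along to show that the child only modifies its own copy. The helper format_message and the list parent_items are hypothetical names added for illustration.

```python
import multiprocessing


def format_message(i):
    """Build the text printed by each worker."""
    return 'Worker {}'.format(i)


def worker(i, items):
    """Print the worker's number, then modify this process's copy of `items`."""
    print(format_message(i))
    items.append(i)  # only changes the child's copy


if __name__ == '__main__':
    parent_items = []
    jobs = []
    for i in range(5):
        # args must be a tuple, even for a single argument: args=(i,)
        p = multiprocessing.Process(target=worker, args=(i, parent_items))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    print(parent_items)  # still empty in the parent
```

The five "Worker" lines can appear in any order, since the processes run independently.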