Python: what are the advantages of async over threads?

Solution 1

This article answers your questions.

TL;DR?

Threading in Python is inefficient because of the GIL (Global Interpreter Lock), which means that multiple threads cannot run in parallel as you would expect on a multi-processor system. Plus, you have to rely on the interpreter to switch between threads, which adds to the inefficiency.
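
As a minimal illustration (not part of the original answer), a CPU-bound function run in two threads takes roughly as long as running it twice in sequence, because the GIL lets only one thread execute Python bytecode at a time. Timings are machine-dependent; this is only a sketch:

```python
# Sketch: the GIL and CPU-bound threads. The point is only that two threads
# give no ~2x speedup for pure Python work.
import threading
import time

def count_down(n):
    while n > 0:
        n -= 1

N = 10_000_000

start = time.perf_counter()
count_down(N)
count_down(N)
print(f"sequential:  {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
print(f"two threads: {time.perf_counter() - start:.2f}s  (not ~2x faster)")
```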

async/asyncio allows concurrency within a single thread. This gives you, as the developer, much more fine-grained control over task switching, and it can give much better performance for concurrent I/O-bound tasks than Python threading.
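
Here is a minimal sketch of what that looks like, assuming simulated I/O (asyncio.sleep stands in for a real network or disk wait, e.g. an HTTP request with an async client):

```python
# Sketch: three "downloads" run concurrently in one thread; total time is ~1s, not ~3s.
import asyncio
import time

async def fake_download(name, seconds):
    await asyncio.sleep(seconds)   # yields to the event loop while "waiting" on I/O
    return f"{name} done"

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_download("a", 1),
        fake_download("b", 1),
        fake_download("c", 1),
    )
    print(results, f"in {time.perf_counter() - start:.1f}s")

asyncio.run(main())
```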

The 3rd approach that you don't mention is multiprocessing. This approach uses processes for concurrency and allows programs to make full use of hardware with multiple cores.
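A short sketch of that approach, with an arbitrary CPU-heavy function standing in for real work:

```python
# Sketch: each worker is a separate process with its own interpreter (and its own GIL),
# so the four jobs can genuinely run on four cores at once.
from multiprocessing import Pool

def cpu_heavy(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":          # required on platforms that spawn new processes
    with Pool(processes=4) as pool:
        results = pool.map(cpu_heavy, [10_000_000] * 4)
    print(results)
```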

Solution 2

Asyncio is a wholly different world, and AFAIK it's Python's answer to node.js, which has done these things since the start. E.g. this official Python doc about asyncio states:

Asynchronous programming is different than classical “sequential” programming

So you'd need to decide if you want to jump into that rabbit hole and learn this terminology. It probably only makes sense if you're faced with network- or disk-heavy tasks. If you are, then e.g. this article claims that Python 3's asyncio might be faster than node.js and close to the performance of Go.

That said: I've not used asyncio yet, so I cannot really comment on this, but I can comment on a few sentences from your question:

And all this async - await syntactic garbage all around the program is only to indicate that this method is able to yield control to the message loop

As far as I can see, you have an initial setup of asyncio, but then the calls have less syntax around them than doing the same things with threads, which you need to start() and join(), probably also check with is_alive(), and for which you need to set up a shared object first to fetch the return value. So: no, asyncio just looks different, and in the end the program will most probably look cleaner than with threads.
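
To make the comparison concrete, here is a rough sketch (the fetch functions are hypothetical placeholders) of how a result comes back in each style:

```python
# Threads: start(), join(), and collect the result via a shared object.
import threading

results = {}

def fetch_blocking(key):
    results[key] = f"value for {key}"   # real code would do blocking I/O here

t = threading.Thread(target=fetch_blocking, args=("x",))
t.start()
t.join()
print(results["x"])

# asyncio: the coroutine simply returns a value, and await hands it to you.
import asyncio

async def fetch_async(key):
    await asyncio.sleep(0)              # real code would await non-blocking I/O here
    return f"value for {key}"

print(asyncio.run(fetch_async("x")))
```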

As I understand it, the only problem with this approach is that threads are expensive

Not really. Starting a new thread is very inexpensive and AFAIK has the same cost as starting a "native thread" in C or Java.
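
A rough micro-benchmark sketch of that claim (numbers are machine-dependent and this is not a rigorous test):

```python
# Sketch: start and join 1000 do-nothing threads and report the per-thread cost.
import threading
import time

def noop():
    pass

start = time.perf_counter()
threads = [threading.Thread(target=noop) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(f"1000 threads started and joined in {elapsed:.3f}s "
      f"(~{elapsed / 1000 * 1e6:.0f} µs per thread)")
```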

it looks like this is the key difference: usual stacks are operating-system stacks, which are expensive; coroutine stacks are just Python structures, which are much cheaper. Is my understanding correct?

Not really. Nothing beats creating OS-level threads; they are cheap. What asyncio is better at is that it needs fewer thread switches. So if you have many concurrent tasks waiting for network or disk, then asyncio would probably speed things up.
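
For a sense of scale, here is a sketch with simulated waits (asyncio.sleep standing in for network or disk I/O): ten thousand concurrent waiters are unremarkable for a single event loop, whereas ten thousand OS threads would cost noticeably more memory and scheduling work.

```python
# Sketch: 10,000 coroutines all "waiting on I/O" at once in a single thread.
import asyncio

async def wait_a_bit(i):
    await asyncio.sleep(0.1)   # pretend this is a slow socket read
    return i

async def main():
    results = await asyncio.gather(*(wait_a_bit(i) for i in range(10_000)))
    print(f"finished {len(results)} concurrent waits")

asyncio.run(main())
```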

Comments

  • lesnik over 3 years

    I've had a hard time trying to understand how and why async functionality works in Python, and I am still not sure I understand everything correctly (especially the 'why' part). Please correct me if I am wrong.

    The purpose of both async methods and threads is to make it possible to process several tasks concurrently.

    The threads approach looks simple and intuitive. If a Python program processes several tasks concurrently, we have a thread (maybe with sub-threads) for each task, and the stack of each thread reflects the current stage of processing of the corresponding task. Everything is straightforward: there are easy-to-use mechanisms to start a new thread and wait for results from it.

    As I understand it, the only problem with this approach is that threads are expensive.

    Another approach is using async coroutines. I can see several inconveniences with this approach; I'll name only a couple of them. We now have two types of methods: usual methods and async methods. 90% of the time the only difference is that you need to remember that this method is async and not forget to use the await keyword when calling it. And yes, you can't call async methods from normal ones. And all this async - await syntactic garbage all around the program is only to indicate that this method is able to yield control to the message loop.

    The threads approach is free of all these inconveniences. But the async - await approach allows processing many more concurrent tasks than the threads approach does. How is that possible?

    For each concurrent task we still have a call stack, only now it is a coroutine call stack. I am not quite sure, but it looks like this is the key difference: usual stacks are operating-system stacks, which are expensive; coroutine stacks are just Python structures, which are much cheaper. Is my understanding correct?

    If this is correct, wouldn't it be better to decouple Python threads/call stacks from OS threads/call stacks to make Python threads cheaper?

    Sorry if this question is stupid. I am sure there are reasons why the async - await approach was selected. I just want to understand these reasons.

    Update:

    For those who think this question is not good and too broad:

    Here is an article, Unyielding, which starts by explaining why threads are bad and advertises the async approach. Its main thesis: threads are evil, because it's too difficult to reason about a routine that may be executed from an arbitrary number of threads concurrently.

    Thanks to Nathaniel J. Smith (author of the Python Trio library), who suggested this link.

    By the way, the arguments in the article are not convincing to me, but they may still be useful.

  • lesnik over 6 years
    No, it's not TL;DR! But I'll need some time to read the article. Thanks for the answer!
  • Dunes over 6 years
    I think you're a bit confused about green threads, especially as you mention the kernel in the same sentence. CPython doesn't use green threads by default (this is what asyncio, gevent, PyPy, et al. are for). CPython uses pthreads or NT threads, depending on the OS. That only one thread is ever interpreting because of the GIL does not make the threads green (it just gives them a green-like property). Parallelism is possible in CPython; it's just the Python bytecode that cannot be parallelised. Any extension library that releases the GIL, like numpy, can run in parallel.
  • hansaplast over 6 years
    @Dunes, yes, now I'm confused. I supposed that Python turns the pthreads into "green threads" in the sense that it takes the scheduler away from the OS and plays scheduler itself, hence introducing the GIL. Is that wrong?
  • hansaplast over 6 years
    btw: removed the "green threads" paragraph for now, as Dunes is most probably correct that I mixed things up there
  • Dunes over 6 years
    It's weird. The scheduling is still managed by the OS. However, because of the GIL, any thread that is scheduled and doesn't hold the lock can't do anything (except for well behaved extension libraries). So both the OS and CPython have to agree on which thread to run (and the OS kind of picks at random). This is the major downside of the GIL. However, threading in CPython is okay for IO as the OS knows not to schedule any thread that is waiting on IO that hasn't finished. (CPython releases the GIL before it makes the system call to do the actual IO).
  • Luke Smith over 6 years
    @lesnik did this answer your question?
  • lesnik over 6 years
    The GIL is a problem, but it doesn't explain why the asyncio approach is better than threads. It explains why threads are not much, much better than asyncio. Yes, because of the GIL (a slight simplification) only one Python thread can be running at a given moment. The situation is the same with asyncio: only one task is running at a given moment. There is one more argument: the thread scheduler (or dispatcher or selector) is not good enough and can select threads which are blocked on an IO operation. I am not sure this is correct (see Dunes' comment on the second answer). The article doesn't answer my question :(.
  • lesnik over 6 years
    Thank you for the answer, but I don't agree with your arguments (at least when you compare code complexity). The analog of x = await get_x() with threads is just x = get_x(): your thread just blocks until the value of x is ready. The complexity of code which implements something like "start several tasks and wait until all of them are ready" is approximately the same with both approaches (see the sketch below).
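
A small sketch of the point in the last comment (get_x is a hypothetical placeholder; the thread version uses concurrent.futures rather than raw threads, which is one common way to "start several tasks and wait for all of them"):

```python
# Sketch: "start several tasks and wait until all of them are ready", in both styles.
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def get_x(i):                  # blocking version: the calling thread simply waits
    time.sleep(0.1)
    return i * 2

async def get_x_async(i):      # async version: awaiting yields to the event loop
    await asyncio.sleep(0.1)
    return i * 2

# Threads: submit the tasks to a pool and wait for all results.
with ThreadPoolExecutor() as pool:
    thread_results = list(pool.map(get_x, range(5)))

# asyncio: gather the coroutines and await all results.
async def main():
    return await asyncio.gather(*(get_x_async(i) for i in range(5)))

async_results = asyncio.run(main())
print(thread_results, async_results)
```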