Python 3.2 - GIL - good/bad?

Solution 1

The best explanation I've seen as to why the GIL sucks is here:

http://www.dabeaz.com/python/GIL.pdf

And the same guy has a presentation on the new GIL here:

http://www.dabeaz.com/python/NewGIL.pdf

If that's all that's been done, it still sucks - just not as badly. Multiple threads will behave better, but multiple cores will still do nothing for you within a single Python process.
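
For example, here is a minimal sketch in the spirit of the countdown benchmark from those slides (the loop size is arbitrary): two CPU-bound threads take about as long as running the same work sequentially, because the GIL lets only one thread execute Python bytecode at a time.

    import threading
    import time

    def count(n):
        # pure-Python CPU-bound loop; the GIL serializes its bytecode execution
        while n > 0:
            n -= 1

    N = 10000000

    start = time.time()
    count(N)
    count(N)
    print("sequential: ", time.time() - start)

    start = time.time()
    t1 = threading.Thread(target=count, args=(N,))
    t2 = threading.Thread(target=count, args=(N,))
    t1.start(); t2.start()
    t1.join(); t2.join()
    print("two threads:", time.time() - start)

On a multi-core box the threaded version is typically no faster than the sequential one, and under the old GIL it could even be noticeably slower.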

Solution 2

Is having a GIL good or bad? (and why).

Neither -- or both. It's necessary for thread synchronization.

Is the new GIL better? If so, how?

Have you run any benchmarks? If not, then perhaps you should (1) run a benchmark, (2) post the benchmark in the question and (3) ask specific questions about the benchmark results.

Discussing the GIL in vague, handwaving ways is largely a waste of time.

Discussing the GIL in the specific context of your benchmark, however, can lead to a solution to your performance problem.
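
As a starting point, here is a rough sketch of the kind of benchmark that makes the discussion concrete (the work function is just a placeholder for your real workload, and the counts are arbitrary):

    import time
    import threading
    import multiprocessing

    def work(n=5000000):
        # placeholder CPU-bound workload; substitute your real code here
        total = 0
        for i in range(n):
            total += i
        return total

    def run_threads(k=4):
        threads = [threading.Thread(target=work) for _ in range(k)]
        for t in threads: t.start()
        for t in threads: t.join()

    def run_processes(k=4):
        procs = [multiprocessing.Process(target=work) for _ in range(k)]
        for p in procs: p.start()
        for p in procs: p.join()

    def timed(label, fn):
        start = time.time()
        fn()
        print(label, round(time.time() - start, 2), "s")

    if __name__ == "__main__":
        timed("sequential x4:", lambda: [work() for _ in range(4)])
        timed("4 threads:    ", run_threads)
        timed("4 processes:  ", run_processes)

Numbers from something like this, run on your own hardware and Python version, are what turn "the GIL is bad" into a question that can actually be answered.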

Question though, why does CPython not just clone the interpreter like Perl does in an attempt to remove the need for the GIL?

Read this: http://perldoc.perl.org/perlthrtut.html

First, Perl originally didn't support threads at all, and older Perl interpreters only had a buggy threading module that didn't work correctly.

Second, only the newer Perl interpreters have this feature (ithreads).

The biggest difference between Perl ithreads and the old 5.005 style threading, or for that matter, to most other threading systems out there, is that by default, no data is shared. When a new Perl thread is created, all the data associated with the current thread is copied to the new thread, and is subsequently private to that new thread!

Since Perl's model (only explicitly shared data) is different from Python's model (all data is shared by default), cloning the interpreter the way Perl does would fundamentally break Python's threads. The Perl thread model is fundamentally different.
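
To make the contrast concrete, here is a small sketch of Python's shared-everything model (the names are illustrative only): every thread sees and mutates the same objects, which is exactly why something like the GIL, or explicit locks, is needed at all.

    import threading

    shared = {"hits": 0}          # one dict, visible to every thread
    lock = threading.Lock()       # needed precisely because the data is shared

    def bump():
        for _ in range(100000):
            with lock:
                shared["hits"] += 1

    threads = [threading.Thread(target=bump) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()

    print(shared["hits"])         # 400000 - all threads mutated the same object

Under the Perl ithreads model described above, each thread would instead start with its own private copy of that dict unless it were explicitly marked as shared.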

Solution 3

Is the new GIL better? If so, how?

Well, it at least replaces opcode-count switching with proper time-based switching. This does not increase overall performance (and could even hurt it because of more frequent switching), but it makes threads more responsive and eliminates cases where ALL threads stay locked out while one of them executes a computation-heavy single opcode (such as a call to an external function that does not release the GIL).
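
Concretely, the old GIL's knob counted bytecodes while the new one counts time; a short sketch of the two settings (defaults as documented for CPython):

    import sys

    # Old GIL: consider a thread switch every N bytecodes (default 100).
    # Still importable in 3.2 but deprecated and no longer has any effect:
    # sys.setcheckinterval(100)

    # New GIL (Python 3.2+): consider a switch roughly every N seconds.
    print(sys.getswitchinterval())   # default is 0.005, i.e. 5 milliseconds
    sys.setswitchinterval(0.001)     # request more frequent switching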

why does CPython not just clone the interpreter like Perl does in an attempt to remove the need for the GIL?

The GIL is a complex issue; it should not be viewed as the Ultimate Evil. It brings us thread safety.

As for Perl: Perl is a) dead, b) too old. The guys at Google are working on bringing LLVM goodies to CPython, which, among other things, will improve GIL behavior (no complete GIL removal yet, though): http://code.google.com/p/unladen-swallow/

Author by JerryK

Updated on June 06, 2022

Comments

  • JerryK
    JerryK almost 2 years

    Python 3.2 ALPHA is out.

    From the Change Log, it appears the GIL has been entirely rewritten.

    A few questions:

    1. Is having a GIL good or bad? (and why).
    2. Is the new GIL better? If so, how?

    UPDATE:

    I'm fairly new to Python, so all of this is new to me, but I do at least understand that the existence of the GIL in CPython is a huge deal.

    Question though, why does CPython not just clone the interpreter like Perl does in an attempt to remove the need for the GIL?

  • detly
    detly almost 14 years
    ...unless you use the multiprocessing module, which is pretty easy to do (see the short sketch after these comments).
  • JerryK
    JerryK almost 14 years
    I'm fairly new to Python but have read enough to at least understand the GIL is a big deal, which is why I'm asking the question.
  • Gabe
    Gabe almost 14 years
    ...but multiprocessing is no good for fine-grained parallelism.
  • user1066101
    user1066101 almost 14 years
    @JerryK: Your original pair of questions were too vague to provide any useful information, which is why I provided a non-answer. Please try to be specific in what you need to know. Vague questions are difficult to answer.
  • user1066101
    user1066101 almost 14 years
    @Gabe: But "fine-grained" parallelism is often over-rated. OS process-level parallelism often works out just fine.
  • Matt Joiner
    Matt Joiner almost 14 years
    JerryK: On the contrary, the GIL is generally not a big deal at all. In the general case it's more useful than painful.
  • Matt Joiner
    Matt Joiner almost 14 years
    +1 for stating a fact: Perl is dead. Please don't bring it back.
  • mpeters
    mpeters about 13 years
    It's "Perl" (the language) or "perl" (the interpreter) but never "PERL".
  • user1066101
    user1066101 about 13 years
    @mpeters: I started using perl in the 90's when it was often written PERL because we still thought it was an acronym. Old habits die hard.
  • Basic
    Basic over 9 years
    @S.Lott Not really. Not if you actually want to do serious work as opposed to being glue for other things doing serious work. Plus multiprocessing has an overhead in terms of system resources. Then again, if you're doing heavy lifting, I suppose you'd avoid Python in the first place.
  • Basic
    Basic over 9 years
    "It's necessary for thread synchronization"? Prove it. It may be how Python chose to handle thread synch but it's not in Java, .Net, C++ or dozens of other languages which multithread perfectly well. Yes, it prevents people who don't know how to use threads from shooting themselves in the foot and keeps the language design simple. It was a design decision, nothing more (and a poor on IMHO)
  • Kobor42
    Kobor42 almost 9 years
    Fine-grained parallelism is bad coding. You should parallelise tasks that are independent.
  • Waelmas
    Waelmas about 8 years
    @Kobor42 Can you substantiate the claim that fine-grained parallelism is bad coding? Or perhaps you mean that fine-grained parallelism is not very efficient currently?
  • Kobor42
    Kobor42 about 8 years
    @Paul Basic mentioned merge sort, and it is a good example of bad fine-grained parallelism. Writing mergesort is easy. Writing a safe and efficient parallelised mergesort is hard and complicated. Fine-grained parallelism always demands a good expert. Maintenance is hard. Bugs arise easily and get fixed with difficulty. Debugging is hard. Threads always die painfully and silently. So thread jobs should be simple and fool-proof. Sorting data, querying data and displaying data are different tasks - make those able to run in parallel.
  • Kobor42
    Kobor42 about 8 years
    @Basic I just answered Paul's question, but it's an answer to your comment too. See above - SO doesn't allow multiple replies.
  • Basic
    Basic about 8 years
    @Kobor42 [Thanks for letting me know about the response] So your approach is that since writing multi-threaded code requires a minimum level of competence, everyone should avoid it? What would you do instead of a merge sort? Just do it on a single thread and wait longer while the other processors are idle? Or multi-process it and live with significant performance loss? Personally, I'd prefer to hire developers who know how to use resources efficiently
  • Kobor42
    Kobor42 about 8 years
    @Basic Please forgive me - I have worked at several big companies, on codebases with millions of lines that started out as "just try if it works", along with a lot of juniors on those projects and tons of mistakes like the ones I described above. Yes. My approach is that since writing good multithreaded code is HARD (not a minimum level of competence), it should be best practice to keep it simple, IF (!!!) you plan to use the code for longer than a week.
  • Honinbo Shusaku
    Honinbo Shusaku over 7 years
    @Basic Since Gabe doesn't seem to be active on SO anymore: If my understanding of fine-grained parallelism is correct (where threads talk to each other a lot), then why is multiprocessing bad for fine-grained parallelism? Because processes generally don't share memory (by default) and are heavier than threads?
  • Basic
    Basic over 7 years
    @Abdul Mostly. Ignoring the additional resource overhead of processes on Windows, passing messages between procs involves a step where data is pickled/unpickled. All processes usually need their own copy of the data set in memory (as no memory is shared), etc, etc... The more granular the work, the greater the overhead and the more wasted resources. Merge sort is a good example of this use case.
  • Basic
    Basic over 7 years
    Sure you can work around some of this using memory mapped files and the like but frankly, the whole thing is a hack to work around python's threading deficiencies. I can't help but feel "it's hard" is just a rationalisation. Can you think of other situations where developers avoid whole capabilities because they're difficult? (as opposed to "because the language is poorly designed to support this capability")
  • Honinbo Shusaku
    Honinbo Shusaku over 7 years
    @Basic Your question is rhetorical to prove a point, but this reminds me also of memory allocation and de-allocation
  • ChrisGuest
    ChrisGuest about 6 years
    Perl seems to be in a superposition of states.
  • Teekin
    Teekin almost 5 years
    @ChrisGuest: That's also probably where it feels most at home.
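
Regarding the multiprocessing suggestion in the comments above, here is a minimal, hedged sketch of what "pretty easy" looks like in practice (the square function and pool size are illustrative only). Note that arguments and return values cross the process boundary by pickling, which is exactly the per-task overhead described in the later comments, so this pays off for coarse-grained work rather than fine-grained work:

    from multiprocessing import Pool

    def square(x):
        # runs in a worker process; x and the result are pickled
        # across the process boundary (the overhead noted above)
        return x * x

    if __name__ == "__main__":
        pool = Pool(processes=4)               # 4 worker processes, each with its own GIL
        results = pool.map(square, range(10))  # coarse-grained map over the inputs
        pool.close()
        pool.join()
        print(results)                         # [0, 1, 4, 9, ..., 81]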