What is "thread local storage" in Python, and why do I need it?

Solution 1

In Python, everything is shared except function-local variables (each function call gets its own set of locals, and threads are always separate function calls). Even then, only the variables themselves (the names that refer to objects) are local to the function; the objects are always global, and anything can refer to them. The Thread object for a particular thread is not special in this regard: if you store it somewhere all threads can reach (such as a global variable), then all threads can access that one Thread object. If you want to atomically modify anything that another thread has access to, you have to protect it with a lock, and all threads must share that very same lock, or it won't be effective.
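A minimal sketch of the point about shared objects and locks. The names `results`, `results_lock` and `worker` are made up for illustration and do not come from the question:

```python
import threading

results = []                      # one list object, reachable from every thread
results_lock = threading.Lock()   # the single lock that all threads share

def worker(n):
    # `n` and `square` are locals: each call (and so each thread) has its own.
    square = n * n
    # The list is the same object in every thread, so guard the mutation.
    with results_lock:
        results.append(square)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 1, 4, 9, 16]
```

The main thread and the five workers all see the same list, which answers the "main thread versus children" part of the question: there is no difference between them.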

If you want actual thread-local storage, that's where threading.local comes in. Attributes of a threading.local object are not shared between threads; each thread sees only the attributes it placed there itself. If you're curious about the implementation, a pure-Python version lives in _threading_local.py in the standard library.
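As a rough illustration of that behaviour, the sketch below has three threads assign the same attribute name and then read it back. The names `seen`, `seen_lock`, `barrier` and `worker` are hypothetical; the barrier is only there to make sure every thread has written before any thread reads:

```python
import threading

data = threading.local()
seen = {}                        # ordinary shared dict, used to collect results
seen_lock = threading.Lock()
barrier = threading.Barrier(3)   # released once all three workers reach it

def worker(value):
    data.v = value               # visible only in the thread that assigned it
    barrier.wait()               # by now every thread has assigned data.v
    with seen_lock:
        seen[value] = data.v     # each thread still reads back its own value

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen.items()))      # [(0, 0), (1, 1), (2, 2)]
print(hasattr(data, "v"))        # False: the main thread never set data.v
```

With a plain global in place of `data`, all three threads would read whichever value was written last.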

Solution 2

Consider the following code:

#!/usr/bin/env python

from time import sleep
from random import random
from threading import Thread, local

data = local()

def bar():
    print("I'm called from", data.v)

def foo():
    bar()

class T(Thread):
    def run(self):
        sleep(random())
        data.v = self.name   # Thread-1 and Thread-2 accordingly
        sleep(1)
        foo()
>>> T().start(); T().start()
I'm called from Thread-2
I'm called from Thread-1

Here threading.local() is used as a quick and dirty way to pass some data from run() to bar() without changing the interface of foo().

Note that using global variables won't do the trick:

#!/usr/bin/env python

from time import sleep
from random import random
from threading import Thread

def bar():
    global v
    print("I'm called from", v)

def foo():
    bar()

class T(Thread):
    def run(self):
        global v
        sleep(random())
        v = self.name   # Thread-1 and Thread-2 accordingly
        sleep(1)
        foo()
>>> T().start(); T().start()
I'm called from Thread-2
I'm called from Thread-2

That said, if you can afford to pass this data through as an argument of foo(), that is the more elegant and better-designed approach:

from threading import Thread

def bar(v):
    print("I'm called from", v)

def foo(v):
    bar(v)

class T(Thread):
    def run(self):
        foo(self.name)

But this is not always possible when using third-party or poorly designed code.

Solution 3

You can create thread local storage using threading.local().

>>> import threading
>>> tls = threading.local()
>>> tls.x = 4
>>> tls.x
4

Data stored on tls is unique to each thread, which helps ensure that unintentional sharing does not occur.
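To see what "unique to each thread" means in practice, here is a small sketch in which the main thread sets tls.x and a second thread looks for it. The `worker` function and the `observed` list are illustrative names, not part of the answer above:

```python
import threading

tls = threading.local()
tls.x = 4                 # set in the main thread only

observed = []             # shared list used to report what the worker saw

def worker():
    # This thread never assigned tls.x, so the attribute does not exist here;
    # a bare tls.x would raise AttributeError, so fall back to a default.
    observed.append(getattr(tls, "x", "not set"))

t = threading.Thread(target=worker)
t.start()
t.join()

print(tls.x)      # 4
print(observed)   # ['not set']
```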

Solution 4

As in most languages with shared-memory threads, every thread in Python has access to the same variables. There's no distinction between the 'main thread' and child threads.

One difference with Python is that CPython's Global Interpreter Lock means only one thread can be running Python code at a time. This isn't much help when it comes to synchronising access, however: all the usual pre-emption issues still apply, and you have to use threading primitives just as in other languages. It does mean you need to reconsider if you were using threads for performance.
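A minimal sketch of the kind of problem the question asks about, assuming a hypothetical shared `counter` incremented by four threads. The lock is what makes the final total reliable; the Global Interpreter Lock alone does not promise that:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # `counter += 1` is a read, an add and a write. A thread switch
        # between those steps could lose an update, so hold the lock.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```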

Solution 5

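It's worth mentioning that threading.local() is not a singleton.

Each call creates a new, independent storage object, so a thread can use as many of them as it likes. It also means that threading.local().x on a freshly created object raises AttributeError, because nothing has been stored on that particular object yet.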
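A short sketch of that independence, using two made-up storage objects named `request_ctx` and `db_ctx`:

```python
import threading

# Two separate thread-local storage objects in the same thread.
request_ctx = threading.local()
db_ctx = threading.local()

request_ctx.name = "request"
db_ctx.name = "db"
request_ctx.user = "alice"

print(request_ctx.name, db_ctx.name)   # request db
print(hasattr(db_ctx, "user"))         # False: attributes are per object
```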

Author: Mike

Updated on December 23, 2021

Comments

  • Mike
    Mike over 2 years

    In Python specifically, how do variables get shared between threads?

    Although I have used threading.Thread before I never really understood or saw examples of how variables got shared. Are they shared between the main thread and the children or only among the children? When would I need to use thread local storage to avoid this sharing?

    I have seen many warnings about synchronizing access to shared data among threads by using locks but I have yet to see a really good example of the problem.

    Thanks in advance!

  • Johann Chang
    Johann Chang almost 6 years
    Can you give more details about the following sentence please? "If you want to atomically modify anything that you didn't just create in this very same thread, and did not store anywhere another thread can get at it, you have to protect it by a lock."
  • Tom Busby
    Tom Busby over 5 years
    @changyuheng: Here is an explanation of what atomic actions are: cs.nott.ac.uk/~psznza/G52CON/lecture4.pdf
  • Johann Chang
    Johann Chang over 5 years
    @TomBusby: If no other thread can get at it, why do we need to protect it with a lock, i.e. why do we need to make the operation atomic?
  • variable
    variable over 4 years
    Please can you give a quick example of: "objects themselves are always global, and anything can refer to them". By refer assume you mean read and not assign/append?
  • user1071847
    user1071847 almost 4 years
    @variable: I think he means values have no scope
  • variable
    variable almost 3 years
    I didn't understand and I am still looking for a simple example to help understand this line from the answer: "objects themselves are always global, and anything can refer to them"
  • kankan256
    kankan256 about 2 years
    @variable In some programming languages values are passed by reference, so you can modify a variable's value in an outer scope (in Python you can imitate this behavior with global and nonlocal); in others they are passed by value (so you can't change the outer scope's value, although you can access it). But in Python everything is an object and variables are references to objects. You have access to the outer-scope object but you can't change it; this is handled by the binding mechanism. If x is bound to 5, id(x) returns the same value inside and outside the function.
  • iperov
    iperov about 2 years
    threading.local().x - attribute error
  • iperov
    iperov about 2 years
    thus this is regular dotdict() and not a thread local storage at all