Thread local storage in Python

-上瘾入骨i

How do I use thread local storage in Python?

What is “thread local storage” in Python, and why do I need it? - This thread appears to be focuse

5 Answers

耶瑟儿～

2020-11-27 11:13
Thread-local storage is useful, for instance, if you have a thread worker pool and each thread needs access to its own resource, such as a network or database connection. Note that the threading module uses regular threads, which all share the process's global data; in standard CPython the global interpreter lock keeps them from running Python bytecode in parallel, so they help mainly with I/O-bound work. The multiprocessing module, by contrast, runs each worker in a separate subprocess, so every global is local to its process and therefore effectively thread-local.

threading module

Here is a simple example:
```
import threading
from threading import current_thread

threadLocal = threading.local()

def hi():
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("Nice to meet you", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

hi(); hi()
```
This will print out:
```
Nice to meet you MainThread
Welcome back MainThread
```
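The example above only ever runs in the main thread. As a minimal sketch of the same idea with several threads (the `worker` function, the `worker-N` thread names, and the `greetings` list are illustrative additions, not part of the original answer), each thread should be greeted as a newcomer exactly once:

```python
import threading

thread_local = threading.local()   # created once, at module level
greetings = []                     # shared list, used only to collect results
greetings_lock = threading.Lock()

def hi():
    # Each thread sees its own 'initialized' attribute on thread_local.
    if getattr(thread_local, "initialized", None) is None:
        message = "Nice to meet you"
        thread_local.initialized = True
    else:
        message = "Welcome back"
    with greetings_lock:
        greetings.append((threading.current_thread().name, message))

def worker():
    hi()
    hi()

threads = [threading.Thread(target=worker, name=f"worker-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Sort by thread name so the printout is stable across runs.
for name, message in sorted(greetings):
    print(message, name)
```

Every worker reports "Nice to meet you" on its first call and "Welcome back" on its second, regardless of how the threads interleave.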
One important thing that is easily overlooked: a threading.local() object only needs to be created once, not once per thread nor once per function call. The global or class level are ideal locations.

Here is why: threading.local() actually creates a new instance each time it is called (just like any factory or class call would), so calling threading.local() multiple times constantly overwrites the original object, which in all likelihood is not what one wants. When any thread accesses an existing threadLocal variable (or whatever it is called), it gets its own private view of that variable.

This won't work as intended:
```
import threading
from threading import current_thread

def wont_work():
    threadLocal = threading.local()  # oops, this creates a new local object on each call!
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("First time for", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

wont_work(); wont_work()
```
This results in the following output:
```
First time for MainThread
First time for MainThread
```
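By contrast, a single module-level threading.local() is what lets a worker pool keep one resource per thread. The following is a rough sketch, assuming an in-memory SQLite database stands in for the "network or database connection" mentioned earlier; `get_connection`, `task`, and `_local` are hypothetical names chosen for illustration:

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # one shared object; each thread sees its own attributes

def get_connection():
    # Lazily open one connection per thread and reuse it on later calls.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(":memory:")
        _local.conn = conn
    return conn

def task(n):
    conn = get_connection()
    (value,) = conn.execute("SELECT ? * ?", (n, n)).fetchone()
    return threading.get_ident(), id(conn), value

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(task, range(12)))

squares = [value for _, _, value in results]

# Group the connection ids seen by each worker thread.
conns_per_thread = {}
for thread_id, conn_id, _ in results:
    conns_per_thread.setdefault(thread_id, set()).add(conn_id)

print(squares)
print({tid: len(ids) for tid, ids in conns_per_thread.items()})
```

Each worker thread ends up with exactly one connection, however many tasks it handles. A real pool would also want to close those connections when the workers shut down, which this sketch leaves out.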
multiprocessing module

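With multiprocessing, every module-level global is effectively local to its worker, because each worker runs in its own process with its own copy of the module's globals.

Consider this example, where the processed counter plays the role of thread-local storage (strictly speaking, it is process-local):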
```
from multiprocessing import Pool
from random import random
from time import sleep
import os

processed=0

def f(x):
    sleep(random())
    global processed
    processed += 1
    print("Processed by %s: %s" % (os.getpid(), processed))
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    print(pool.map(f, range(10)))
```
It will output something like this:
```
Processed by 7636: 1
Processed by 9144: 1
Processed by 5252: 1
Processed by 7636: 2
Processed by 6248: 1
Processed by 5252: 2
Processed by 6248: 2
Processed by 9144: 2
Processed by 7636: 3
Processed by 5252: 3
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```
Of course, the process IDs, the count each process reaches, and the order of the lines will vary from run to run.

情书的邮戳

2020-11-27 11:14

As noted in the question, Alex Martelli gives a solution here. This function allows us to use a factory function to generate a default value for each thread.

```
# Code originally posted by Alex Martelli
# Modified to use standard Python variable name conventions
import threading
threadlocal = threading.local()

def threadlocal_var(varname, factory, *args, **kwargs):
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v
```
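A possible usage sketch for this helper follows. The `worker` function, the `"seen"` variable name, and the `results` dict are made up for illustration; the helper itself is repeated so the snippet runs on its own:

```python
import threading

threadlocal = threading.local()

def threadlocal_var(varname, factory, *args, **kwargs):
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v

results = {}  # hypothetical: thread name -> that thread's list, for inspection

def worker(items):
    for item in items:
        # The first call in each thread runs list(); later calls reuse that list.
        threadlocal_var("seen", list).append(item)
    results[threading.current_thread().name] = threadlocal_var("seen", list)

t1 = threading.Thread(target=worker, args=([1, 2, 3],), name="t1")
t2 = threading.Thread(target=worker, args=(["a", "b"],), name="t2")
t1.start()
t2.start()
t1.join()
t2.join()

print(results)
```

Each thread accumulates only its own items. One caveat that follows from the `is None` check: a factory that legitimately returns None would be called again on every access.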


遥遥无期

2020-11-27 11:17
Thread-local storage can simply be thought of as a namespace (with values accessed via attribute notation). The difference is that each thread transparently gets its own set of attributes/values, so that one thread doesn't see the values from another thread.

Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables. Each one is a separate namespace.

Here's a simple example:
```
import threading

class Worker(threading.Thread):
    ns = threading.local()
    def run(self):
        self.ns.val = 0
        for i in range(5):
            self.ns.val += 1
            print("Thread:", self.name, "value:", self.ns.val)

w1 = Worker()
w2 = Worker()
w1.start()
w2.start()
w1.join()
w2.join()
```
Output:
```
Thread: Thread-1 value: 1
Thread: Thread-2 value: 1
Thread: Thread-1 value: 2
Thread: Thread-2 value: 2
Thread: Thread-1 value: 3
Thread: Thread-2 value: 3
Thread: Thread-1 value: 4
Thread: Thread-2 value: 4
Thread: Thread-1 value: 5
Thread: Thread-2 value: 5
```
Note how each thread maintains its own counter, even though the ns attribute is a class member (and hence shared between the threads).

The same example could have used an instance variable or a local variable, but that wouldn't show much, as there's no sharing then (a dict would work just as well). There are cases where you'd need thread-local storage as instance variables or local variables, but they tend to be relatively rare (and pretty subtle).
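The point that each threading.local instance is its own namespace can be checked with a small sketch like the one below. The names `settings`, `counters`, and `seen_in_worker` are hypothetical:

```python
import threading

settings = threading.local()   # two independent thread-local namespaces
counters = threading.local()

settings.val = "main settings"  # set in the main thread
counters.val = 0

seen_in_worker = {}

def worker():
    # Neither namespace carries over the main thread's attributes.
    seen_in_worker["settings_has_val"] = hasattr(settings, "val")
    seen_in_worker["counters_has_val"] = hasattr(counters, "val")
    counters.val = 100          # visible only inside this thread
    seen_in_worker["counters_val"] = counters.val

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker)
# {'settings_has_val': False, 'counters_has_val': False, 'counters_val': 100}
print(settings.val, counters.val)
# main settings 0
```

The same attribute name `val` lives independently in both namespaces, and the worker's assignment never reaches the main thread's view.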

温柔的废话

2020-11-27 11:22

This is my way of doing thread-local storage across modules/files. The following has been tested in Python 3.5:

```
# fileA.py
import threading
import fileB

def functionOne():
    thread = threading.Thread(target=fileB.functionTwo)
    thread.start()

# fileB.py
import threading
import fileC

def functionTwo():
    currentThread = threading.current_thread()
    dictionary = currentThread.__dict__
    dictionary["localVar1"] = "store here"   # thread-local storage
    fileC.function3()

# fileC.py
import threading

def function3():
    currentThread = threading.current_thread()
    dictionary = currentThread.__dict__
    print(dictionary["localVar1"])           # access thread-local storage
```

In fileA, I start a thread which has a target function in another module/file.

In fileB, I set a local variable I want in that thread.

In fileC, I access the thread local variable of the current thread.

Additionally, you can print the dictionary variable to see the attributes the thread object already carries, such as its args and kwargs.
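For experimenting without three separate files, the same approach can be condensed into one script. This is only a sketch: the `captured` dict and `main_has_it` flag are illustrative additions, and `vars(thread)` is simply another way of reaching `thread.__dict__`:

```python
import threading

captured = {}  # hypothetical holder so the result can be inspected afterwards

def function3():
    # Read the value back from the current thread object's attribute dict.
    captured["value"] = vars(threading.current_thread())["localVar1"]

def functionTwo():
    # Store a value on the Thread object that is running this function.
    vars(threading.current_thread())["localVar1"] = "store here"
    function3()

thread = threading.Thread(target=functionTwo)
thread.start()
thread.join()

# The main thread is a different Thread object, so it has no such entry.
main_has_it = "localVar1" in vars(threading.current_thread())
print(captured["value"], main_has_it)
# store here False
```

Because this writes into the Thread object's own attribute namespace, a key could collide with an attribute the threading module uses internally; threading.local() is the documented mechanism and avoids that risk.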


日久生厌

2020-11-27 11:31
You can also write:
```
import threading
mydata = threading.local()
mydata.x = 1
```
mydata.x will only exist in the current thread.
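A short sketch of what "only in the current thread" means in practice: another thread that reads mydata.x gets an AttributeError. If every thread should start with a default value instead, one option is to subclass threading.local, whose `__init__` is run separately in each thread that uses the object. The `Defaults` class and the `observed` dict below are hypothetical names:

```python
import threading

mydata = threading.local()
mydata.x = 1                     # set in the main thread only

class Defaults(threading.local):
    # __init__ runs once in each thread that touches the object.
    def __init__(self):
        self.x = 0

with_defaults = Defaults()
with_defaults.x = 1              # changes the main thread's copy only

observed = {}

def worker():
    try:
        mydata.x
    except AttributeError:
        observed["plain"] = "no attribute x in this thread"
    observed["subclass"] = with_defaults.x

t = threading.Thread(target=worker)
t.start()
t.join()

print(observed)
# {'plain': 'no attribute x in this thread', 'subclass': 0}
print(mydata.x, with_defaults.x)
# 1 1
```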