python threadsafe object cache

感动是毒 2020-12-29 09:29

I have implemented a Python web server. Each HTTP request spawns a new thread. I need to cache objects in memory, and since it's a web server, I want the cache t

6 answers
  •  小蘑菇 (original poster)
    2020-12-29 10:17

    Point 1. The GIL does not help you here. An example of a (non-thread-safe) cache for something called "stubs" would be:

    stubs = {}
    
    def maybe_new_stub(host):
        """Return the cached stub for host, creating and caching it on first use."""
        if host not in stubs:
            stub = create_new_stub_for_host(host)
            stubs[host] = stub
        return stubs[host]
    

    What can happen is this: Thread 1 calls maybe_new_stub('localhost') and discovers that the key is not in the cache yet. Before it can store anything, execution switches to Thread 2, which calls the same maybe_new_stub('localhost') and also finds the key missing. Consequently, both threads call create_new_stub_for_host, and each puts its own stub into the cache, the second overwriting the first.

    The map itself is protected by the GIL, so we cannot break it by concurrent access. The logic of the cache, however, is not protected, and so we may end up creating two or more stubs, and dropping all except one on the floor.
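    The interleaving described above can be forced deterministically. The following is only a sketch: create_new_stub_for_host here is a hypothetical stand-in for the real factory, and the barrier exists purely to hold the first thread inside the factory until the second one has also passed the "not in" check.

```python
import threading

stubs = {}
created = []  # records every stub that was actually constructed

# Releases only once two threads are inside the factory at the same time.
both_inside = threading.Barrier(2)


def create_new_stub_for_host(host):
    # Hypothetical stand-in for the real, expensive constructor.
    both_inside.wait(timeout=5)
    stub = object()
    created.append(stub)
    return stub


def maybe_new_stub(host):
    # Same unsynchronized check-then-insert logic as in Point 1.
    if host not in stubs:
        stub = create_new_stub_for_host(host)
        stubs[host] = stub
    return stubs[host]


def run_race():
    """Run two threads against an empty cache; return (stubs created, keys cached)."""
    stubs.clear()
    created.clear()
    threads = [
        threading.Thread(target=maybe_new_stub, args=("localhost",))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(created), len(stubs)
```

    Calling run_race() returns (2, 1): two stubs were built, but the dict holds a single entry, so one of them was discarded.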

    Point 2. Depending on the nature of the program, you may not want a global cache. A shared cache forces synchronization between all your threads, and for performance reasons it is good to keep threads as independent as possible. I believe I do need one; you may not.
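    If a global cache turns out not to be needed, one possible alternative is a per-thread cache built on threading.local, which needs no lock at all. This is a sketch under assumptions, not something I have used for the stubs above: create_new_stub_for_host is again a hypothetical stand-in, and the approach only pays off when threads live long enough to reuse their entries (a thread pool, say, rather than one fresh thread per request).

```python
import threading

_local = threading.local()  # each thread sees its own attributes on this object


def create_new_stub_for_host(host):
    # Hypothetical stand-in for the real, expensive constructor.
    return object()


def maybe_new_stub(host):
    """Return the calling thread's stub for host, creating it on first use."""
    stubs = getattr(_local, "stubs", None)
    if stubs is None:
        # First call in this thread: set up its private cache.
        stubs = _local.stubs = {}
    if host not in stubs:
        stubs[host] = create_new_stub_for_host(host)
    return stubs[host]
```

    The price is that every thread builds its own stub for each host, so this trades memory and setup cost for the absence of synchronization.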

    Point 3. You may use a simple lock. I took inspiration from https://codereview.stackexchange.com/questions/160277/implementing-a-thread-safe-lrucache and came up with the following, which I believe is safe for my purposes:

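    import threading

    import grpc
    import cli_pb2_grpc  # generated gRPC module from the author's project

    stubs = {}
    lock = threading.Lock()  # guards both the lookup and the insertion


    def maybe_new_stub(host):
        """Return the cached stub for host, creating and caching it on first use."""
        with lock:
            if host not in stubs:
                # Holding the lock here means only one thread can create the stub.
                channel = grpc.insecure_channel('%s:6666' % host)
                stub = cli_pb2_grpc.BrkStub(channel)
                stubs[host] = stub
            return stubs[host]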
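    To check the locked version without a gRPC server, the same pattern can be exercised with a stand-in factory. This is a minimal sketch: create_new_stub_for_host, created, and run_threads are hypothetical names introduced for the test, and the sleep only widens the window in which a race could otherwise occur.

```python
import threading
import time

stubs = {}
lock = threading.Lock()
created = []  # records every stub that was actually constructed


def create_new_stub_for_host(host):
    # Hypothetical stand-in for building the gRPC channel and stub.
    time.sleep(0.01)
    stub = object()
    created.append(stub)
    return stub


def maybe_new_stub(host):
    # Lookup and insertion happen under one lock, as in Point 3.
    with lock:
        if host not in stubs:
            stubs[host] = create_new_stub_for_host(host)
        return stubs[host]


def run_threads(n=8):
    """Call maybe_new_stub from n threads; return (stubs created, distinct stubs seen)."""
    stubs.clear()
    created.clear()
    results = []

    def worker():
        results.append(maybe_new_stub("localhost"))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(created), len({id(r) for r in results})
```

    Here run_threads() returns (1, 1): one stub is created and every thread receives that same object. Note that the lock is held while the stub is being built, so creation for different hosts is serialized too; that seems acceptable when creation is rare.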
    

    Point 4. It would be best to use an existing library. I haven't found one I am prepared to vouch for yet.
