Lock handler for arbitrary keys

Posted by 删除回忆录丶 on 2019-12-18 11:11:20

Question


I have code which implements a "lock handler" for arbitrary keys. Given a key, it ensures that only one thread at a time can process that key or any key equal to it (which here means making the externalSystem.process(key) call).

So far, I have code like this:

public class MyHandler {
    private final SomeWorkExecutor someWorkExecutor;
    private final ConcurrentHashMap<Key, Lock> lockMap = new ConcurrentHashMap<>();

    public void handle(Key key) {
        // This can lead to OOM as it creates locks without removing them
        Lock keyLock = lockMap.computeIfAbsent( 
            key, (k) -> new ReentrantLock()
        );
        keyLock.lock();
        try {
            someWorkExecutor.process(key);
        } finally {
            keyLock.unlock();
        }
    }
}

I understand that this code can lead to an OutOfMemoryError because nothing ever clears the map.

I am thinking about how to build a map that holds a limited number of elements. When the limit is exceeded, the least recently accessed element should be replaced with the new one (this code should synchronize using the oldest element as the monitor). But I don't know how to get a callback that tells me the limit has been exceeded.

Please share your thoughts.

P.S.

I reread the task and now I see that there is a limitation: the handle method cannot be invoked by more than 8 threads. I don't know how this can help me, but I thought I should mention it.

P.S.2

@Boris the Spider suggested a nice and simple solution:

} finally {
      lockMap.remove(key);
      keyLock.unlock();
}

But Boris later noticed that this code is not thread-safe because it breaks the required behavior.
Consider 3 threads invoked with equal keys:

  1. Thread#1 has acquired the lock and is now just before map.remove(key);
  2. Thread#2 is invoked with an equal key, so it waits for thread#1 to release the lock.
  3. Thread#1 then executes map.remove(key);. After this, thread#3 invokes the handle method. It sees that the lock for this key is absent from the map, so it creates a new lock and acquires it.
  4. Thread#1 releases the lock, and thread#2 acquires it.
    Thus thread#2 and thread#3 can run in parallel for equal keys, which must not be allowed.

To avoid this situation, before clearing the map we would have to block any thread from acquiring the lock until all threads in the wait set have acquired and released it. This looks like fairly complicated synchronization, and it would make the algorithm slow. Maybe we should clear the map from time to time, when its size exceeds some limit.

I have spent a lot of time on this, but unfortunately I have no idea how to achieve it.


Answer 1:


You don't need to try to limit the size to some arbitrary value - as it turns out, you can accomplish this kind of "lock handler" idiom while only storing exactly the number of keys currently locked in the map.

The idea is to use a simple convention: successfully adding the mapping to the map counts as the "lock" operation, and removing it counts as the "unlock" operation. This neatly avoids the issue of removing a mapping while some thread still has it locked and other race conditions.

At this point, the value in the mapping is only used to block other threads who arrive with the same key and need to wait until the mapping is removed.

Here's an example [1] with CountDownLatch rather than Lock as the map value:

public void handle(Key key) throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(1);

    // try to acquire the lock by inserting our latch as a
    // mapping for key        
    while(true) {
        CountDownLatch existing = lockMap.putIfAbsent(key, latch);
        if (existing != null) {
            // there is an existing key, wait on it
            existing.await();
        } else {
            break;
        }
    }

    try {
        externalSystem.process(key);
    } finally {
        lockMap.remove(key);
        latch.countDown();
    }
}

Here, the lifetime of the mapping is only as long as the lock is held. The map will never have more entries than there are concurrent requests for different keys.

The difference with your approach is that the mappings are not "re-used" - each handle call will create a new latch and mapping. Since you are already doing expensive atomic operations, this isn't likely to be much of a slowdown in practice. Another downside is that with many waiting threads, all are woken when the latch counts down, but only one will succeed in putting a new mapping in and hence acquiring the lock - the rest go back to sleep on the new lock.

You could build another version of this which re-uses the mappings when threads come along and wait on an existing mapping. Basically, the unlocking thread just does a "handoff" to one of the waiting threads. Only one mapping will be used for an entire set of threads that wait on the same key - it is handed off to each one in sequence. The size is still bounded because once no more threads are waiting on a given mapping, it is still removed.

To implement that, you replace the CountDownLatch with a map value that can count the number of waiting threads. When a thread does the unlock, it first checks to see if any threads are waiting, and if so wakes one to do the handoff. If no threads are waiting, it "destroys" the object (i.e., sets a flag that the object is no longer in the mapping) and removes it from the map.

You need to do the above manipulations under a proper lock, and there are a few tricky details. In practice I find the short and sweet example above works great.
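The handoff variant described above could be sketched roughly as follows. This is only one possible reading of the idea, not the answerer's code: the class HandoffKeyLock, its Entry fields (held, waiters, dead) and the size() accessor are all names invented for this illustration, and the entry's own monitor is assumed to be the "proper lock". The wake-up is not strictly fair: a newly arriving thread may take the key before the notified waiter runs, in which case the waiter simply waits again.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of the original answer.
final class HandoffKeyLock<K> {

    private static final class Entry {
        boolean held;   // true while some thread owns the key
        int waiters;    // threads currently inside lock() on this entry
        boolean dead;   // true once the entry has been removed from the map
    }

    private final ConcurrentHashMap<K, Entry> entries = new ConcurrentHashMap<>();

    void lock(K key) throws InterruptedException {
        while (true) {
            Entry e = entries.computeIfAbsent(key, k -> new Entry());
            synchronized (e) {
                if (e.dead) {
                    continue; // removed before we got the monitor: retry with a fresh entry
                }
                e.waiters++;
                try {
                    while (e.held) {
                        e.wait(); // releases the monitor until an unlocker notifies
                    }
                    e.held = true;
                    return;
                } finally {
                    e.waiters--;
                    if (!e.held) {
                        // Only reached if we were interrupted while the entry was free:
                        // pass the wake-up on, or retire the entry if nobody is left.
                        if (e.waiters > 0) e.notify(); else retire(key, e);
                    }
                }
            }
        }
    }

    void unlock(K key) {
        Entry e = entries.get(key); // present: a held entry is never removed
        synchronized (e) {
            e.held = false;
            if (e.waiters > 0) {
                e.notify();         // hand the mapping off to one waiting thread
            } else {
                retire(key, e);     // nobody waiting: "destroy" and remove the mapping
            }
        }
    }

    // Caller must hold e's monitor.
    private void retire(K key, Entry e) {
        e.dead = true;
        entries.remove(key, e);
    }

    int size() {
        return entries.size();
    }
}
```

A caller would wrap the work as lock(key); try { ... } finally { unlock(key); }. The dead flag is what lets a thread that looked up an entry just before its removal notice the removal and retry.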


[1] Written on the fly, not compiled and not tested, but the idea works.




Answer 2:


You could rely on the method compute(K key, BiFunction<? super K,? super V,? extends V> remappingFunction) to synchronize calls to your process method for a given key. You no longer even need Lock as the value type of your map, since you don't rely on it anymore.

The idea is to rely on the internal locking mechanism of your ConcurrentHashMap to execute your method. This allows threads to execute the process method in parallel for keys whose hashes do not fall into the same bin. This is equivalent to the approach based on striped locks, except that you don't need an additional third-party library.

The striped-lock approach is interesting because it is very light in terms of memory footprint: you only need a limited number of locks, so the memory needed for them is known and never changes. That is not the case for approaches that use one lock per key (as in your question), so striped locks are generally the better and recommended approach for this kind of need.

So your code could be something like this:

// This creates a ConcurrentHashMap with a default initial table size of 16
// bins. You may provide an initialCapacity and loadFactor to get a different
// table size and thereby increase or reduce the concurrency level of your map.
// NB: We don't care much about the type of the value, so I arbitrarily
// used Void, but it could be any type, such as simply Object.
private final ConcurrentMap<Key, Void> lockMap = new ConcurrentHashMap<>();

public void handle(Key lockKey) {
    // Execute the method process through the remapping Function
    lockMap.compute(
        lockKey,
        (key, value) -> {
            // Execute the process method under the protection of the
            // lock of the bin of hashes corresponding to the key
            someWorkExecutor.process(key);
            // Returns null to keep the Map empty
            return null;
        }
    );
}

NB 1: As we always return null, the map will always be empty, so you will never run out of memory because of this map.

NB 2: As we never assign a value to a given key, please note that this could also be done using the method computeIfAbsent(K key, Function<? super K,? extends V> mappingFunction):
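As a small, self-contained check of NB 1, the sketch below runs a stand-in for process(key) inside compute and confirms that no mapping is left behind. The class ComputeAsLock, the String key type and the counter are assumptions made for this illustration only. One caveat worth keeping in mind: the ConcurrentHashMap Javadoc asks that the remapping function be short and simple, because other updates to the map may be blocked while it runs, so a long-running process call stretches that guidance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names are illustrative, not from the answer.
final class ComputeAsLock {
    private final ConcurrentHashMap<String, Boolean> lockMap = new ConcurrentHashMap<>();
    private final AtomicInteger processed = new AtomicInteger();

    void handle(String key) {
        lockMap.compute(key, (k, v) -> {
            processed.incrementAndGet(); // stand-in for someWorkExecutor.process(k)
            return null;                 // returning null means no mapping is stored
        });
    }

    int processedCount() {
        return processed.get();
    }

    boolean mapIsEmpty() {
        return lockMap.isEmpty();
    }
}
```

After any number of handle calls, mapIsEmpty() is expected to stay true, which is the property NB 1 relies on.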

public void handle(Key lockKey) {
    // Execute the method process through the remapping Function
    lockMap.computeIfAbsent(
        lockKey,
        key -> {
            // Execute the process method under the protection of the
            // lock of the segment of hashes corresponding to the key
            someWorkExecutor.process(key);
            // Returns null to keep the Map empty
            return null;
        }
    );
}

NB 3: Make sure that your process method never calls the handle method for any key, as you would end up with infinite loops (same key) or deadlocks (other keys taken in an inconsistent order). For example, if one thread calls handle(key1) and process internally calls handle(key2), while another thread calls handle(key2) in parallel and process internally calls handle(key1), you will get a deadlock whatever the approach used. This behavior is not specific to this approach; it will occur with any approach.




Answer 3:


One approach is to dispense with the concurrent hash map entirely, and just use a regular HashMap with locking to perform the required manipulation of the map and lock state atomically.

At first glance, this seems to reduce the concurrency of the system, but if we assume that the process(key) call is lengthy relative to the very fast lock manipulations, it works well because the process() calls still run concurrently. Only a small and fixed amount of work occurs in the exclusive critical section.

Here's a sketch:

public class MyHandler {

    private static class LockHolder {
        ReentrantLock lock = new ReentrantLock();
        int refcount = 0;
        void lock(){
            lock.lock();
        }
    } 

    private final SomeWorkExecutor someWorkExecutor;
    private final Lock mapLock = new ReentrantLock();
    private final HashMap<Key, LockHolder> lockMap = new HashMap<>();

    public void handle(Key key) {

        // lock the map
        mapLock.lock();
        LockHolder holder = lockMap.computeIfAbsent(key, k -> new LockHolder());
        // the lock in holder is either unlocked (newly created by us), or an existing lock, let's increment refcount
        holder.refcount++;
        mapLock.unlock();

        holder.lock();

        try {
            someWorkExecutor.process(key);
        } finally {
            mapLock.lock();
            holder.lock.unlock();
            if (--holder.refcount == 0) {
              // no more users, remove lock holder
              lockMap.remove(key);
            }
            mapLock.unlock();
        }
    }
}

We use refcount, which is only manipulated under the shared mapLock to keep track of how many users of the lock there are. Whenever the refcount is zero, we can get rid of the entry as we exit the handler. This approach is nice in that it is fairly easy to reason about and will perform well if the process() call is relatively expensive compared to the locking overhead. Since the map manipulation occurs under a shared lock, it is also straightforward to add additional logic, e.g., keeping some Holder objects in the map, keeping track of statistics, etc.




Answer 4:


Thanks, Ben Mane.
I have found this variant:

public class MyHandler {
    private final int THREAD_COUNT = 8;
    private final int K = 100;
    private final Striped<Lock> striped = Striped.lazyWeakLock(THREAD_COUNT * K);
    private final SomeWorkExecutor someWorkExecutor = new SomeWorkExecutor();

    public void handle(Key key) throws InterruptedException {
        Lock keyLock = striped.get(key);

        keyLock.lock();
        try {
            someWorkExecutor.process(key);
        } finally {
            keyLock.unlock();
        }       
    }
}



Answer 5:


Here's a short and sweet version that leverages the weak version of Guava's Interner class to do the heavy lifting of coming up with a "canonical" object for each key to use as the lock, and implementing weak reference semantics so that unused entries are cleaned up.

public class InternerHandler {
    // Weak interner: canonical keys can be garbage collected once unused
    private final Interner<Key> interner = Interners.newWeakInterner();
    private final SomeWorkExecutor someWorkExecutor = new SomeWorkExecutor();

    public void handle(Key key) throws InterruptedException {
        Key canonKey = interner.intern(key);
        synchronized (canonKey) {
            someWorkExecutor.process(key);
        }       
    }
}

Basically we ask for a canonical canonKey which is equals() to key, and then lock on this canonKey. Everyone will agree on the canonical key and hence all callers that pass equal keys will agree on the object on which to lock.

The weak nature of the Interner means that any time the canonical key isn't being used, the entry can be removed, so you avoid accumulation of entries in the interner. Later, if an equal key again comes in, a new canonical entry is chosen.

The simple code above relies on the built-in monitor to synchronize - but if this doesn't work for you (e.g., it's already used for another purpose) you can include a lock object in the Key class or create a holder object.




Answer 6:


class MyHandler {
    private final Map<Key, Lock> lockMap = Collections.synchronizedMap(new WeakHashMap<>());
    private final SomeWorkExecutor someWorkExecutor = new SomeWorkExecutor();

    public void handle(Key key) throws InterruptedException {
        Lock keyLock = lockMap.computeIfAbsent(key, (k) -> new ReentrantLock()); 
        keyLock.lock();
        try {
            someWorkExecutor.process(key);
        } finally {
            keyLock.unlock();
        }
    }
}



Answer 7:


Creating and removing the lock object for a key each time is a costly operation in terms of performance. When you add or remove a lock from a concurrent map (say, a cache), you have to ensure that putting the object into the cache and removing it again is itself thread-safe. So this does not seem like a good idea, but it can be implemented via ConcurrentHashMap.

The lock striping approach (also used internally by ConcurrentHashMap) is a better approach. The Google Guava docs explain it as follows:

When you want to associate a lock with an object, the key guarantee you need is that if key1.equals(key2), then the lock associated with key1 is the same as the lock associated with key2.

The crudest way to do this is to associate every key with the same lock, which results in the coarsest synchronization possible. On the other hand, you can associate every distinct key with a different lock, but this requires linear memory consumption and concurrency management for the system of locks itself, as new keys are discovered.

Striped allows the programmer to select a number of locks, which are distributed between keys based on their hash code. This allows the programmer to dynamically select a tradeoff between concurrency and memory consumption, while retaining the key invariant that if key1.equals(key2), then striped.get(key1) == striped.get(key2)

code:

// declare globally, e.g. as a class field
Striped<Lock> rwLockStripes = Striped.lock(16);

Lock lock = rwLockStripes.get("key");
lock.lock();
try {
    // do your work here
} finally {
    lock.unlock();
}
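To make the striping idea concrete without Guava, here is a minimal hand-rolled sketch. It is not how Guava's Striped is implemented; the class SimpleStripedLocks and its hash-spreading step are assumptions chosen for illustration. It only shows the key invariant quoted above: equal keys always map to the same lock, and memory use is fixed by the stripe count.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a striped lock; illustrative only.
final class SimpleStripedLocks {
    private final Lock[] stripes;

    SimpleStripedLocks(int stripeCount) {
        stripes = new Lock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    // Equal keys have equal hash codes, so they always select the same stripe.
    // Unequal keys may share a stripe, which costs concurrency but not correctness.
    Lock get(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16); // mix high bits into low bits to spread poor hash codes
        return stripes[Math.floorMod(h, stripes.length)];
    }
}
```

Usage mirrors the Guava snippet: obtain the lock with get(key), call lock(), do the work in a try block, and unlock() in finally.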

The following snippet of code can help in implementing the insertion and removal of the lock.

private ConcurrentHashMap<String, ReentrantLock> caches = new ConcurrentHashMap<>();

public void processWithLock(String key) {
    ReentrantLock lock = findAndGetLock(key);
    lock.lock();
    try {
        // do your work here

    } finally {
        unlockAndClear(key, lock);
    }
}

private void unlockAndClear(String key, ReentrantLock lock) {
    // *** Step 1: Release the lock.
    lock.unlock();
    // *** Step 2: Attempt to remove the lock.
    // This is done by calling computeIfPresent, which only runs if a lock is
    // present in the cache. If the current lock object in the cache is the same
    // instance as 'lock', remove it from the cache. If not, some other thread has
    // succeeded in putting a new lock object, and we can leave the removal of
    // that lock object to that thread.
    caches.computeIfPresent(key, (k, current) -> lock == current ? null : current);

}

private ReentrantLock findAndGetLock(String key) {
    // The merge method gives us access to the previous lock object (if any)
    // and the new lock object together.
    return caches.merge(key, new ReentrantLock(), (older, newer) -> nonNull(older) ? older : newer);
}



Answer 8:


Instead of writing your own, you might try something like JKeyLockManager. From the project's description:

JKeyLockManager provides fine-grained locking with application specific keys.

Example code given on the site:

public class WeatherServiceProxy {
  private final KeyLockManager lockManager = KeyLockManagers.newManager();

  public void updateWeatherData(String cityName, float temperature) {
    lockManager.executeLocked(cityName, () -> delegate.updateWeatherData(cityName, temperature)); 
  }
}



Answer 9:


New values will be added when you call

lockMap.computeIfAbsent()

So you can just check lockMap.size() for the item count.

But how are you going to find the first added item? It would be better to just remove items after you have used them.
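For what it's worth, the standard library can track the oldest entry: LinkedHashMap in access order, combined with its removeEldestEntry hook, gives a "limit exceeded" callback. The sketch below only illustrates that hook; the class BoundedLruMap is a hypothetical name. LinkedHashMap is not thread-safe, so it would need external synchronization, and evicting a lock that some thread still holds would break mutual exclusion, so this alone does not solve the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded map; illustrative only and NOT thread-safe.
final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruMap(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order is least recently used first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, inserting "a" and "b", reading "a", and then inserting "c" evicts "b", because "b" is then the least recently accessed entry.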




Answer 10:


You can use an in-process cache that stores object references, like Caffeine, Guava, EHCache or cache2k. Here is an example of how to build a cache with cache2k:

final Cache<Key, Lock> locks =
  new Cache2kBuilder<Key, Lock>(){}
    .loader(
      new CacheLoader<Key, Lock>() {
        @Override
        public Lock load(Key o) {
          return new ReentrantLock();
        }
      }
    )
    .storeByReference(true)
    .entryCapacity(1000)
    .build();

The usage pattern is as you have in the question:

    Lock keyLock = locks.get(key);
    keyLock.lock();
    try {
        externalSystem.process(key);
    } finally {
        keyLock.unlock();
    }

Since the cache is limited to 1000 entries, locks that are no longer in use are cleaned up automatically.

There is the potential that a lock in use is evicted by the cache if the capacity and the number of threads in the application are mismatched. This solution has worked perfectly for years in our applications. The cache will evict a lock that is in use only when there is a sufficiently long-running task AND the capacity is exceeded. In a real application you always control the number of live threads, e.g. in a web container you would limit the number of processing threads to (for example) 100. So you know that there are never more than 100 locks in use. If this is accounted for, this solution has minimal overhead.

Keep in mind that the locking only works as long as your application runs on a single VM. You may want to take a look at distributed lock managers (DLM). Examples of products that provide distributed locks: Hazelcast, Infinispan, Terracotta, Redis/Redisson.



Source: https://stackoverflow.com/questions/41898355/lock-handler-for-arbitrary-keys
