I have an IDictionary<TKey,TValue> implementation that internally holds n other Dictionary<TKey, TValue> instances and distributes insertions across the individual sub-dictionaries by the hash code of the key. With 16 sub-dictionaries, the number of collisions is pretty low on a 4-core machine.
For parallel insertions, I locked the Add method with a ReaderWriterLockSlim, locking only the individual sub-dictionary:
    public void Add(TKey key, TValue value)
    {
        int poolIndex = GetPoolIndex(key);
        this.locks[poolIndex].EnterWriteLock();
        try
        {
            this.pools[poolIndex].Add(key, value);
        }
        finally
        {
            this.locks[poolIndex].ExitWriteLock();
        }
    }
When inserting items with four threads, I only got about 32% CPU usage and bad performance. So I replaced the ReaderWriterLockSlim with a Monitor (i.e., the lock keyword).
CPU usage was now at nearly 100% and performance more than doubled.
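For comparison, the Monitor-based variant might look roughly like the sketch below. The class name ShardedDictionary, the TryGetValue and Count members, and the body of GetPoolIndex are assumptions made for illustration; the question only shows Add and does not say how the pool index is computed.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: a dictionary split into sub-dictionaries ("pools"),
// each guarded by its own Monitor through the lock keyword.
public sealed class ShardedDictionary<TKey, TValue> where TKey : notnull
{
    private readonly Dictionary<TKey, TValue>[] pools;
    private readonly object[] locks;

    public ShardedDictionary(int poolCount = 16)
    {
        pools = new Dictionary<TKey, TValue>[poolCount];
        locks = new object[poolCount];
        for (int i = 0; i < poolCount; i++)
        {
            pools[i] = new Dictionary<TKey, TValue>();
            locks[i] = new object();
        }
    }

    // Assumed mapping: clear the sign bit, then take the remainder.
    private int GetPoolIndex(TKey key)
    {
        return (key.GetHashCode() & 0x7FFFFFFF) % pools.Length;
    }

    public void Add(TKey key, TValue value)
    {
        int poolIndex = GetPoolIndex(key);
        lock (locks[poolIndex]) // Monitor.Enter / Monitor.Exit under the hood
        {
            pools[poolIndex].Add(key, value);
        }
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        int poolIndex = GetPoolIndex(key);
        lock (locks[poolIndex])
        {
            return pools[poolIndex].TryGetValue(key, out value);
        }
    }

    // Sums the pool sizes, taking each pool's lock in turn.
    public int Count
    {
        get
        {
            int total = 0;
            for (int i = 0; i < pools.Length; i++)
            {
                lock (locks[i]) { total += pools[i].Count; }
            }
            return total;
        }
    }
}
```

Only the pool that owns the key is locked, so threads inserting keys that map to different pools do not block each other.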
My question is: Why did the CPU usage increase? The number of collisions should not have changed. What makes ReaderWriterLockSlim.EnterWriteLock wait so many times?
For a write-only load, Monitor is cheaper than ReaderWriterLockSlim. However, if you simulate a read + write load where reads greatly outnumber writes, ReaderWriterLockSlim should outperform Monitor.
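To make the read-heavy case concrete, here is a minimal sketch of a single pool where lookups take the shared read lock and only Add takes the exclusive write lock. The class name ReadMostlyPool and its members are hypothetical; whether this actually beats a plain lock depends on the workload and should be measured.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical single pool guarded by a ReaderWriterLockSlim.
public sealed class ReadMostlyPool<TKey, TValue> where TKey : notnull
{
    private readonly Dictionary<TKey, TValue> items = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();

    // Writers are exclusive: no reader and no other writer may hold the lock.
    public void Add(TKey key, TValue value)
    {
        rwLock.EnterWriteLock();
        try
        {
            items.Add(key, value);
        }
        finally
        {
            rwLock.ExitWriteLock();
        }
    }

    // Readers share the lock, so concurrent lookups do not block each other.
    public bool TryGetValue(TKey key, out TValue value)
    {
        rwLock.EnterReadLock();
        try
        {
            return items.TryGetValue(key, out value);
        }
        finally
        {
            rwLock.ExitReadLock();
        }
    }
}
```

With a write-only load like the one in the question, every caller goes through EnterWriteLock, so the shared read path is never used and the lock's extra bookkeeping buys nothing.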
I'm no guru, but my guess is that RWLS is geared more towards heavy contention (e.g., hundreds of threads), whereas Monitor is better suited to one-off synchronization issues.
Personally, I use a TimerLock class that uses Monitor.TryEnter with a timeout parameter.
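The TimerLock class itself is not shown, so the following is only a guess at what such a helper could look like, not the class referred to above. It assumes a disposable wrapper, here called TimedLock, built on Monitor.TryEnter(object, TimeSpan), which returns false if the lock could not be taken within the timeout.

```csharp
using System;
using System.Threading;

// Hypothetical helper: acquires a Monitor with a timeout and releases it on Dispose.
public sealed class TimedLock : IDisposable
{
    private readonly object target;
    private bool held;

    private TimedLock(object target)
    {
        this.target = target;
        this.held = true;
    }

    // Throws TimeoutException instead of blocking forever on a contended lock.
    public static TimedLock Acquire(object target, TimeSpan timeout)
    {
        if (!Monitor.TryEnter(target, timeout))
        {
            throw new TimeoutException("Could not acquire the lock within the timeout.");
        }
        return new TimedLock(target);
    }

    // Must be called on the thread that acquired the lock.
    public void Dispose()
    {
        if (held)
        {
            held = false;
            Monitor.Exit(target);
        }
    }
}
```

A caller would wrap the critical section in a using statement, for example `using (TimedLock.Acquire(sync, TimeSpan.FromSeconds(1))) { ... }`, so that a suspected deadlock surfaces as an exception rather than a hung thread.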
How do you know what caused the bad performance? You can't just guess; the only way to find out is to do some kind of profiling.
How do you handle locking for the parent collection, or is it constant?
Maybe you need to add some debug output and see what really happens?
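One cheap form of such debug output is to count how often a lock is already taken when a thread asks for it. The sketch below is an assumption about how that could be done, with a hypothetical CountingLock wrapper around Monitor; it is no substitute for a real profiler, and the counters themselves add a little overhead.

```csharp
using System.Threading;

// Hypothetical diagnostic wrapper: counts total and contended acquisitions.
public sealed class CountingLock
{
    private readonly object sync = new object();
    private long acquisitions;
    private long contended;

    public long Acquisitions { get { return Interlocked.Read(ref acquisitions); } }
    public long Contended { get { return Interlocked.Read(ref contended); } }

    public void Enter()
    {
        // TryEnter without a timeout returns immediately; false means
        // another thread currently holds the lock.
        if (!Monitor.TryEnter(sync))
        {
            Interlocked.Increment(ref contended);
            Monitor.Enter(sync); // now block until the lock is free
        }
        Interlocked.Increment(ref acquisitions);
    }

    public void Exit()
    {
        Monitor.Exit(sync);
    }
}
```

Printing Contended and Acquisitions per sub-dictionary after a test run would show whether the inserts really spread evenly across the pools or pile up on a few of them.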
Source: https://stackoverflow.com/questions/407238/readerwriterlockslim-vs-monitor