Question
The app needs to load data and cache it for a period of time. I would expect that if multiple parts of the app access the same cache key at the same time, the cache would be smart enough to load the data only once and return the result of that call to all callers. However, MemoryCache does not do this. If you hit the cache in parallel (which often happens in the app), it creates a task for each attempt to get the cache value. I thought that the code below would achieve the desired result, but it doesn't. I would expect the cache to run only one GetDataAsync task, wait for it to complete, and use its result for the other calls.
using Microsoft.Extensions.Caching.Memory;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace ConsoleApp4
{
    class Program
    {
        private const string Key = "1";
        private static int number = 0;

        static async Task Main(string[] args)
        {
            var memoryCache = new MemoryCache(new MemoryCacheOptions { });
            var tasks = new List<Task>();
            tasks.Add(memoryCache.GetOrCreateAsync(Key, (cacheEntry) => GetDataAsync()));
            tasks.Add(memoryCache.GetOrCreateAsync(Key, (cacheEntry) => GetDataAsync()));
            tasks.Add(memoryCache.GetOrCreateAsync(Key, (cacheEntry) => GetDataAsync()));
            await Task.WhenAll(tasks);
            Console.WriteLine($"The cached value was: {memoryCache.Get(Key)}");
        }

        public static async Task<int> GetDataAsync()
        {
            // Simulate getting a large chunk of data from the database
            await Task.Delay(3000);
            number++;
            Console.WriteLine(number);
            return number;
        }
    }
}
That's not what happens. The above displays these results (not necessarily in this order):
2
1
3
The cached value was: 3
It creates a task for each cache request and discards the values returned from the other two.
This needlessly wastes time, and it makes me wonder whether this class can even be called thread-safe. ConcurrentDictionary has the same behaviour: I tested it and the same thing happens.
Is there a way to achieve the desired behaviour where the task doesn't run 3 times?
Answer 1:
There are different solutions available, the most famous of which is probably LazyCache: it's a great library.
Another one that you may find useful is FusionCache ⚡🦥, which I recently released: it has the exact same feature (although implemented differently) and much more.
The feature you are looking for is described here and you can use it like this:
var result = await fusionCache.GetOrSetAsync(
    Key,
    // "await" is only valid inside an async lambda, so return the task directly
    _ => GetDataAsync(),
    TimeSpan.FromMinutes(2)
);
You may also find some of the other features interesting, like fail-safe, advanced timeouts with background factory completion and support for an optional, distributed 2nd level.
If you give it a chance, please let me know what you think.
/shameless-plug
Answer 2:
MemoryCache leaves it to you to decide how to handle races to populate a cache key. In your case you don't want multiple threads to compete to populate a key, presumably because it's expensive to do that.
To coordinate the work of multiple threads like that you need a lock, but using a C# lock statement in asynchronous code can lead to thread pool starvation. Fortunately, SemaphoreSlim provides a way to do async locking, so it becomes a matter of creating a guarded memory cache that wraps an underlying IMemoryCache.
My first solution had only a single semaphore for the entire cache, which put all cache population tasks in a single line. That isn't very smart, so here instead is a more elaborate solution with a semaphore for each cache key. Another solution could be to have a fixed number of semaphores picked by a hash of the key.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

sealed class GuardedMemoryCache : IDisposable
{
    readonly IMemoryCache cache;
    readonly ConcurrentDictionary<object, SemaphoreSlim> semaphores = new();

    public GuardedMemoryCache(IMemoryCache cache) => this.cache = cache;

    public async Task<TItem> GetOrCreateAsync<TItem>(object key, Func<ICacheEntry, Task<TItem>> factory)
    {
        var semaphore = GetSemaphore(key);
        await semaphore.WaitAsync();
        try
        {
            // Only one caller per key gets here at a time; later callers find the cached value.
            return await cache.GetOrCreateAsync(key, factory);
        }
        finally
        {
            semaphore.Release();
            RemoveSemaphore(key);
        }
    }

    public object Get(object key) => cache.Get(key);

    public void Dispose()
    {
        // Dispose (not Release) whatever semaphores are still registered.
        foreach (var semaphore in semaphores.Values)
            semaphore.Dispose();
    }

    SemaphoreSlim GetSemaphore(object key) => semaphores.GetOrAdd(key, _ => new SemaphoreSlim(1));

    void RemoveSemaphore(object key)
    {
        // Only drop the dictionary entry. Other callers may still be waiting on this
        // semaphore, and Release() throws ObjectDisposedException once it is disposed.
        semaphores.TryRemove(key, out _);
    }
}
If multiple threads try to populate the same cache key only a single thread will actually do it. The other threads will instead return the value that was created.
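The alternative mentioned earlier, a fixed number of semaphores picked by a hash of the key, could look roughly like the sketch below. The class name StripedGuardedMemoryCache and the default of 16 stripes are assumptions made for illustration; they are not part of the original answer or of any library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical name: a sketch of the "fixed number of semaphores" idea.
sealed class StripedGuardedMemoryCache
{
    readonly IMemoryCache cache;
    readonly SemaphoreSlim[] stripes;

    // stripeCount = 16 is an arbitrary illustrative default.
    public StripedGuardedMemoryCache(IMemoryCache cache, int stripeCount = 16)
    {
        this.cache = cache;
        stripes = new SemaphoreSlim[stripeCount];
        for (var i = 0; i < stripes.Length; i++)
            stripes[i] = new SemaphoreSlim(1, 1);
    }

    public async Task<TItem> GetOrCreateAsync<TItem>(object key, Func<ICacheEntry, Task<TItem>> factory)
    {
        // Mask off the sign bit so the stripe index is never negative.
        var index = (key.GetHashCode() & int.MaxValue) % stripes.Length;
        var semaphore = stripes[index];

        await semaphore.WaitAsync();
        try
        {
            // Callers for the same key always share a stripe, so they run one at a
            // time and every caller after the first finds the cached value.
            return await cache.GetOrCreateAsync(key, factory);
        }
        finally
        {
            semaphore.Release();
        }
    }
}
```

The trade-off is that unrelated keys that hash to the same stripe also wait for each other, but in exchange there is no per-key dictionary to grow or clean up.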
Assuming that you use dependency injection, you can let GuardedMemoryCache implement IMemoryCache by adding a few more methods that forward to the underlying cache. That lets you modify the caching behavior throughout your application with very few code changes.
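A minimal sketch of what those forwarding methods might look like follows. The class name ForwardingGuardedMemoryCache is hypothetical, and the sketch assumes the wrapper owns the inner cache and may therefore dispose it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical variant of the wrapper above that also implements IMemoryCache.
sealed class ForwardingGuardedMemoryCache : IMemoryCache
{
    readonly IMemoryCache cache;
    readonly ConcurrentDictionary<object, SemaphoreSlim> semaphores = new();

    public ForwardingGuardedMemoryCache(IMemoryCache cache) => this.cache = cache;

    // IMemoryCache members: forward straight to the wrapped cache.
    public ICacheEntry CreateEntry(object key) => cache.CreateEntry(key);
    public void Remove(object key) => cache.Remove(key);
    public bool TryGetValue(object key, out object value) => cache.TryGetValue(key, out value);

    // Assumption: the wrapper owns the inner cache, so disposing one disposes the other.
    public void Dispose() => cache.Dispose();

    // Guarded instance method, same per-key semaphore idea as GuardedMemoryCache.
    public async Task<TItem> GetOrCreateAsync<TItem>(object key, Func<ICacheEntry, Task<TItem>> factory)
    {
        var semaphore = semaphores.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await semaphore.WaitAsync();
        try
        {
            return await cache.GetOrCreateAsync(key, factory);
        }
        finally
        {
            semaphore.Release();
            semaphores.TryRemove(key, out _);
        }
    }
}
```

One thing to keep in mind: GetOrCreateAsync on IMemoryCache is an extension method, so code that holds the wrapper through an IMemoryCache-typed reference binds to the unguarded extension at compile time. Callers only get the guarded version when they depend on the concrete type, or on a custom interface that declares the method.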
Answer 3:
Here is a custom extension method GetOrCreateExclusiveAsync, similar to the native IMemoryCache.GetOrCreateAsync, that prevents concurrent invocations of the supplied asynchronous lambda under normal conditions. The intention is to enhance the efficiency of the caching mechanism under heavy usage. There is still the possibility for concurrency to occur, so this is not a substitute for thread synchronization (if needed).
This implementation also evicts faulted tasks from the cache, so that failed asynchronous operations are subsequently retried.
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Primitives;

// Extension method: declare it inside a static class.
/// <summary>
/// Returns an entry from the cache, or creates a new cache entry using the
/// specified asynchronous factory method. Concurrent invocations are prevented,
/// unless the entry is evicted before the completion of the delegate. The errors
/// of failed invocations are not cached.
/// </summary>
public static Task<T> GetOrCreateExclusiveAsync<T>(this IMemoryCache cache, object key,
    Func<Task<T>> factory, MemoryCacheEntryOptions options = null)
{
    if (!cache.TryGetValue(key, out Task<T> task))
    {
        var entry = cache.CreateEntry(key);
        if (options != null) entry.SetOptions(options);
        var cts = new CancellationTokenSource();
        // Cold nested task: the factory does not run until the task is started below.
        var newTaskTask = new Task<Task<T>>(async () =>
        {
            try { return await factory().ConfigureAwait(false); }
            catch { cts.Cancel(); throw; } // Cancelling the token evicts the faulted entry
            finally { cts.Dispose(); }
        });
        var newTask = newTaskTask.Unwrap();
        entry.ExpirationTokens.Add(new CancellationChangeToken(cts.Token));
        entry.Value = newTask;
        entry.Dispose(); // The Dispose actually inserts the entry in the cache
        if (!cache.TryGetValue(key, out task)) task = newTask;
        if (task == newTask)
            newTaskTask.RunSynchronously(TaskScheduler.Default); // Our entry won: start the factory
        else
            cts.Dispose(); // Another caller's entry won: ours is never started
    }
    return task;
}
Usage example:
var cache = new MemoryCache(new MemoryCacheOptions());
string html = await cache.GetOrCreateExclusiveAsync(url, async () =>
{
    return await httpClient.GetStringAsync(url);
}, new MemoryCacheEntryOptions().SetAbsoluteExpiration(TimeSpan.FromMinutes(10)));
This implementation uses nested tasks (Task<Task<T>>) instead of lazy tasks (Lazy<Task<T>>) internally as wrappers, because the latter construct is susceptible to deadlocks under some conditions.
Reference: Lazy<Task> with asynchronous initialization, VSTHRD011 Use AsyncLazy.
Related API suggestion on GitHub: GetOrCreateExclusive() and GetOrCreateExclusiveAsync(): Exclusive versions of GetOrCreate() and GetOrCreateAsync()
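For comparison, the Lazy<Task<T>> pattern that this answer chooses to avoid usually looks something like the sketch below. LazyTaskCache is a hypothetical helper written for illustration, not part of any library, and it deliberately leaves out expiration and the eviction of faulted tasks.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper illustrating the Lazy<Task<T>> approach.
sealed class LazyTaskCache<T>
{
    readonly ConcurrentDictionary<string, Lazy<Task<T>>> entries = new();

    public Task<T> GetOrAddAsync(string key, Func<Task<T>> factory)
    {
        // GetOrAdd may construct several Lazy wrappers under contention, but only
        // one of them is stored, and only its Value getter invokes the factory.
        var lazy = entries.GetOrAdd(key, _ => new Lazy<Task<T>>(factory));

        // The synchronous part of the factory runs while Lazy holds its internal
        // lock, which is where the deadlock risk mentioned above comes from.
        return lazy.Value;
    }
}
```

With the default Lazy<T> thread-safety mode the factory runs at most once per key, but a faulted task stays cached until the entry is removed, which is one of the things the nested-task version handles explicitly.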
Source: https://stackoverflow.com/questions/65640644/stop-reentrancy-on-memorycache-calls