Task caching when performing Tasks in parallel with WhenAll


Question


So I have this small code block that will perform several Tasks in parallel.

// no wrapping in Task, it is async
var activityList = await dataService.GetActivitiesAsync();

// Select a good enough tuple
var results = (from activity in activityList
               select new { 
                Activity = activity, 
                AthleteTask = dataService.GetAthleteAsync(activity.AthleteID)
               }).ToList(); // begin enumeration

// Wait for them to finish, ie relinquish control of the thread
await Task.WhenAll(results.Select(t => t.AthleteTask));

// Set the athletes
foreach(var pair in results)
{
  pair.Activity.Athlete = pair.AthleteTask.Result;
}

So I'm downloading Athlete data for each given Activity. But it could be that we are requesting the same athlete several times. How can we ensure that the GetAthleteAsync method will only go online to fetch the actual data if it's not yet in our memory cache?

Currently I am trying to use a ConcurrentDictionary<int, Athlete> inside the GetAthleteAsync method:

private async Task<Athlete> GetAthleteAsync(int athleteID)
{
    if (cacheAthletes.ContainsKey(athleteID))
        return cacheAthletes[athleteID];

    // ... else fetch from web
}

Answer 1:


You can change your ConcurrentDictionary to cache the Task<Athlete> instead of just the Athlete. Remember, a Task<T> is a promise - an operation that will eventually result in a T. So, you can cache operations instead of results.

ConcurrentDictionary<int, Task<Athlete>> cacheAthletes;

Then, your logic will go like this: if the operation is already in the cache, return the cached task immediately (synchronously). If it's not, then start the download, add the download operation to the cache, and return the new download operation. Note that all the "download operation" logic is moved to another method:

private Task<Athlete> GetAthleteAsync(int athleteID)
{
  return cacheAthletes.GetOrAdd(athleteID, id => LoadAthleteAsync(id));
}

private async Task<Athlete> LoadAthleteAsync(int athleteID)
{
  // Load from web
}

This way, multiple parallel requests for the same athlete will get the same Task<Athlete>, and each athlete is only downloaded once.
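For illustration, a minimal usage sketch (the athlete ID 42 is a placeholder) of what that guarantee looks like from the calling side:

// Repeated requests for the same ID resolve to the same cached Task<Athlete>
Task<Athlete> first  = GetAthleteAsync(42);
Task<Athlete> second = GetAthleteAsync(42);

Console.WriteLine(ReferenceEquals(first, second)); // True: both callers observe the same cached task

await Task.WhenAll(first, second);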




Answer 2:


You also need to evict tasks that completed unsuccessfully, so a failed result is not cached. Here's my snippet:

// Requires System.Runtime.Caching for ObjectCache/MemoryCache
ObjectCache _cache = MemoryCache.Default;
static object _lockObject = new object();

public Task<T> GetAsync<T>(string cacheKey, Func<Task<T>> func, TimeSpan? cacheExpiration = null) where T : class
{
    var task = (Task<T>)_cache[cacheKey];
    if (task != null) return task;
    lock (_lockObject)
    {
        task = (Task<T>)_cache[cacheKey];
        if (task != null) return task;
        task = func();
        Set(cacheKey, task, cacheExpiration);
        // Evict the entry if the task faulted or was cancelled,
        // so a failed result is not served from the cache
        task.ContinueWith(t => {
            if (t.Status != TaskStatus.RanToCompletion)
                _cache.Remove(cacheKey);
        });
    }
    return task;
}

// Assumed helper (not shown in the original answer): stores the task with the optional expiration
private void Set(string cacheKey, object value, TimeSpan? cacheExpiration)
{
    var policy = new CacheItemPolicy();
    if (cacheExpiration.HasValue)
        policy.AbsoluteExpiration = DateTimeOffset.Now.Add(cacheExpiration.Value);
    _cache.Set(cacheKey, value, policy);
}
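
For example, the athlete download from the question could be routed through this helper roughly like so (the cache-key format and GetAthleteFromWebAsync are placeholder names, not part of the original answer):

// Identical keys within the expiration window reuse the same cached Task<Athlete>
Task<Athlete> athleteTask = GetAsync(
    "athlete:" + athleteID,
    () => GetAthleteFromWebAsync(athleteID),
    TimeSpan.FromMinutes(30));

Athlete athlete = await athleteTask;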



Answer 3:


When caching values provided by Task objects, you want to make sure the cache implementation ensures that:

  • No parallel or unnecessary operations to get a value will be started. In your case, this is your question about avoiding multiple GetAthleteAsync calls for the same id.
  • You don't get negative caching (i.e. caching failed results), or if you do want it, it needs to be an implementation decision and you need to handle eventually replacing failed results somehow.
  • Cache users can't get invalidated results from the cache, even if the value is invalidated during an await.

I have a blog post about caching Task objects, with example code that ensures all of the points above; it could be useful in your situation. Basically my solution is to store Lazy<Task<T>> objects in a MemoryCache.
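
For reference, a minimal sketch of that approach under assumed names (not the blog post's actual code): the Lazy wrapper makes sure the download delegate runs at most once per key, and AddOrGetExisting returns whichever Lazy won the race.

// Requires System.Runtime.Caching and System.Threading
private readonly MemoryCache _athleteCache = MemoryCache.Default;

public Task<Athlete> GetAthleteCachedAsync(int athleteID)
{
    var lazy = new Lazy<Task<Athlete>>(
        () => LoadAthleteAsync(athleteID),             // the actual web call
        LazyThreadSafetyMode.ExecutionAndPublication); // factory runs at most once

    var policy = new CacheItemPolicy
    {
        AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30)
    };

    // Returns the Lazy already in the cache, or null if ours was just inserted
    var cached = (Lazy<Task<Athlete>>)_athleteCache
        .AddOrGetExisting("athlete:" + athleteID, lazy, policy);

    return (cached ?? lazy).Value;
}

Note that this sketch does not evict failed tasks (the second point above); the blog post's implementation handles that as well.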



Source: https://stackoverflow.com/questions/25507059/task-caching-when-performing-tasks-in-parallel-with-whenall
