How to await the results of an IAsyncEnumerable<Task<T>>, with a specific level of concurrency

此生再无相见时 submitted on 2020-03-21 06:51:02

Question


I have an asynchronous stream of tasks that is generated by applying an async lambda to a stream of items:

IAsyncEnumerable<int> streamOfItems = AsyncEnumerable.Range(1, 10);
IAsyncEnumerable<Task<string>> streamOfTasks = streamOfItems.Select(async x =>
{
    await Task.Delay(100);
    return x.ToString();
});

The methods AsyncEnumerable.Range and Select above are provided by the System.Linq.Async package.

The result I want is a stream of results, expressed as an IAsyncEnumerable<string>. The results must be streamed in the same order as the originating tasks. The enumeration of the stream must also be throttled, so that no more than a specified number of tasks are active at any given time.

I would like a solution in the form of an extension method on the IAsyncEnumerable<Task<T>> type, so that I could chain it multiple times and form a processing pipeline, similar in functionality to a TPL Dataflow pipeline, but expressed fluently. Below is the signature of the desired extension method:

public async static IAsyncEnumerable<TResult> AwaitResults<TResult>(
    this IAsyncEnumerable<Task<TResult>> source,
    int concurrencyLevel);

Accepting a CancellationToken as an argument would also be a nice feature.


Update: For completeness I am including an example of a fluent processing pipeline formed by chaining the AwaitResults method twice. This pipeline starts with a PLINQ block, just to demonstrate that mixing PLINQ and Linq.Async is possible.

int[] results = await Enumerable.Range(1, 20)
    .AsParallel()
    .AsOrdered()
    .WithDegreeOfParallelism(2)
    .WithMergeOptions(ParallelMergeOptions.NotBuffered)
    .Select(x =>
    {
        Thread.Sleep(100); // Simulate some CPU-bound operation
        return x;
    })
    .ToAsyncEnumerable()
    .Select(async x =>
    {
        await Task.Delay(300); // Simulate some I/O operation
        return x;
    })
    .AwaitResults(concurrencyLevel: 5)
    .Select(x => Task.Run(() =>
    {
        Thread.Sleep(100); // Simulate another CPU-bound operation
        return x;
    }))
    .AwaitResults(concurrencyLevel: 2)
    .ToArrayAsync();
Console.WriteLine($"Results: {String.Join(", ", results)}");

Expected output:

Results: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20


Answer 1:


Here is my implementation of the AwaitResults method. It is based on a SemaphoreSlim for controlling the concurrency level, and on a Channel<Task<TResult>> that is used as an async queue. The enumeration of the source IAsyncEnumerable<Task<TResult>> happens inside a fire-and-forget task (the feeder), that pushes the hot tasks to the channel. It also attaches a continuation to each task, where the semaphore is released.

The last part of the method is the yielding loop, where the tasks are dequeued from the channel one by one, and then awaited sequentially. This way the results are yielded in the same order as the tasks in the source stream.
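The ordering guarantee described above can be illustrated in isolation. The following is a minimal sketch, not part of the answer's implementation: the OrderedAwaitDemo class and the DelayedValueAsync helper are hypothetical names, and a plain Queue<Task<int>> stands in for the channel.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OrderedAwaitDemo
{
    // Hypothetical helper: completes after delayMs and returns the value.
    private static async Task<int> DelayedValueAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    public static async Task<List<int>> RunAsync()
    {
        // All three tasks are started (hot) before any of them is awaited.
        var queue = new Queue<Task<int>>();
        queue.Enqueue(DelayedValueAsync(1, 300)); // completes last
        queue.Enqueue(DelayedValueAsync(2, 100)); // completes first
        queue.Enqueue(DelayedValueAsync(3, 200));

        var results = new List<int>();
        while (queue.Count > 0)
        {
            // Awaiting in dequeue order yields the results in enqueue order,
            // regardless of the order in which the tasks complete.
            results.Add(await queue.Dequeue());
        }
        return results; // contains 1, 2, 3
    }
}
```

The tasks still run concurrently; only the consumption is sequential, so a slow task at the head of the queue delays the results behind it.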

This implementation requires that each task is awaited twice, which means that it couldn't be used for a source of type IAsyncEnumerable<ValueTask<TResult>>, since a ValueTask can only be awaited once.
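One possible way around this limitation is to convert each ValueTask<TResult> to a Task<TResult> before the stream reaches AwaitResults, since a Task can be awaited any number of times. The sketch below assumes this is acceptable for the source in question; the AsTasks method and its containing class are hypothetical names, not part of any library.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class ValueTaskStreamExtensions
{
    // Hypothetical adapter: consumes each ValueTask<TResult> exactly once
    // by converting it to a Task<TResult>, which tolerates multiple awaits.
    public static async IAsyncEnumerable<Task<TResult>> AsTasks<TResult>(
        this IAsyncEnumerable<ValueTask<TResult>> source,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await foreach (var valueTask in source
            .WithCancellation(cancellationToken).ConfigureAwait(false))
        {
            yield return valueTask.AsTask();
        }
    }
}
```

With such an adapter in place, a stream of ValueTask<TResult> could be chained as source.AsTasks().AwaitResults(concurrencyLevel: 5), at the cost of a possible Task allocation per item.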

public async static IAsyncEnumerable<TResult> AwaitResults<TResult>(
    this IAsyncEnumerable<Task<TResult>> source,
    int concurrencyLevel = 1,
    [EnumeratorCancellation]CancellationToken cancellationToken = default)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (concurrencyLevel < 1)
        throw new ArgumentOutOfRangeException(nameof(concurrencyLevel));

    // The first task is started before any slot is acquired, hence the -1.
    var semaphore = new SemaphoreSlim(concurrencyLevel - 1);
    var channelCapacity = Math.Max(1000, concurrencyLevel * 10);
    var tasksChannel = Channel.CreateBounded<Task<TResult>>(channelCapacity);
    var completionCts = new CancellationTokenSource();

    // Feeder task: fire and forget
    _ = Task.Run(async () =>
    {
        try
        {
            using var linkedTokenSource = CancellationTokenSource
                .CreateLinkedTokenSource(cancellationToken, completionCts.Token);
            await foreach (var task in source
                .WithCancellation(linkedTokenSource.Token).ConfigureAwait(false))
            {
                HandleTaskCompletion(task);
                await tasksChannel.Writer.WriteAsync(task, linkedTokenSource.Token)
                    .ConfigureAwait(false);
                await semaphore.WaitAsync(linkedTokenSource.Token)
                    .ConfigureAwait(false); // Acquire before MoveNextAsync
            }
            tasksChannel.Writer.Complete();
        }
        catch (Exception ex)
        {
            tasksChannel.Writer.Complete(ex);
        }
    });

    async void HandleTaskCompletion(Task task)
    {
        try
        {
            await task.ConfigureAwait(false);
        }
        catch
        {
            // Ignore exceptions here
        }
        finally
        {
            semaphore.Release();
        }
    }

    try
    {
        while (await tasksChannel.Reader.WaitToReadAsync(cancellationToken)
            .ConfigureAwait(false))
        {
            while (tasksChannel.Reader.TryRead(out var task))
            {
                yield return await task.ConfigureAwait(false);
            }
        }
    }
    finally // Happens when the caller disposes the output enumerator
    {
        completionCts.Cancel();
    }
}

An important detail is the try-finally block around the final yielding loop. It is required for the case where the caller of the method prematurely abandons the enumeration of the resulting stream. In that case the enumeration of the source stream should also be terminated, and this termination is propagated backward using a CancellationTokenSource. Without it the feeder task would never complete, the objects would never be garbage collected, and memory would be leaked.
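The mechanism that the finally block relies on can be seen in a small standalone sketch: when a consumer leaves an await foreach loop early, the enumerator is disposed and the finally block of the async iterator runs. The EarlyExitDemo class and its members are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class EarlyExitDemo
{
    public static bool CleanupRan { get; private set; }

    private static async IAsyncEnumerable<int> NumbersAsync()
    {
        try
        {
            for (int i = 1; i <= 10; i++)
            {
                await Task.Yield();
                yield return i;
            }
        }
        finally
        {
            // Runs when the enumerator is disposed, including an early exit.
            CleanupRan = true;
        }
    }

    public static async Task<int> ConsumeTwoAsync()
    {
        int consumed = 0;
        await foreach (var n in NumbersAsync())
        {
            consumed++;
            if (n == 2) break; // abandon the enumeration early
        }
        return consumed; // 2; CleanupRan is true at this point
    }
}
```

A caller that obtains the enumerator manually and never calls DisposeAsync would not trigger the finally block, so the cleanup depends on the enumerator being disposed properly.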

Note: Cancelling the cancellationToken may not cancel the whole operation instantaneously. For maximum responsiveness the same cancellationToken should be used for cancelling the individual tasks.
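As a rough sketch of what the note suggests, the source stream can pass the same token into each unit of work, so that in-flight tasks end promptly on cancellation instead of running to completion. The CancellableSource class, CreateTasks and ProcessAsync are hypothetical names, and the delays merely simulate work.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableSource
{
    // Hypothetical unit of work that observes the token itself.
    private static async Task<string> ProcessAsync(int x, CancellationToken token)
    {
        await Task.Delay(100, token); // ends early if the token is cancelled
        return x.ToString();
    }

    public static async IAsyncEnumerable<Task<string>> CreateTasks(
        int count,
        [EnumeratorCancellation] CancellationToken token = default)
    {
        for (int i = 1; i <= count; i++)
        {
            await Task.Delay(10, token); // simulate fetching the next item
            yield return ProcessAsync(i, token); // hot task bound to the token
        }
    }
}
```

Assuming the AwaitResults method from the answer is in scope, the same token would then be supplied in both places: CancellableSource.CreateTasks(10, cts.Token).AwaitResults(concurrencyLevel: 5, cancellationToken: cts.Token).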



Source: https://stackoverflow.com/questions/60375187/how-to-await-the-results-of-an-iasyncenumerabletaskt-with-a-specific-level
