How to limit the amount of concurrent async I/O operations?

遇见更好的自我 2020-11-22 01:27
// let's say there is a list of 1000+ URLs
string[] urls = { "http://google.com", "http://yahoo.com", ... };

// now let's send HTTP requests to each of these

14 Answers
  • 2020-11-22 02:21

    Old question, new answer. @vitidev had a block of code that was reused almost intact in a project I reviewed. After I discussed it with a few colleagues, one asked, "Why don't you just use the built-in TPL methods?" ActionBlock looks like the winner there: https://msdn.microsoft.com/en-us/library/hh194773(v=vs.110).aspx. I probably won't end up changing any existing code, but I will definitely look to adopt this NuGet package and reuse Mr. Softy's best practice for throttled parallelism.
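
    For reference, a minimal sketch of the ActionBlock approach, assuming the System.Threading.Tasks.Dataflow package is referenced. The names DataflowThrottle and ForEachAsync are made up for illustration and are not part of the library; the work to do per item is passed in as a delegate.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

    // Hypothetical helper, not part of TPL Dataflow itself.
    public static class DataflowThrottle
    {
        // Runs body once per item, with at most maxParallel invocations in flight.
        public static async Task ForEachAsync<T>(
            IEnumerable<T> items, int maxParallel, Func<T, Task> body)
        {
            var options = new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallel
            };

            // The block queues posted items and awaits each async delegate,
            // so an item only counts as finished when its Task completes.
            var block = new ActionBlock<T>(body, options);

            foreach (var item in items)
            {
                block.Post(item);
            }

            // Signal that no more items are coming, then wait for the queue to drain.
            block.Complete();
            await block.Completion;
        }
    }
    ```

    With the question's urls array this might be called as await DataflowThrottle.ForEachAsync(urls, 20, url => client.GetAsync(url)), where client is an HttpClient you create yourself.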

  • 2020-11-22 02:21

    Essentially you're going to want to create an Action or Task for each URL that you want to hit, put them in a List, and then process that list, limiting the number that can be processed in parallel.

    My blog post shows how to do this both with Tasks and with Actions, and provides a sample project you can download and run to see both in action.

    With Actions

    If using Actions, you can use the built-in .NET Parallel.Invoke function. Here we limit it to running at most 20 actions in parallel.

    var listOfActions = new List<Action>();
    foreach (var url in urls)
    {
        var localUrl = url;
        // Note that we only add the Action here; it does not run until Parallel.Invoke calls it.
        listOfActions.Add(() => CallUrl(localUrl));
    }
    
    var options = new ParallelOptions {MaxDegreeOfParallelism = 20};
    Parallel.Invoke(options, listOfActions.ToArray());
    

    With Tasks

    With Tasks there is no built-in function. However, you can use the one that I provide on my blog.

        /// <summary>
        /// Starts the given tasks and waits for them to complete. This will run, at most, the specified number of tasks in parallel.
        /// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
        /// </summary>
        /// <param name="tasksToRun">The tasks to run.</param>
        /// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
        /// <param name="cancellationToken">The cancellation token.</param>
        public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, CancellationToken cancellationToken = new CancellationToken())
        {
            await StartAndWaitAllThrottledAsync(tasksToRun, maxTasksToRunInParallel, -1, cancellationToken);
        }
    
        /// <summary>
        /// Starts the given tasks and waits for them to complete. This will run the specified number of tasks in parallel.
        /// <para>NOTE: If a timeout is reached before the Task completes, another Task may be started, potentially running more than the specified maximum allowed.</para>
        /// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
        /// </summary>
        /// <param name="tasksToRun">The tasks to run.</param>
        /// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
        /// <param name="timeoutInMilliseconds">The maximum milliseconds we should allow the max tasks to run in parallel before allowing another task to start. Specify -1 to wait indefinitely.</param>
        /// <param name="cancellationToken">The cancellation token.</param>
        public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, int timeoutInMilliseconds, CancellationToken cancellationToken = new CancellationToken())
        {
            // Convert to a list of tasks so that we don't enumerate over it multiple times needlessly.
            var tasks = tasksToRun.ToList();
    
            using (var throttler = new SemaphoreSlim(maxTasksToRunInParallel))
            {
                var postTaskTasks = new List<Task>();
    
                // Have each task notify the throttler when it completes so that it decrements the number of tasks currently running.
                tasks.ForEach(t => postTaskTasks.Add(t.ContinueWith(tsk => throttler.Release())));
    
                // Start running each task.
                foreach (var task in tasks)
                {
                    // Increment the number of tasks currently running and wait if too many are running.
                    await throttler.WaitAsync(timeoutInMilliseconds, cancellationToken);
    
                    cancellationToken.ThrowIfCancellationRequested();
                    task.Start();
                }
    
                // Wait for all of the provided tasks to complete.
                // We wait on the list of "post" tasks instead of the original tasks, otherwise there is a potential race condition where the throttler's using block is exited before some Tasks have had their "post" action completed, which references the throttler, resulting in an exception due to accessing a disposed object.
                await Task.WhenAll(postTaskTasks.ToArray());
            }
        }
    

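    Then, to create your list of Tasks and run them with at most 20 executing at a time, you could do this:

    var listOfTasks = new List<Task>();
    foreach (var url in urls)
    {
        var localUrl = url;
        // Note that we create the Task here, but do not start it.
        // The Task constructor takes an Action, so an async lambda would become async void
        // and the Task would complete at the first await. Blocking on the call (assuming
        // CallUrl returns a Task) keeps the Task running until the request has finished.
        listOfTasks.Add(new Task(() => CallUrl(localUrl).GetAwaiter().GetResult()));
    }
    await Tasks.StartAndWaitAllThrottledAsync(listOfTasks, 20);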
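
    One caveat with new Task(...): its constructor takes an Action, so it cannot observe when an async delegate really finishes. For truly asynchronous I/O, a variant that accepts Func<Task> factories may fit better. The following is a minimal sketch under that assumption, not the code from the blog; AsyncThrottle, RunThrottledAsync, and RunOneAsync are hypothetical names.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical helper for illustration.
    public static class AsyncThrottle
    {
        // Invokes each factory, keeping at most maxInParallel operations in flight.
        public static async Task RunThrottledAsync(
            IEnumerable<Func<Task>> taskFactories, int maxInParallel)
        {
            using (var throttler = new SemaphoreSlim(maxInParallel))
            {
                var started = new List<Task>();

                foreach (var factory in taskFactories)
                {
                    // Wait asynchronously for a free slot before starting the next operation.
                    await throttler.WaitAsync();
                    started.Add(RunOneAsync(factory, throttler));
                }

                // Every wrapper releases its slot before completing, so the
                // semaphore is not disposed while it is still in use.
                await Task.WhenAll(started);
            }
        }

        private static async Task RunOneAsync(Func<Task> factory, SemaphoreSlim throttler)
        {
            try
            {
                await factory();
            }
            finally
            {
                // Free the slot whether the operation succeeded or failed.
                throttler.Release();
            }
        }
    }
    ```

    Assuming CallUrl returns a Task, each URL would be wrapped as a factory such as () => CallUrl(localUrl), and the resulting list passed to AsyncThrottle.RunThrottledAsync(factories, 20). No thread is blocked while a request is pending.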
    