TPL architectural question

甜味超标 2021-02-06 09:48

I'm currently working on a project where we have the challenge of processing items in parallel. So far not a big deal ;) Now to the problem. We have a list of IDs, where we perio

2 answers
  •  梦谈多话
    2021-02-06 10:45

    I don't think you actually need to get down and dirty with direct TPL Tasks for this. For starters I would set up a BlockingCollection around a ConcurrentQueue (the default) with no BoundedCapacity set on the BlockingCollection to store the IDs that need to be processed.

    // Set up the blocking collection somewhere when your process starts up (OnStart for a Windows service)
    BlockingCollection<string> idsToProcess = new BlockingCollection<string>();
    

    From there I would just use Parallel::ForEach on the enumeration returned from BlockingCollection::GetConsumingEnumerable. In the ForEach call you set ParallelOptions::MaxDegreeOfParallelism, and inside the body of the ForEach you execute your stored procedure.

    Now, once the stored procedure execution completes, you're saying you don't want to re-schedule the execution for at least two seconds. No problem: schedule a System.Threading.Timer whose callback simply adds the ID back to the BlockingCollection.

    Parallel.ForEach(
        idsToProcess.GetConsumingEnumerable(),
        new ParallelOptions 
        { 
            MaxDegreeOfParallelism = 4 // read this from config
        },
        (id) =>
        {
           // ... execute sproc ...
    
           // Need to declare/assign this before the delegate so that we can dispose of it inside 
           Timer timer = null;
    
           timer = new Timer(
               _ =>
               {
                   // Add the id back to the collection so it will be processed again
                   idsToProcess.Add(id);
    
                   // Cleanup the timer
                   timer.Dispose();
               },
               null, // no state, the id we need is "captured" in the anonymous delegate
               2000, // probably should read this from config
               Timeout.Infinite);
        });
    

    Finally, when the process is shutting down you would call BlockingCollection::CompleteAdding so that the enumerable being processed will stop blocking and complete, and the Parallel::ForEach will exit. If this were a Windows service, for example, you would do this in OnStop.

    // When ready to shutdown you just signal you're done adding
    idsToProcess.CompleteAdding();
    

    Update

    You raised a valid concern in your comment that you might be processing a large number of IDs at any given point and fear that there would be too much overhead in a timer per ID. I would absolutely agree with that. So in the case that you are dealing with a large list of IDs concurrently, I would change from using a timer-per-ID to using another queue to hold the "sleeping" IDs, which is monitored by a single short-interval timer instead. First you'll need a ConcurrentQueue onto which to place the IDs that are asleep:

    ConcurrentQueue<Tuple<string, DateTime>> sleepingIds = new ConcurrentQueue<Tuple<string, DateTime>>();
    

    Now, I'm using a two-part Tuple here for illustration purposes, but you may want to create a more strongly typed struct for it (or at least alias it with a using statement) for better readability. The tuple has the id and a DateTime which represents when it was put on the queue.
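    As an illustration of the "more strongly typed struct" option, here is a minimal sketch. The type name SleepingId, its member names, and the IsDue helper are all hypothetical and not part of the original code. The lighter alternative mentioned above would be an alias such as `using SleepingEntry = System.Tuple<string, System.DateTime>;` at the top of the file.

```csharp
using System;

// Hypothetical replacement for Tuple<string, DateTime>; names are illustrative only.
public readonly struct SleepingId
{
    public SleepingId(string id, DateTime sleptAtUtc)
    {
        Id = id;
        SleptAtUtc = sleptAtUtc;
    }

    // The id that was just processed.
    public string Id { get; }

    // When the id was put on the sleeping queue (UTC).
    public DateTime SleptAtUtc { get; }

    // True once the entry has been asleep for at least minSleep.
    public bool IsDue(DateTime utcNow, TimeSpan minSleep) => utcNow - SleptAtUtc >= minSleep;
}
```

    With a type like this the queue would be declared as ConcurrentQueue<SleepingId>, and the code reads entry.Id and entry.SleptAtUtc instead of Item1 and Item2.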

    Now you'll also want to set up the timer that will monitor this queue:

    Timer wakeSleepingIdsTimer = new Timer(
       _ =>
       {
           DateTime utcNow = DateTime.UtcNow;
    
           // Remove all items from the sleeping queue that have been there for at least 2 seconds.
           // Entries are enqueued in time order, so stop at the first one that is still too young.
           Tuple<string, DateTime> entry;
           while(sleepingIds.TryPeek(out entry) && (utcNow - entry.Item2).TotalSeconds >= 2)
           {
               // Add this id back to the processing queue
               if(sleepingIds.TryDequeue(out entry))
               {
                   idsToProcess.Add(entry.Item1);
               }
           }
       },
       null, // no state
       0, // start right away (Timeout.Infinite here would mean the timer never fires)
       100 // wake up every 100ms, probably should read this from config
     );
    

    Then you would simply change the Parallel::ForEach to do the following instead of setting up a timer for each one:

    (id) =>
    {
           // ... execute sproc ...
    
           sleepingIds.Enqueue(Tuple.Create(id, DateTime.UtcNow)); 
    }
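    The wake-up step run by the monitoring timer can also be pulled into a small method that takes the current time as an argument, which makes it easy to check without waiting on a real timer. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and it relies on entries being enqueued in time order so that it can stop at the first entry that is not yet due.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper; not part of the original answer.
public static class SleepingQueue
{
    // Moves every id that has slept for at least minSleep from sleepingIds
    // back to idsToProcess and returns how many ids were woken.
    public static int WakeDue(
        ConcurrentQueue<Tuple<string, DateTime>> sleepingIds,
        BlockingCollection<string> idsToProcess,
        DateTime utcNow,
        TimeSpan minSleep)
    {
        int woken = 0;
        Tuple<string, DateTime> entry;

        // Peek first so that an entry that is still too young stays at the head of the queue.
        while (sleepingIds.TryPeek(out entry) && utcNow - entry.Item2 >= minSleep)
        {
            // Another caller may have emptied the queue between the peek and the dequeue.
            if (!sleepingIds.TryDequeue(out entry)) break;

            idsToProcess.Add(entry.Item1);
            woken++;
        }

        return woken;
    }
}
```

    The timer callback would then reduce to a single call such as SleepingQueue.WakeDue(sleepingIds, idsToProcess, DateTime.UtcNow, TimeSpan.FromSeconds(2)).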
    
