Question
As the title says, I have the following function:
    public async IAsyncEnumerable<Job> GetByPipeline(int pipelineId,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await foreach (var job in context.Jobs.Where(job => job.Pipeline.Id == pipelineId)
            .AsAsyncEnumerable()
            .WithCancellation(cancellationToken)
            .ConfigureAwait(false))
        {
            yield return job;
        }
    }
I have trouble wrapping my head around where the cancellation token is going, and a nagging feeling that I am using it in too many places.
What is actually happening here when you deconstruct all the fancy async stuff? And are there any better ways to write this function?
Answer 1:
For starters, this method could be reduced to:
    public IAsyncEnumerable<Job> GetByPipeline(int pipelineId)
    {
        return context.Jobs
            .Where(job => job.Pipeline.Id == pipelineId)
            .AsAsyncEnumerable();
    }
or even
    public IAsyncEnumerable<Job> GetByPipeline(int pipelineId)
        => context.Jobs
            .Where(job => job.Pipeline.Id == pipelineId)
            .AsAsyncEnumerable();
The method doesn't do anything with job, so it doesn't need to iterate over it.
Cancellation
What if the method actually used job? Where should the cancellation token be used?
Let's clean up the method a bit. The equivalent is:
    public async IAsyncEnumerable<Job> GetByPipeline(
        int pipelineId,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        //Just a query, doesn't execute anything
        var query = context.Jobs.Where(job => job.Pipeline.Id == pipelineId);
        //Executes the query and returns the *results* as soon as they arrive in an async stream
        var jobStream = query.AsAsyncEnumerable();
        //Process the results from the async stream as they arrive
        await foreach (var job in jobStream.WithCancellation(ct).ConfigureAwait(false))
        {
            //Does *that* need cancelling?
            DoSometingExpensive(job);
            //An async iterator must yield its items, otherwise it doesn't compile
            yield return job;
        }
    }
The IQueryable query doesn't run anything; it only represents the query, so it doesn't need cancellation.
AsAsyncEnumerable(), AsEnumerable(), ToList() etc. are what execute the query and return some result. ToList() etc. consume all the results, while the As...Enumerable() methods produce results only when requested. The query itself can't be cancelled, and the As...Enumerable() methods won't return anything unless asked for it, so they don't need cancellation.
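The "only when requested" behaviour can be seen without a database. The following is a minimal sketch, not Entity Framework code: Query is a hypothetical async iterator standing in for the async stream, and the Started flag exists only to observe when its body begins to run.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DeferredDemo
{
    // Set when the iterator body actually starts executing.
    public static bool Started;

    // Hypothetical stand-in for an async stream of query results.
    static async IAsyncEnumerable<int> Query()
    {
        Started = true;
        await Task.Yield();
        yield return 1;
    }

    // Returns the flag's value before and after the stream is enumerated.
    public static async Task<(bool Before, bool After)> Run()
    {
        Started = false;
        var stream = Query();        // nothing has run yet
        var before = Started;
        await foreach (var item in stream)
        {
            // enumeration is what drives the iterator body
        }
        return (before, Started);
    }
}
```

Calling Query() only creates the stream; the body runs once await foreach asks for the first item, which is why there is nothing to cancel before that point.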
await foreach will iterate over the entire async stream, so if we want to be able to abort it, we do need to pass the cancellation token.
Finally, does DoSometingExpensive(job); need cancellation? Is it so expensive that we want to be able to break out of it if it takes too long? Or can we wait until it's finished before exiting the loop? If it needs cancellation, it will need the CancellationToken too.
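As a rough illustration of aborting an await foreach loop, here is a small self-contained sketch. Numbers is a hypothetical endless stream, not part of the original code; it checks the token before producing each item, and the consumer cancels after a fixed number of items.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Hypothetical endless stream that observes the token before each item.
    static async IAsyncEnumerable<int> Numbers(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        for (var i = 0; ; i++)
        {
            ct.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return i;
        }
    }

    // Consumes the stream and cancels once cancelAfter items have been seen.
    public static async Task<int> CountUntilCancelled(int cancelAfter)
    {
        using var cts = new CancellationTokenSource();
        var seen = 0;
        try
        {
            await foreach (var n in Numbers().WithCancellation(cts.Token))
            {
                seen++;
                if (seen == cancelAfter) cts.Cancel();
            }
        }
        catch (OperationCanceledException)
        {
            // Expected: the stream stopped because the token was cancelled.
        }
        return seen;
    }
}
```

Without WithCancellation(cts.Token) the stream would never see the token and this loop would not end. An expensive step inside the loop body would have to receive the same token explicitly if it should be interruptible as well.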
ConfigureAwait
Finally, ConfigureAwait(false) isn't involved in cancellation, and may not be needed at all. Without it, execution returns to the original synchronization context after each await. In a desktop application, that means the UI thread, which is what allows us to modify the UI in an async event handler.
If GetByPipeline runs in a desktop app and wants to modify the UI, it has to drop ConfigureAwait(false):
    await foreach (var job in jobStream.WithCancellation(ct))
    {
        //Update the UI
        toolStripProgressBar.Increment(1);
        toolStripStatusLabel.Text = job.Name;
        //Do the actual job
        DoSometingExpensive(job);
    }
With ConfigureAwait(false), execution continues on a threadpool thread and we can't touch the UI.
Library code shouldn't affect how execution resumes, so most libraries use ConfigureAwait(false) and leave the final decision to the UI developer.
If GetByPipeline is a library method, do use ConfigureAwait(false).
Answer 2:
Imagine that somewhere deep inside Entity Framework there is a method GetJobs that retrieves the Job objects from the database:
    private static async IAsyncEnumerable<Job> GetJobs(DbDataReader dataReader,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        while (await dataReader.ReadAsync(cancellationToken))
        {
            yield return new Job()
            {
                Id = (int)dataReader["Id"],
                Data = (byte[])dataReader["Data"]
            };
        }
    }
Now imagine that the Data property contains a huge byte array with data associated with the Job. Retrieving the array of each Job may take a non-trivial amount of time. In this case breaking the loop between iterations would not be enough, because there would be a noticeable delay between invoking the Cancel method and the raising of the OperationCanceledException. This is why the method DbDataReader.ReadAsync needs a CancellationToken, so that the query can be canceled instantly.
The challenge now is how to pass the CancellationToken supplied by the client code to the GetJobs method, when a property like context.Jobs is along the way. The solution is the WithCancellation extension method, which stores the token and passes it deeper, to a method accepting an argument decorated with the EnumeratorCancellation attribute.
So in your case you have done everything correctly. You have included a cancellationToken argument in your IAsyncEnumerable-returning method, which is the recommended practice. This way a subsequent WithCancellation chained to your GetByPipeline method will not be wasted. Then you chained WithCancellation after AsAsyncEnumerable inside your method, which is also correct. Otherwise the CancellationToken would not reach its final destination, the GetJobs method.
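The token flow described above can be checked with a small experiment. This is a sketch under stated assumptions: Inner and Outer are hypothetical stand-ins for GetJobs and GetByPipeline, and Inner does nothing except report whether the token it received is the caller's token.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class TokenFlowDemo
{
    // Stand-in for the innermost method: yields true if the token it
    // received equals the token the client created.
    static async IAsyncEnumerable<bool> Inner(
        CancellationToken expected,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        await Task.Yield();
        yield return ct == expected;
    }

    // Stand-in for GetByPipeline: forwards its token only when asked to.
    static async IAsyncEnumerable<bool> Outer(
        CancellationToken expected,
        bool forward,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        var inner = Inner(expected);
        if (forward)
        {
            await foreach (var same in inner.WithCancellation(ct).ConfigureAwait(false))
                yield return same;
        }
        else
        {
            await foreach (var same in inner.ConfigureAwait(false))
                yield return same;
        }
    }

    // Client code: attaches its token to the outer stream with WithCancellation.
    public static async Task<bool> TokenReachesInner(bool forward)
    {
        using var cts = new CancellationTokenSource();
        await foreach (var same in Outer(cts.Token, forward).WithCancellation(cts.Token))
            return same;
        return false;
    }
}
```

With forward set to true the inner iterator receives the client's token; with forward set to false it receives the default token, so a cancellation request from the client would never reach it.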
Source: https://stackoverflow.com/questions/58757843/iterating-an-iasyncenumerable-in-a-function-returning-an-iasyncenumerable-with-c