Question:
I am developing a Dataflow pipeline which reads a collection of files and, for each line in each file, performs a series of Dataflow blocks.
After all steps have completed for every line in a file, I want to execute further blocks on the file itself, but I don't see how that is possible.
It is straightforward to split processing via a TransformManyBlock, but how can one then consolidate?
I am used to Apache Camel's Splitter and Aggregator functionality - or is there a fundamental disconnect between Dataflow's intent and my desired usage?
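For context, a rough, hypothetical sketch of the split side only (the block names and "input.txt" are mine, just for illustration); the open question is how to run further blocks on the file itself once all of its lines have been processed:

```csharp
// Requires the System.Threading.Tasks.Dataflow NuGet package.
using System;
using System.IO;
using System.Threading.Tasks.Dataflow;

// One file path in, many lines out.
var splitIntoLines = new TransformManyBlock<string, string>(
    path => File.ReadLines(path));

// Placeholder per-line work.
var processLine = new TransformBlock<string, string>(
    line => line.ToUpperInvariant());

splitIntoLines.LinkTo(processLine,
    new DataflowLinkOptions { PropagateCompletion = true });

var sink = new ActionBlock<string>(line => Console.WriteLine(line));
processLine.LinkTo(sink,
    new DataflowLinkOptions { PropagateCompletion = true });

splitIntoLines.Post("input.txt");
splitIntoLines.Complete();
await sink.Completion;

// ...and here I would like to execute further blocks against the file
// itself, after all of its lines have gone through the pipeline.
```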
Answer 1:
You should probably look into JoinBlock and BatchedJoinBlock. Both of them can join two or three sources, and you can set up a filter on the links so that only specific items are gathered.
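Here is a minimal, self-contained sketch of JoinBlock<T1, T2>, assuming two hypothetical sources ("lineResults" and "fileNames") rather than your exact blocks; filtering would be done with the predicate overload of LinkTo if only certain items should reach a given target:

```csharp
// Requires the System.Threading.Tasks.Dataflow NuGet package.
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

var lineResults = new BufferBlock<string>();
var fileNames   = new BufferBlock<string>();

// JoinBlock<T1, T2> emits a Tuple<T1, T2> once it has received
// one item on each of its two targets.
var join = new JoinBlock<string, string>();

lineResults.LinkTo(join.Target1);
fileNames.LinkTo(join.Target2);

// Complete the join only after both sources have completed.
_ = Task.WhenAll(lineResults.Completion, fileNames.Completion)
        .ContinueWith(t => join.Complete());

// Consume the paired results.
var print = new ActionBlock<Tuple<string, string>>(pair =>
    Console.WriteLine($"{pair.Item2}: {pair.Item1}"));
join.LinkTo(print, new DataflowLinkOptions { PropagateCompletion = true });

lineResults.Post("processed line");
fileNames.Post("input.txt");

lineResults.Complete();
fileNames.Complete();
await print.Completion;
```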
Some useful links for you:
- How to: Use JoinBlock to Read Data From Multiple Sources
- JoinBlock<T1, T2> Class
- JoinBlock<T1, T2, T3> Class
- BatchedJoinBlock<T1, T2> Class
- BatchedJoinBlock<T1, T2, T3> Class
Source: https://stackoverflow.com/questions/45581714/combining-dataflow-results