Question
I would like to emit a single pane when the watermark reaches x minutes past the end of the window. This lets me ensure I handle some late data, but still only emit one pane. I am currently working in Java.
At the moment I can't find a proper solution to this problem. I could emit a single pane when the watermark reaches the end of the window, but then any late data is dropped. I could emit the pane at the end of the window and then again when I receive late data, but in that case I am not emitting a single pane.
I currently have code similar to this:
.triggering(
// This is going to emit the pane, but I don't want to emit the pane yet!
AfterWatermark.pastEndOfWindow()
// This is going to emit panes each time I receive late data, however
// I would like to only emit one pane at the end of the allowedLateness
).withAllowedLateness(allowedLateness).accumulatingFiredPanes())
In case there is still confusion, I would like to only emit a single pane when the watermark passes the allowedLateness.
Answer 1:
What I would do is, first, set Window.ClosingBehavior to FIRE_ALWAYS. This way, when the window is permanently closed, it will send a final pane (even if there are no late records since the last pane) for which PaneInfo.isLast() returns true.
Then, I would proceed with the second option from the question:
"I could emit the pane at the end of the window and then again when I receive late data, however in this case I am not emitting a single pane."
but discard, downstream, the panes that are not final, with something like:
@ProcessElement
public void processElement(ProcessContext c) {
  // Forward only the final pane of each window; earlier panes are dropped.
  if (c.pane().isLast()) {
    c.output(c.element());
  }
}
Answer 2:
Thanks Guillem. In the end I used your answer to find this very useful link with lots of Apache Beam examples. From this I came up with the following solution:
// We first specify to never emit any panes
.triggering(Never.ever())
// We then specify to fire always when closing the window. This will emit a
// single final pane at the end of allowedLateness
.withAllowedLateness(allowedLateness, Window.ClosingBehavior.FIRE_ALWAYS)
.discardingFiredPanes())
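To make the timing concrete, here is a toy model in plain Java of what the configuration above should do for one window. It is not Beam code, and the names SinglePaneModel, Arrival, expiryMillis and finalPane are made up for illustration. It assumes a window whose end is exclusive, and that an element is dropped only if it arrives after the watermark has passed the window's last timestamp plus allowedLateness; treat that cut-off as a simplification of the runner's actual behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (plain Java, not Beam) of one window that fires a single pane at expiry. */
final class SinglePaneModel {

    /** Hypothetical record of one element and the watermark position when it arrived. */
    static final class Arrival {
        final String value;
        final long watermarkAtArrivalMillis;

        Arrival(String value, long watermarkAtArrivalMillis) {
            this.value = value;
            this.watermarkAtArrivalMillis = watermarkAtArrivalMillis;
        }
    }

    /** Watermark position at which the window expires: last timestamp in the window plus allowed lateness. */
    static long expiryMillis(long windowEndMillis, long allowedLatenessMillis) {
        long maxTimestamp = windowEndMillis - 1; // assumption: the window end is exclusive
        return maxTimestamp + allowedLatenessMillis;
    }

    /** Contents of the single final pane: every element that arrived before the window expired. */
    static List<String> finalPane(long windowEndMillis, long allowedLatenessMillis, List<Arrival> arrivals) {
        long expiry = expiryMillis(windowEndMillis, allowedLatenessMillis);
        List<String> pane = new ArrayList<>();
        for (Arrival a : arrivals) {
            if (a.watermarkAtArrivalMillis <= expiry) {
                pane.add(a.value); // on time, or late but within allowed lateness
            }
            // anything arriving after expiry is too late and is dropped
        }
        return pane;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long windowEnd = 10 * minute;      // window covers [0, 10 min)
        long allowedLateness = 5 * minute; // the "x minutes" from the question

        List<Arrival> arrivals = new ArrayList<>();
        arrivals.add(new Arrival("on-time", 4 * minute));
        arrivals.add(new Arrival("late", 12 * minute));
        arrivals.add(new Arrival("too-late", 16 * minute));

        // prints [on-time, late]
        System.out.println(finalPane(windowEnd, allowedLateness, arrivals));
    }
}
```

Because nothing fires before expiry, the one pane carries both the on-time and the late element, so discarding versus accumulating mode makes no difference to its contents in this setup.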
Source: https://stackoverflow.com/questions/55954482/apache-beam-windowing-consider-late-data-but-emit-only-one-pane