Windowing with Apache Beam - Fixed Windows Don't Seem to be Closing?

慢半拍i 2020-12-31 13:14

We are attempting to use fixed windows in an Apache Beam pipeline (using DirectRunner). Our flow is as follows:

  1. Pull data from Pub/Sub
  2. De
1 Answer
  •  隐瞒了意图╮
    2020-12-31 13:47

    Looks like the main issue was indeed a missing trigger - the window was opening and there was nothing telling it when to emit results. We wanted to simply window based on processing time (not event time) and so did the following:

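    // Note: depending on the input type, Java may need an explicit type
    // witness here, e.g. Window.<String>into(...) for a PCollection<String>.
    .apply("Window", Window
        .into(new GlobalWindows())
        // Fire repeatedly, 5 seconds of processing time after the first
        // element of each pane arrives.
        .triggering(Repeatedly
            .forever(AfterProcessingTime
                .pastFirstElementInPane()
                .plusDelayOf(Duration.standardSeconds(5))
            )
        )
        // Zero allowed lateness: late data is dropped.
        .withAllowedLateness(Duration.ZERO)
        // Each pane holds only the elements received since the last firing.
        .discardingFiredPanes()
    )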
    

    Essentially this puts everything into the single global window and attaches a trigger that emits a pane of results 5 seconds after the first element of that pane is processed. The global window itself never closes; each time the trigger fires, the buffered elements are emitted and discarded, and the 5-second timer starts again when the next element arrives. Beam complained when we didn't have the withAllowedLateness piece; as far as I know, Duration.ZERO just tells it to drop any late data.
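
To make the firing pattern concrete, here is a small library-free sketch in plain Java. It is not Beam code and does not use any Beam API: the class name PaneTriggerSketch and the method simulate are hypothetical, and the model assumes arrival times are given in ascending processing-time milliseconds. It only imitates what the trigger above appears to do: start a timer at the first element of a pane, emit the pane once the delay has passed, then start fresh with the next element.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "repeatedly, N ms after the first element in the pane" with discarded panes. */
class PaneTriggerSketch {

    /**
     * Groups arrival timestamps (processing time, ms, ascending) into panes.
     * Simplification: an element arriving exactly at the firing time goes into the next pane.
     */
    static List<List<Long>> simulate(long[] arrivalsMs, long delayMs) {
        List<List<Long>> panes = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long fireAt = 0;
        for (long t : arrivalsMs) {
            // The timer for the current pane expired before this element arrived.
            if (!current.isEmpty() && t >= fireAt) {
                panes.add(current);
                current = new ArrayList<>(); // discard fired pane contents
            }
            // The first element of a pane starts that pane's timer.
            if (current.isEmpty()) {
                fireAt = t + delayMs;
            }
            current.add(t);
        }
        // The last pane's timer eventually fires as well.
        if (!current.isEmpty()) {
            panes.add(current);
        }
        return panes;
    }

    public static void main(String[] args) {
        long[] arrivals = {0, 1000, 4000, 12000, 13000, 18000};
        // prints [[0, 1000, 4000], [12000, 13000], [18000]]
        System.out.println(simulate(arrivals, 5000));
    }
}
```

Note that the pane boundaries depend on when elements happen to arrive, not on fixed clock intervals, which is the main difference from true fixed windows.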

    My understanding may be a bit off the mark here, but the above snippet has solved our problem!
