How to create a personalised WindowFn in google dataflow

Submitted by 落花浮王杯 on 2019-12-08 12:25:15

Question


I'd like to create a custom WindowFn that assigns windows to my input elements based on another field rather than on each element's timestamp. I know the predefined WindowFns in the Google Dataflow SDK use the timestamp as the criterion for assigning windows.

More specifically, I'd like to create something like SlidingWindows that uses another field, instead of the timestamp, as the window assignment criterion.

How could I create my customised WindowFn? What are the points that I should consider when creating my own WindowFn?

Thanks.


Answer 1:


To create a new WindowFn, extend WindowFn or one of its subclasses and implement its abstract methods.

In your case you don't need window merging, so you can extend NonMergingWindowFn, and your code could look something like this:

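// Imports assume the Dataflow SDK 1.x package layout.
import com.google.cloud.dataflow.sdk.coders.Coder;
import com.google.cloud.dataflow.sdk.transforms.windowing.BoundedWindow;
import com.google.cloud.dataflow.sdk.transforms.windowing.IntervalWindow;
import com.google.cloud.dataflow.sdk.transforms.windowing.NonMergingWindowFn;
import com.google.cloud.dataflow.sdk.transforms.windowing.WindowFn;
import java.util.Collection;

// ElementT is a placeholder for your own element type.
public class MyWindowFn extends NonMergingWindowFn<ElementT, IntervalWindow> {
  @Override
  public Collection<IntervalWindow> assignWindows(AssignContext c) {
    // setOfWindowsElementShouldBeIn is a placeholder for your own logic, which
    // chooses windows from a field of the element instead of c.timestamp().
    return setOfWindowsElementShouldBeIn(c.element());
  }

  @Override
  public boolean isCompatible(WindowFn<?, ?> other) {
    return other instanceof MyWindowFn;
  }

  @Override
  public Coder<IntervalWindow> windowCoder() {
    return IntervalWindow.getCoder();
  }

  @Override
  public IntervalWindow getSideInputWindow(final BoundedWindow window) {
    // If PCollections windowed with this WindowFn are never used as side
    // inputs, just throw. Otherwise you'll need to figure out how to map the
    // main input windows into the windows generated by this WindowFn.
    throw new UnsupportedOperationException("MyWindowFn does not support side inputs");
  }
}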
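
The window-selection step that the answer leaves as a placeholder could be sketched roughly as follows. This is only an illustration in plain Java with no SDK dependency: the class FieldSlidingWindows, its Range type, and the assumption that the chosen field is a long are all hypothetical and not part of the Dataflow SDK. It applies the usual sliding-window arithmetic (windows of a given size starting every period) to a field value instead of a timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Dataflow SDK: computes every sliding
// range [start, start + size) that contains a given numeric field value.
final class FieldSlidingWindows {

    /** Half-open interval [start, end) over the field's value range. */
    public static final class Range {
        public final long start;
        public final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Returns all ranges of length {@code size}, starting at multiples of
     * {@code period}, that contain {@code fieldValue}, latest start first.
     */
    public static List<Range> assign(long fieldValue, long size, long period) {
        if (size <= 0 || period <= 0) {
            throw new IllegalArgumentException("size and period must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        // Latest range start that is <= fieldValue (floorMod handles negatives).
        long lastStart = fieldValue - Math.floorMod(fieldValue, period);
        // Step backwards while the range still covers fieldValue.
        for (long start = lastStart; start > fieldValue - size; start -= period) {
            ranges.add(new Range(start, start + size));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A field value of 25 with size 10 and period 5 falls into two ranges.
        System.out.println(assign(25, 10, 5));
    }
}
```

Inside assignWindows, each such range would still have to be turned into a window object. If the field values can reasonably be read as milliseconds, one option is to build an IntervalWindow from the start and end of each range, but that mapping is an assumption about your data rather than something the answer prescribes.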


Source: https://stackoverflow.com/questions/37897452/how-to-create-a-personalised-windowfn-in-google-dataflow
