Beam/Dataflow 2.2.0 - extract first n elements from PCollection

Posted by 我只是一个虾纸丫 on 2019-12-08 14:30:23

Question


Is there a way to extract the first n elements of a Beam PCollection? The documentation does not seem to mention such a function. I assume the operation would first need to assign a global index to every element and then filter on that index. It would be nice to have this functionality built in.

I am using the Google Dataflow Java SDK 2.2.0.


Answer 1:


PCollections are inherently unordered, so the notion of "first N elements" does not exist. However:

  • If you need the top N elements by some criterion, you can use the Top transform.

  • If you need any N elements, regardless of which ones, you can use the Sample transform.
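
The two options above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the class name TakeNExample, the helper methods largestN and anyN, and the input values are made up for the example, and it assumes the Beam Java SDK plus a runner such as the DirectRunner are on the classpath. Top.largest(n) produces a single List holding the n largest elements, while Sample.any(n) produces up to n arbitrary elements with no guarantee about which ones.

```java
import java.util.List;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Sample;
import org.apache.beam.sdk.transforms.Top;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical example class; the names here are not part of Beam.
final class TakeNExample {

  // "Top N by some criterion": here the criterion is natural ordering.
  // The result is a PCollection with ONE element, a List of the n largest
  // values sorted in decreasing order.
  static PCollection<List<Integer>> largestN(PCollection<Integer> input, int n) {
    return input.apply("LargestN", Top.<Integer>largest(n));
  }

  // "Any N": up to n elements, chosen arbitrarily by the runner.
  static PCollection<Integer> anyN(PCollection<Integer> input, long n) {
    return input.apply("AnyN", Sample.<Integer>any(n));
  }

  public static void main(String[] args) {
    Pipeline pipeline =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Made-up input data for illustration.
    PCollection<Integer> numbers = pipeline.apply(Create.of(5, 3, 9, 1, 7));

    PCollection<List<Integer>> top3 = largestN(numbers, 3); // one List: [9, 7, 5]
    PCollection<Integer> any3 = anyN(numbers, 3);           // 3 arbitrary elements

    pipeline.run().waitUntilFinish();
  }
}
```

For a custom criterion, Top.of(n, comparator) accepts a comparator that must also be Serializable. Neither transform gives "the first N" in input order, because a PCollection has no input order to begin with.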



Source: https://stackoverflow.com/questions/48267159/beam-dataflow-2-2-0-extract-first-n-elements-from-pcollection
