How to check whether a DataStream in Flink is empty or has data

╄→гoц情女王★ submitted on 2021-01-29 10:33:01

Question


I am new to Apache Flink. I have a DataStream that goes through a process function: if certain conditions are met, the record is valid, and if not, I write it to a side output. I am able to print the DataStream. Is it possible to check whether the DataStream is empty or null? I tried datastream.equals(null), but it does not work. Please suggest how to tell whether a DataStream is empty or not.


Answer 1:


By "empty", I assume you mean that no data is flowing. What are you hoping to do in this case?

Flink doesn't have a well-defined notion of an "empty" stream. Streams are always connected to one or more sources, which can be bounded or unbounded. Bounded sources (like files and collections) eventually terminate by reaching their end (at which point they emit a watermark with the value MAX_WATERMARK, which you could watch for with a timer), but in general there is no way of knowing whether an unbounded source (e.g., a Kafka topic) might produce any (more) data.

There are, however, metrics you can observe, such as numRecordsOut or numRecordsOutPerSecond, that will tell you whether any output is being produced. Or your process function could collect information about its own behavior and report it on a side output (rather like what you are already doing).
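
To make the second suggestion concrete, here is a minimal sketch of the bookkeeping such a process function might do. It is plain Java with no Flink dependency, so it only models the idea. The class name ValidationTracker, the validity predicate, and the report format are all made up for illustration; in a real job the counting would live inside processElement, valid records would go to the Collector, and rejected records and reports would go to a side output via ctx.output(...).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/**
 * Flink-free model of a process function that keeps track of what it has seen.
 * ValidationTracker is a hypothetical helper, not part of any Flink API.
 */
class ValidationTracker<T> {
    private final Predicate<T> isValid;                    // assumed validity condition
    private final List<T> mainOutput = new ArrayList<>();  // stands in for the main output
    private final List<T> sideOutput = new ArrayList<>();  // stands in for a side output
    private long validCount;
    private long rejectedCount;

    ValidationTracker(Predicate<T> isValid) {
        this.isValid = isValid;
    }

    /** Called once per record, the way processElement would be. */
    void process(T record) {
        if (isValid.test(record)) {
            validCount++;
            mainOutput.add(record);
        } else {
            rejectedCount++;
            sideOutput.add(record);
        }
    }

    /** True once at least one record, valid or not, has been processed. */
    boolean hasSeenData() {
        return validCount + rejectedCount > 0;
    }

    /** Summary that could be emitted periodically on a side output. */
    String report() {
        return "valid=" + validCount + ", rejected=" + rejectedCount;
    }

    List<T> getMainOutput() {
        return mainOutput;
    }

    List<T> getSideOutput() {
        return sideOutput;
    }
}
```

With a predicate such as n -> n >= 0, feeding in 5, -1 and 7 leaves two records on the main output and one on the side output, and hasSeenData() switches from false to true after the first record. Note that this can only tell you "nothing has arrived so far"; for an unbounded source it still cannot tell you that nothing ever will.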



Source: https://stackoverflow.com/questions/61889706/how-to-check-datastream-in-flink-is-empty-or-having-data
