Question
I am new to Apache Flink. I have a DataStream to which I apply a process function: if certain conditions are met, the record is valid, and if they are not, I write it to a side output. I am able to print the DataStream. Is it possible to check whether the DataStream is empty or null? I tried using datastream.equals(null), but it is not working. Please suggest how to find out whether a DataStream is empty or not.
Answer 1:
By "empty", I assume you mean that no data is flowing. What are you hoping to do in this case?
Flink doesn't have a well-defined notion of an "empty" stream. Streams are always connected to one or more sources, which can be bounded or unbounded. Bounded sources (like files and collections) eventually terminate by reaching their end; at that point they emit a watermark with the value MAX_WATERMARK, which you could watch for with a timer. In general, though, there is no way of knowing whether an unbounded source (e.g., a Kafka topic) might produce any (more) data.
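To make the bounded case concrete, here is a minimal plain-Java sketch of the bookkeeping involved. It is deliberately written without Flink dependencies so it can run standalone, and the class EndOfInputDetector and its methods are hypothetical names, not Flink API. In an actual job the same logic would presumably sit in a KeyedProcessFunction, with the onWatermark call replaced by an event-time timer registered for Long.MAX_VALUE (the timestamp carried by MAX_WATERMARK).

```java
/** Hypothetical stand-in for an operator; not part of the Flink API. */
final class EndOfInputDetector {
    // Timestamp carried by Flink's final watermark (MAX_WATERMARK).
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    private long recordCount = 0;
    private boolean inputFinished = false;

    /** Called once per incoming record. */
    public void onRecord(String record) {
        recordCount++;
    }

    /** Called whenever the watermark advances. */
    public void onWatermark(long timestamp) {
        if (timestamp == MAX_WATERMARK) {
            inputFinished = true;
        }
    }

    /** True only when the bounded input has ended without producing any record. */
    public boolean wasEmpty() {
        return inputFinished && recordCount == 0;
    }

    public static void main(String[] args) {
        EndOfInputDetector detector = new EndOfInputDetector();
        // Simulate a bounded source that ends without emitting records.
        detector.onWatermark(MAX_WATERMARK);
        System.out.println("empty = " + detector.wasEmpty());
    }
}
```

Note that wasEmpty() can only answer once the final watermark has arrived, which is why this approach does not carry over to unbounded sources.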
There are, however, metrics you can observe, such as numRecordsOut or numRecordsOutPerSecond, that will tell you whether any output is being produced. Alternatively, your process function could collect information about its own behavior and report it on a side output (rather like what you are already doing).
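As a rough illustration of the second suggestion, the sketch below shows the kind of bookkeeping a process function could keep: counts of valid and rejected records, plus a simple "nothing has arrived for a while" check. It is plain Java with no Flink dependency, and StreamActivityTracker, its methods, and the 30-second timeout are all assumptions made for the example. In a real job the current time would likely come from the timer service, and the report string would be emitted on a side output.

```java
/** Hypothetical helper; in a Flink job this state would live inside the process function. */
final class StreamActivityTracker {
    private final long idleTimeoutMillis;
    private long validCount = 0;
    private long rejectedCount = 0;
    private long lastSeenMillis = -1; // -1 means nothing has arrived yet

    public StreamActivityTracker(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Record one element and whether it passed validation. */
    public void observe(boolean valid, long nowMillis) {
        if (valid) {
            validCount++;
        } else {
            rejectedCount++;
        }
        lastSeenMillis = nowMillis;
    }

    /** True if no element has ever arrived, or none within the timeout. */
    public boolean isIdle(long nowMillis) {
        return lastSeenMillis < 0 || nowMillis - lastSeenMillis >= idleTimeoutMillis;
    }

    /** Summary line that a real job could emit on a side output. */
    public String report(long nowMillis) {
        return "valid=" + validCount + " rejected=" + rejectedCount
                + " idle=" + isIdle(nowMillis);
    }

    public static void main(String[] args) {
        StreamActivityTracker tracker = new StreamActivityTracker(30_000L);
        tracker.observe(true, 1_000L);
        tracker.observe(false, 2_000L);
        System.out.println(tracker.report(5_000L));  // 3 s after the last element
        System.out.println(tracker.report(40_000L)); // 38 s after the last element
    }
}
```

"Idle" here only means that nothing arrived within the chosen timeout; it says nothing about whether an unbounded source will produce data later.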
Source: https://stackoverflow.com/questions/61889706/how-to-check-datastream-in-flink-is-empty-or-having-data