问题
I am trying to create a JUnit test for a Flink streaming job which writes data to a kafka topic and read data from the same kafka topic using FlinkKafkaProducer09
and FlinkKafkaConsumer09
respectively. I am passing a test data in the produce:
DataStream<String> stream = env.fromElements("tom", "jerry", "bill");
And checking whether same data is coming from the consumer as:
List<String> expected = Arrays.asList("tom", "jerry", "bill");
List<String> result = resultSink.getResult();
assertEquals(expected, result);
using TestListResultSink
.
I am able to see the data coming from the consumer as expected by printing the stream. But could not get the Junit test result as the consumer will keep on running even after the message finished. So it did not come to test part.
Is thre any way in Flink
or FlinkKafkaConsumer09
to stop the process or to run for specific time?
回答1:
The underlying problem is that streaming programs are usually not finite and run indefinitely.
The best way, at least for the moment, is to insert a special control message into your stream which lets the source properly terminate (simply stop reading more data by leaving the reading loop). That way Flink will tell all down-stream operators that they can stop after they have consumed all data.
Alternatively, you can throw a special exception in your source (e.g. after some time) such that you can distinguish a "proper" termination from a failure case (by checking the error cause). Throwing an exception in the source will fail the program.
回答2:
Can you not use isEndOfStream override within the Deserializer to stop fetching from Kafka? If I read correctly, the flink/Kafka09Fetcher has the following code in its run method which breaks the event loop
if (deserializer.isEndOfStream(value)) {
// end of stream signaled
running = false;
break;
}
My thought was to use Till Rohrmann's idea of a control message in conjunction with this isEndOfStream method to tell the KafkaConsumer to stop reading.
Any reason that will not work? Or maybe some corner cases I'm overlooking?
https://github.com/apache/flink/blob/07de86559d64f375d4a2df46d320fc0f5791b562/flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka09Fetcher.java#L146
回答3:
Following @TillRohrman
You can combine the special exception method and handle it in your unit test if you use an EmbeddedKafka instance, and then read off the EmbeddedKafka topic and assert the consumer values.
I found https://github.com/asmaier/mini-kafka/blob/master/src/test/java/de/am/KafkaProducerIT.java to be extremely useful in this regard.
The only problem is that you will lose the element that triggers the exception but you can always adjust your test data to account for that.
来源:https://stackoverflow.com/questions/44441153/how-to-stop-a-flink-streaming-job-from-program