flink-streaming

Flink state backend for TaskManager

Submitted by 三世轮回 on 2019-12-12 03:33:40
Question: I have a Flink v1.2 setup with 1 JobManager and 2 TaskManagers, each in its own VM. I configured the state backend to filesystem and pointed it to a local location on each of the above hosts (state.backend.fs.checkpointdir: file:///home/ubuntu/Prototype/flink/flink-checkpoints). I set parallelism to 1 and each TaskManager has 1 slot. I then run an event processing job on the JobManager, which assigns it to a TaskManager. I kill the TaskManager running the job and after a few
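
The typical fix for this scenario is to point the checkpoint directory at storage that every node can reach (NFS, HDFS, S3, ...): a file:// path that is local to each VM means the state written by the killed TaskManager is unreadable from the node that takes over. A minimal sketch of wiring this up in the job itself, assuming a hypothetical HDFS namenode at namenode:9000:

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Checkpoint every 10 s. The backend path must be visible to ALL
            // TaskManagers, otherwise recovery after a TM is killed cannot
            // restore the state that the dead node wrote.
            env.enableCheckpointing(10_000L);
            env.setStateBackend(new FsStateBackend("hdfs://namenode:9000/flink/flink-checkpoints"));

            env.socketTextStream("localhost", 9999).print();
            env.execute("checkpointed job");
        }
    }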

Flink Custom Trigger giving Unexpected Output

Submitted by 白昼怎懂夜的黑 on 2019-12-12 02:45:26
Question: I want to create a Trigger that fires after 20 seconds the first time, and every five seconds after that. I have used GlobalWindows and a custom Trigger: val windowedStream = valueStream .keyBy(0) .window(GlobalWindows.create()) .trigger(TradeTrigger.of()) Here is the code in TradeTrigger: @PublicEvolving public class TradeTrigger<W extends Window> extends Trigger<Object, W> { private static final long serialVersionUID = 1L; static boolean flag=false; static long ctime = System
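
The static fields in TradeTrigger are a likely culprit: trigger instances are serialized and shared across keys and windows, so static mutable state leaks between them. Below is a sketch, not taken from the original post, of the stated firing pattern (once after 20 s, then every 5 s) using partitioned state and processing-time timers instead:

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.Window;

    public class PeriodicTrigger<W extends Window> extends Trigger<Object, W> {
        private static final long serialVersionUID = 1L;
        private static final long INITIAL_DELAY_MS = 20_000L;
        private static final long INTERVAL_MS = 5_000L;

        // per-key, per-window state instead of static fields
        private final ValueStateDescriptor<Long> nextFireDesc =
                new ValueStateDescriptor<>("nextFire", Long.class);

        @Override
        public TriggerResult onElement(Object element, long timestamp, W window, TriggerContext ctx) throws Exception {
            ValueState<Long> nextFire = ctx.getPartitionedState(nextFireDesc);
            if (nextFire.value() == null) {
                // first element for this key/window: schedule the 20 s firing
                long firstFire = ctx.getCurrentProcessingTime() + INITIAL_DELAY_MS;
                ctx.registerProcessingTimeTimer(firstFire);
                nextFire.update(firstFire);
            }
            return TriggerResult.CONTINUE;
        }

        @Override
        public TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) throws Exception {
            // fire now and schedule the next firing 5 s out
            long next = time + INTERVAL_MS;
            ctx.registerProcessingTimeTimer(next);
            ctx.getPartitionedState(nextFireDesc).update(next);
            return TriggerResult.FIRE;
        }

        @Override
        public TriggerResult onEventTime(long time, W window, TriggerContext ctx) {
            return TriggerResult.CONTINUE;
        }

        @Override
        public void clear(W window, TriggerContext ctx) throws Exception {
            ValueState<Long> nextFire = ctx.getPartitionedState(nextFireDesc);
            if (nextFire.value() != null) {
                ctx.deleteProcessingTimeTimer(nextFire.value());
                nextFire.clear();
            }
        }
    }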

Should the entire cluster be restarted if a single Task Manager crashes?

Submitted by 帅比萌擦擦* on 2019-12-12 01:27:15
Question: We're running a standalone Flink cluster with 2 JobManagers and 3 TaskManagers. Whenever a TM crashes, we simply restart that particular TM and proceed with the processing. But reading the comments on this question makes it look like we need to restart all 5 nodes that form the cluster to deal with the failure of a single TM. Am I reading this right? What would be the consequences if we restart just the crashed TM and let the healthy ones run as is? Answer 1: Sorry if my answer elsewhere was
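
For what it's worth, restarting only the crashed TM is the normal procedure in a standalone cluster: the JobManager reschedules the job onto the available slots according to the job's restart strategy, and the healthy nodes keep running. A sketch of configuring such a strategy in the job itself (the attempt count and delay are illustrative, not from the question):

    import org.apache.flink.api.common.restartstrategy.RestartStrategies;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RestartPolicy {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Retry the job up to 3 times, 10 s apart. As long as the surviving
            // (or restarted) TaskManagers offer enough slots, the other nodes
            // never need to be touched.
            env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 10_000L));

            env.socketTextStream("localhost", 9999).print();
            env.execute("job with fixed-delay restarts");
        }
    }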

Apache Flink - Mini cluster - Windowing operator execution problem

Submitted by 倖福魔咒の on 2019-12-12 01:25:28
Question: This turned out to be the cause of the problem in the question below: Apache flink - job simple windowing problem - java.lang.RuntimeException: segment has been freed - Mini Cluster problem. So I wanted to ask again with specific details. Adding a very simple windowing operator to the job causes the error below in a MINI CLUSTER ENVIRONMENT: Caused by: java.lang.RuntimeException: segment has been freed at org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitWatermark(RecordWriterOutput.java:123) at org.apache

Distributing socket data among Kafka cluster nodes

Submitted by ≯℡__Kan透↙ on 2019-12-11 19:17:31
Question: I want to get data from a socket and put it into a Kafka topic so that my Flink program can read the data from the topic and process it. I can do that on one node, but I want a Kafka cluster with at least three different nodes (different IP addresses) and to distribute the data polled from the socket among those nodes. I do not know how to do this or how to change my code. My simple program is as follows: public class WordCount { public static void main(String[] args) throws Exception { kafka_test objKafka=new kafka
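
One way to frame this: spreading data across brokers is Kafka's job, not the socket reader's. If the topic is created with several partitions (replicated across the three brokers), an un-keyed producer distributes records over the partitions on its own. A sketch with hypothetical broker addresses and topic name:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.Socket;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SocketToKafka {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // All three nodes of the (hypothetical) cluster.
            props.put("bootstrap.servers", "node1:9092,node2:9092,node3:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Socket socket = new Socket("localhost", 9999);
                 BufferedReader reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                 KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // With no record key, the producer spreads records across the
                    // topic's partitions, i.e. across the brokers; the Flink job
                    // then reads the topic with matching parallelism.
                    producer.send(new ProducerRecord<>("wordcount-input", line));
                }
            }
        }
    }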

Starting a batch process from a stream job

Submitted by 北慕城南 on 2019-12-11 18:17:38
Question: Hi, I have a Maven project for Flink stream processing. Based on the message I get from the stream I start a batch process, but currently I am getting an error. I am pretty new to the Flink world, so please let me know if you have any ideas. Here is the code I am using to start a standalone cluster: final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment ( ); KafkaConsumerService kafkaConsumerService= new KafkaConsumerService(); FlinkKafkaConsumer010<String>
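
Without the full stack trace one can only sketch the usual shape of the streaming side. The snippet below assumes KafkaConsumerService merely supplies consumer Properties (that class is from the question and its contents are unknown); the broker address and topic are placeholders:

    import java.util.Properties;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class BatchLauncher {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker
            props.setProperty("group.id", "batch-launcher");

            env.addSource(new FlinkKafkaConsumer010<>("control-topic", new SimpleStringSchema(), props))
               // Calling a second execute() from inside an operator of a running
               // stream job is a common source of errors; react to the message
               // here and submit the batch job externally instead (e.g. via
               // Flink's REST API or a job queue).
               .map(msg -> "control message received: " + msg)
               .print();

            env.execute("stream job that reacts to control messages");
        }
    }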

Event time window on Kafka source stream

Submitted by 徘徊边缘 on 2019-12-11 17:58:15
Question: There is a topic on the Kafka server. In the program, we read this topic as a stream, assign event timestamps, and then apply a window operation to the stream. But the program doesn't work. After debugging, it appears that the processWatermark method of WindowOperator is never executed. Here is my code. DataStream<Tuple2<String, Long>> advertisement = env .addSource(new FlinkKafkaConsumer082<String>("advertisement", new SimpleStringSchema(), properties)) .map(new MapFunction<String, Tuple2<String, Long>>() {
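
processWatermark only runs if watermarks are actually generated, which with a Kafka source requires both event-time mode on the environment and a timestamp/watermark assigner on the stream. A sketch of the usual wiring, as a fragment that slots into the code above (the 5-second out-of-orderness bound is an assumption; on very old versions, which the FlinkKafkaConsumer082 hints at, this extractor class may not exist yet, in which case a hand-written AssignerWithPeriodicWatermarks does the same):

    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;

    // before adding the source:
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

    // once the map has produced Tuple2<String, Long> with f1 as the event time:
    DataStream<Tuple2<String, Long>> withWatermarks = advertisement
        .assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(Time.seconds(5)) {
                @Override
                public long extractTimestamp(Tuple2<String, Long> element) {
                    return element.f1; // the event time carried in the tuple
                }
            });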

Why are Kafka messages not consumed on time when using Flink streaming SQL with GROUP BY TUMBLE(rowtime)?

Submitted by 这一生的挚爱 on 2019-12-11 17:22:25
Question: When I produce 20 messages, only 13 are consumed; the remaining 7 are not consumed in real time. Some time later, when I produce another 20 messages, the 7 leftover messages from last time finally get consumed. The complete code is at: https://github.com/shaozhipeng/flink-quickstart/blob/master/src/main/java/me/icocoro/quickstart/streaming/sql/KafkaStreamToJDBCTable.java Update: using a different AssignerWithPeriodicWatermarks was not effective. private static final String LOCAL_KAFKA_BROKER = "localhost
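
This pattern is consistent with how event-time windows close: a TUMBLE window emits only after the watermark passes the window end, and a periodic assigner derives the watermark from timestamps seen so far, so the tail of a burst sits in an open window until later input pushes the watermark forward. A sketch of such an assigner to illustrate the mechanics (the tuple layout and the 3-second bound are assumptions, not the code from the repository):

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    public class BoundedLatenessAssigner implements AssignerWithPeriodicWatermarks<Tuple2<String, Long>> {
        private static final long MAX_OUT_OF_ORDERNESS_MS = 3_000L;
        private long maxSeenTimestamp = Long.MIN_VALUE;

        @Override
        public long extractTimestamp(Tuple2<String, Long> element, long previousElementTimestamp) {
            maxSeenTimestamp = Math.max(maxSeenTimestamp, element.f1);
            return element.f1;
        }

        @Override
        public Watermark getCurrentWatermark() {
            // Called at the auto-watermark interval. Windows whose end lies
            // beyond this value stay open; that is why the last 7 messages of
            // a burst wait until the next burst advances the watermark.
            return new Watermark(maxSeenTimestamp == Long.MIN_VALUE
                    ? Long.MIN_VALUE : maxSeenTimestamp - MAX_OUT_OF_ORDERNESS_MS);
        }
    }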

Apache Flink add new stream dynamically

Submitted by 安稳与你 on 2019-12-11 16:56:37
Question: Is it possible in Apache Flink to add a new DataStream dynamically at runtime, without restarting the job? As far as I understand, a usual Flink program looks like this: val env = StreamExecutionEnvironment.getExecutionEnvironment() val text = env.socketTextStream(hostname, port, "\n") val windowCounts = text.map... env.execute("Socket Window WordCount") In my case it is possible that, for example, a new device is started and therefore another stream must be processed. But how can I add this new
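
Flink's topology is fixed when the job is submitted, so the usual workaround is to multiplex: send every device through one source (a socket, or more realistically a Kafka topic) and treat a new device as a new key rather than a new stream. A Java sketch of this idea; the line format "deviceId,value" is an assumption:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MultiplexedDevices {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // One physical source carries the events of every device.
            DataStream<Tuple2<String, Integer>> events = env
                .socketTextStream("localhost", 9999)
                .map(new MapFunction<String, Tuple2<String, Integer>>() {
                    @Override
                    public Tuple2<String, Integer> map(String line) {
                        String[] parts = line.split(",");
                        return Tuple2.of(parts[0], Integer.parseInt(parts[1]));
                    }
                });

            // A device that appears at runtime is simply a new key;
            // the running topology never changes.
            events.keyBy(0).sum(1).print();

            env.execute("multiplexed device streams");
        }
    }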

Authenticate with ECE Elasticsearch Sink from Apache Flink (Scala code)

Submitted by 浪尽此生 on 2019-12-11 15:32:16
Question: I get a compiler error when using the example provided in the Flink documentation. The Flink documentation provides sample Scala code to set the REST client factory parameters when talking to Elasticsearch: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/elasticsearch.html. When trying out this code I get a compiler error in IntelliJ which says "Cannot resolve symbol restClientBuilder". I found the following SO question which is EXACTLY my problem, except that it is in Java and I am doing this in
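
For comparison, here is what the same restClientFactory hook looks like in Java, which is effectively what the Scala lambda has to desugar to. The endpoint and credentials are placeholders, and the connector version (elasticsearch6) is an assumption:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink;
    import org.apache.http.HttpHost;
    import org.apache.http.auth.AuthScope;
    import org.apache.http.auth.UsernamePasswordCredentials;
    import org.apache.http.impl.client.BasicCredentialsProvider;

    public class EceSinkAuth {
        public static ElasticsearchSink.Builder<String> authenticatedBuilder() {
            List<HttpHost> hosts = new ArrayList<>();
            hosts.add(new HttpHost("my-deployment.ece.example.com", 9243, "https")); // placeholder

            ElasticsearchSink.Builder<String> builder = new ElasticsearchSink.Builder<String>(
                    hosts,
                    (element, ctx, indexer) -> { /* build and add an IndexRequest here */ });

            // The factory hands us the low-level RestClientBuilder; basic auth
            // goes through its HTTP client config callback.
            builder.setRestClientFactory(restClientBuilder ->
                    restClientBuilder.setHttpClientConfigCallback(httpClientBuilder -> {
                        BasicCredentialsProvider credentials = new BasicCredentialsProvider();
                        credentials.setCredentials(AuthScope.ANY,
                                new UsernamePasswordCredentials("elastic", "changeme")); // placeholders
                        return httpClientBuilder.setDefaultCredentialsProvider(credentials);
                    }));

            return builder;
        }
    }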