flink-streaming

Can we combine both count and processing-time triggers in Flink?

99封情书 submitted on 2020-01-24 20:27:11
Question: I want windows to fire once the count reaches 100, or every 5 seconds of tumbling processing time, whichever comes first. That is to say, when the element count reaches 100, trigger the window computation; but if the count doesn't reach 100 and 5 seconds have elapsed, also trigger the window computation. In effect, the combination of the two triggers below:

    .countWindow(100)
    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))

Answer 1: There's no super simple way to do this with the built-in window API; you need a custom Trigger that fires on whichever condition is satisfied first.
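
Since the answer is cut off, here is a sketch of that custom Trigger technique (an illustration of the standard approach, not necessarily the exact code the answer gave): attach a trigger to the 5-second tumbling processing-time window that also fires early once 100 elements have been seen.

    import org.apache.flink.api.common.state.ReducingState;
    import org.apache.flink.api.common.state.ReducingStateDescriptor;
    import org.apache.flink.api.common.typeutils.base.LongSerializer;
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

    // Fires when maxCount elements have arrived, or when the window's
    // processing-time boundary is reached, whichever happens first.
    public class CountOrTimeoutTrigger extends Trigger<Object, TimeWindow> {

        private final long maxCount;

        private final ReducingStateDescriptor<Long> countDesc =
                new ReducingStateDescriptor<>("count", (a, b) -> a + b, LongSerializer.INSTANCE);

        public CountOrTimeoutTrigger(long maxCount) {
            this.maxCount = maxCount;
        }

        @Override
        public TriggerResult onElement(Object element, long timestamp, TimeWindow window,
                                       TriggerContext ctx) throws Exception {
            // The timer at the window end gives the "every 5 seconds" behavior.
            ctx.registerProcessingTimeTimer(window.maxTimestamp());
            ReducingState<Long> count = ctx.getPartitionedState(countDesc);
            count.add(1L);
            if (count.get() >= maxCount) {
                count.clear();
                return TriggerResult.FIRE_AND_PURGE; // the "100 elements" behavior
            }
            return TriggerResult.CONTINUE;
        }

        @Override
        public TriggerResult onProcessingTime(long time, TimeWindow window,
                                              TriggerContext ctx) throws Exception {
            ctx.getPartitionedState(countDesc).clear();
            return TriggerResult.FIRE_AND_PURGE;
        }

        @Override
        public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
            return TriggerResult.CONTINUE; // this trigger is purely processing-time based
        }

        @Override
        public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
            ctx.deleteProcessingTimeTimer(window.maxTimestamp());
            ctx.getPartitionedState(countDesc).clear();
        }
    }

It would be wired in as .window(TumblingProcessingTimeWindows.of(Time.seconds(5))).trigger(new CountOrTimeoutTrigger(100)), replacing the window's default processing-time trigger.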

What does it mean that “broadcast state” unblocks the implementation of the “dynamic patterns” feature for Flink’s CEP library?

烂漫一生 submitted on 2020-01-14 06:00:28
Question: From the Flink 1.5 release announcement, we know Flink now supports "broadcast state", which was described as follows: "broadcast state unblocks the implementation of the 'dynamic patterns' feature for Flink's CEP library." Does this mean we can now use broadcast state to implement "dynamic patterns" without Flink CEP? Also, what is the difference when implementing "dynamic patterns" for Flink CEP with versus without broadcast state? I would appreciate it if someone could give an explanation.
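
For illustration, here is a sketch of what "dynamic patterns without CEP" can look like with broadcast state. All names, ports, and the notion of a pattern as a plain substring are hypothetical simplifications; a dynamic-patterns feature inside CEP would instead broadcast full Pattern definitions and evaluate them in the function.

    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.streaming.api.datastream.BroadcastStream;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
    import org.apache.flink.util.Collector;

    public class DynamicPatternSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> events = env.socketTextStream("localhost", 9000);   // hypothetical event source
            DataStream<String> patterns = env.socketTextStream("localhost", 9001); // hypothetical pattern source

            // Broadcast state: every parallel instance receives every pattern update.
            MapStateDescriptor<String, String> patternDesc = new MapStateDescriptor<>(
                    "patterns", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);
            BroadcastStream<String> broadcastPatterns = patterns.broadcast(patternDesc);

            events.connect(broadcastPatterns)
                  .process(new BroadcastProcessFunction<String, String, String>() {
                      @Override
                      public void processElement(String event, ReadOnlyContext ctx,
                                                 Collector<String> out) throws Exception {
                          // Evaluate the event against the most recently broadcast pattern.
                          String pattern = ctx.getBroadcastState(patternDesc).get("current");
                          if (pattern != null && event.contains(pattern)) {
                              out.collect("matched: " + event);
                          }
                      }

                      @Override
                      public void processBroadcastElement(String pattern, Context ctx,
                                                          Collector<String> out) throws Exception {
                          // A new pattern arrives at runtime and replaces the old one.
                          ctx.getBroadcastState(patternDesc).put("current", pattern);
                      }
                  })
                  .print();

            env.execute("dynamic pattern sketch");
        }
    }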

Flink streaming: how to control the execution time

本小妞迷上赌 submitted on 2020-01-06 05:55:40
Question: Spark Streaming provides the termination API awaitTermination(). Is there a similar API available to gracefully shut down a Flink streaming job after some t seconds?

Answer 1: Your driver program (i.e. the main method) in Flink doesn't stay running while the streaming job executes. Your program should define a dataflow, call execute, and then terminate. In Spark, the driver program stays running (AFAIK), and awaitTermination relates to that. Note that a Flink streaming dataflow continues to execute until the job is cancelled or its sources stop producing data.
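
One pattern that achieves a time-bounded run (a sketch, not taken from the truncated answer): make the source itself stop after t seconds, so end-of-stream propagates downstream and the job finishes on its own.

    import org.apache.flink.streaming.api.functions.source.SourceFunction;

    // A source that stops emitting after maxRuntimeMillis, letting the job finish naturally.
    public class TimeBoundedSource implements SourceFunction<Long> {

        private final long maxRuntimeMillis;
        private volatile boolean running = true;

        public TimeBoundedSource(long maxRuntimeMillis) {
            this.maxRuntimeMillis = maxRuntimeMillis;
        }

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            final long deadline = System.currentTimeMillis() + maxRuntimeMillis;
            long counter = 0;
            while (running && System.currentTimeMillis() < deadline) {
                // Hold the checkpoint lock while emitting, as the SourceFunction contract requires.
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(counter++);
                }
                Thread.sleep(10);
            }
            // Returning from run() signals end-of-stream; downstream operators finish and the job ends.
        }

        @Override
        public void cancel() {
            running = false;
        }
    }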

Flink 1.7.0 Dashboard does not show Task Statistics

两盒软妹~` submitted on 2020-01-05 07:18:09
Question: I use the Flink 1.7 dashboard and select a streaming job. This should show me some metrics, but it just keeps loading. I deployed the same job on a Flink 1.5 cluster, and there I can see the metrics. Flink is running in Docker Swarm; if I run Flink 1.7 with docker-compose (not in the swarm), it works. I can make it work by deleting the hostname entry in the docker-compose.yaml file:

    version: "3"
    services:
      jobmanager17:
        image: flink:1.7.0-hadoop27-scala_2.11
        hostname: "{{.Node.Hostname}}"
        ports:
          - "8081:8081"
          - "9254
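
For reference, the working variant described above is simply the same file with the hostname entry removed (the original snippet is cut off after the first port mapping, so the remaining ports are elided here):

    version: "3"
    services:
      jobmanager17:
        image: flink:1.7.0-hadoop27-scala_2.11
        ports:
          - "8081:8081"
          # ... remaining port mappings as in the original file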

How to specify a log file different from the daemon log file when submitting a Flink job to a standalone Flink cluster

十年热恋 submitted on 2020-01-05 05:10:28
Question: When I start a Flink standalone cluster, it writes the daemon logs to the file configured in conf/log4j.properties, and when I submit a Flink job to that cluster, it uses the same properties file for the application logs and writes into the same log file on the task managers. I want a separate log file for each application submitted to that Flink standalone cluster. Is there any way to achieve that?

Answer 1: When you submit the job using the ./bin/flink shell script, use the following environment
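
The answer is cut off before naming the variable. As an illustration only, and explicitly an assumption rather than what the answer went on to say: the ./bin/flink script places the submitting client's log under FLINK_LOG_DIR, so overriding it per submission at least separates the client-side logs:

    # Assumption: FLINK_LOG_DIR is read by Flink's shell scripts and determines
    # where the client writes its flink-<user>-client-<host>.log file.
    # com.example.MyJob and myJob.jar are hypothetical.
    FLINK_LOG_DIR=/var/log/flink/my-job ./bin/flink run -c com.example.MyJob myJob.jar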

What happens to state in a Flink Task Manager when it crashes?

懵懂的女人 submitted on 2020-01-03 05:23:13
Question: May I know what happens to the state stored in a Flink Task Manager when that Task Manager crashes? Say the state storage is RocksDB; would that data be transferred to another running Task Manager so that the complete state is available for processing?

Answer 1: Flink does not (yet) support dynamic rescaling of state, so the failed task manager must be recovered, and the job will be restarted from a checkpoint. Exactly what that involves depends on how your cluster is configured, and whether the job failed
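
For context, restarting from a checkpoint only works if checkpointing is enabled and the RocksDB state backend snapshots to durable storage. A minimal sketch (the path and interval are hypothetical):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Snapshot state every 10 seconds; after a task manager crash, a
            // replacement instance restores from the latest completed checkpoint
            // and the job resumes from there.
            env.enableCheckpointing(10_000);
            env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints")); // hypothetical path

            // ... define the dataflow here ...
            env.execute("checkpointed job");
        }
    }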

Flink: possible to delete Queryable state after X time?

廉价感情. submitted on 2020-01-03 02:56:07
Question: In my case, I use Flink's queryable state only. In particular, I do not care about checkpoints. Upon an event, I query the queryable state only after a maximum of X minutes. Ideally, I would delete the "old" state to save on space. That's why I wonder: can I signal Flink's state to clear itself after some time? Through configuration? Through specific event signals? How?

Answer 1: One way to clear state is to explicitly call clear() on the state object (e.g., a ValueState object) when you no longer need it.
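
Building on that, here is a sketch of clearing state from a processing-time timer X minutes after the last update. The names and the 10-minute TTL are hypothetical; the setQueryable() call is what exposes the state to queryable state clients.

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // Keeps the latest value per key queryable, and clears it TTL_MS after the last update.
    public class ExpiringQueryableState extends KeyedProcessFunction<String, Long, Long> {

        private static final long TTL_MS = 10 * 60 * 1000L; // hypothetical X = 10 minutes

        private transient ValueState<Long> latest; // the queryable value
        private transient ValueState<Long> expiry; // when the current value may be cleared

        @Override
        public void open(Configuration parameters) {
            ValueStateDescriptor<Long> latestDesc = new ValueStateDescriptor<>("latest", Long.class);
            latestDesc.setQueryable("latest-value"); // name used by queryable state clients
            latest = getRuntimeContext().getState(latestDesc);
            expiry = getRuntimeContext().getState(new ValueStateDescriptor<>("expiry", Long.class));
        }

        @Override
        public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
            latest.update(value);
            long cleanupAt = ctx.timerService().currentProcessingTime() + TTL_MS;
            expiry.update(cleanupAt);
            ctx.timerService().registerProcessingTimeTimer(cleanupAt);
            out.collect(value);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<Long> out) throws Exception {
            // Only clear if no newer update has pushed the expiry further out.
            Long cleanupAt = expiry.value();
            if (cleanupAt != null && timestamp >= cleanupAt) {
                latest.clear();
                expiry.clear();
            }
        }
    }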

Flink Web UI not displaying records received in a Custom Source implementation

眉间皱痕 submitted on 2020-01-02 05:23:14
Question: I have built a custom source to process a log stream in Flink. The program runs fine and gives me the desired results after processing the records. But when I check the Web UI, I do not see the counts. (Screenshot omitted: Records/Bytes Count.)

Answer 1: Flink chained all the operators of your pipeline into one operator: Source -> FlatMap -> ProcessLog -> Sink. Thus, this single operator contains both the source and the sink. Additionally, Flink can neither measure the bytes read by a source from an external system nor the bytes written by a sink; the Web UI counts only cover records and bytes exchanged between Flink operators.
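
Since the counts are measured on the edges between operators, a sketch of a workaround (an assumption, not necessarily what the truncated answer recommended) is to break the operator chain so that such an edge, and therefore non-zero counts, become visible in the Web UI. MyLogSource, MyParser, ProcessLog, and MySink are hypothetical stand-ins for the custom pipeline:

    env.addSource(new MyLogSource())   // hypothetical custom source
       .flatMap(new MyParser())        // hypothetical parser
       .disableChaining()              // force a chain break here: the UI now shows
                                       // records flowing across this edge
       .process(new ProcessLog())
       .addSink(new MySink());         // hypothetical sink

    // Alternatively, disable chaining for the entire job:
    env.disableOperatorChaining();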

How to sort an out-of-order event time stream using Flink

倾然丶 夕夏残阳落幕 submitted on 2019-12-31 03:07:06
Question: This question covers how to sort an out-of-order stream using Flink SQL, but I would rather use the DataStream API. One solution is to do this with a ProcessFunction that uses a PriorityQueue to buffer events until the watermark indicates they are no longer out-of-order, but this performs poorly with the RocksDB state backend (the problem is that each access to the PriorityQueue requires ser/de of the entire PriorityQueue). How can I do this efficiently regardless of which state backend is used?
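
Since the answer itself is missing from this excerpt, here is a sketch of the approach usually given for this problem (an assumption, building on the hint in the question): buffer events in MapState keyed by timestamp and emit them from event-time timers. With RocksDB, each map entry is stored as its own key/value pair, so an access deserializes a single entry rather than the whole buffer. The Event type is a hypothetical stand-in:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.flink.api.common.state.MapState;
    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    public class SortFunction extends KeyedProcessFunction<String, SortFunction.Event, SortFunction.Event> {

        // Hypothetical record type carrying an event-time timestamp.
        public static class Event {
            public String key;
            public long timestamp;
        }

        // One map entry per distinct timestamp.
        private transient MapState<Long, List<Event>> buffer;

        @Override
        public void open(Configuration config) {
            buffer = getRuntimeContext().getMapState(new MapStateDescriptor<>(
                    "buffer",
                    TypeInformation.of(Long.class),
                    TypeInformation.of(new TypeHint<List<Event>>() {})));
        }

        @Override
        public void processElement(Event e, Context ctx, Collector<Event> out) throws Exception {
            if (e.timestamp <= ctx.timerService().currentWatermark()) {
                return; // too late to sort; drop (or side-output) late events
            }
            List<Event> sameTime = buffer.get(e.timestamp);
            if (sameTime == null) {
                sameTime = new ArrayList<>();
            }
            sameTime.add(e);
            buffer.put(e.timestamp, sameTime);
            // Emit this timestamp's events once the watermark has passed it;
            // timers fire in timestamp order, so output is sorted.
            ctx.timerService().registerEventTimeTimer(e.timestamp);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<Event> out) throws Exception {
            for (Event e : buffer.get(timestamp)) {
                out.collect(e);
            }
            buffer.remove(timestamp);
        }
    }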