apache-flink

Flink window function getResult not fired

半世苍凉 submitted on 2020-06-01 06:14:25
Question: I am trying to use event time in my Flink job, using BoundedOutOfOrdernessTimestampExtractor to extract timestamps and generate watermarks. However, some of my Kafka input is a sparse stream that can carry no data for a long time, which means getResult in my AggregateFunction is never called at all, even though I can see data going into the add function. I have set getEnv().getConfig().setAutoWatermarkInterval(1000L); I tried eventsWithKey .keyBy(entry -> (String) entry.get(key)) .window(TumblingEventTimeWindows
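The behavior described above follows from how a bounded-out-of-orderness watermark works: it only advances when new elements arrive, so on a sparse stream the watermark stalls and event-time windows never close. A minimal plain-Java sketch of the mechanism (not Flink's actual implementation; names are illustrative):

```java
// Sketch of a bounded-out-of-orderness watermark generator: the watermark
// trails the largest timestamp seen so far by a fixed bound, and only
// moves forward when new elements arrive.
public class BoundedWatermark {
    private final long maxOutOfOrderness;
    private long maxTimestampSeen = Long.MIN_VALUE;

    public BoundedWatermark(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Called for every incoming element.
    public void onEvent(long eventTimestamp) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestamp);
    }

    // The emitted watermark lags the maximum timestamp by the bound.
    public long currentWatermark() {
        return maxTimestampSeen == Long.MIN_VALUE
                ? Long.MIN_VALUE
                : maxTimestampSeen - maxOutOfOrderness;
    }

    public static void main(String[] args) {
        BoundedWatermark wm = new BoundedWatermark(50);
        wm.onEvent(100);
        wm.onEvent(250);
        wm.onEvent(180); // a late element does not move the watermark back
        System.out.println(wm.currentWatermark()); // prints 200
        // With no further events, the watermark stays at 200 forever,
        // so a window ending at, say, 300 never fires getResult.
    }
}
```

This is why setAutoWatermarkInterval alone does not help: it controls how often the watermark is emitted, not whether it advances on an idle source.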

[flink]Task manager initialization failed

孤人 submitted on 2020-05-28 07:27:20
Question: I am new to Flink. I am trying to run the Flink example on my local PC (Windows). However, after I run start-cluster.bat and log in to the dashboard, it shows zero task managers. I checked the log, and it seems initialization fails: 2020-02-21 23:03:14,202 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed. org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec at org.apache.flink.runtime
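On Flink 1.10, where TaskExecutorResourceSpec was introduced, this exception commonly means the TaskManager has no usable memory configuration. A hedged flink-conf.yaml fragment (the value shown is the 1.10 distribution's default and is illustrative, not a requirement):

```yaml
# conf/flink-conf.yaml — assumed fix: give the TaskExecutor an explicit
# total process memory so TaskExecutorResourceSpec can be derived.
taskmanager.memory.process.size: 1728m
```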

Flink: DataSet.count() is bottleneck - How to count parallel?

送分小仙女□ submitted on 2020-05-27 12:07:12
Question: I am learning MapReduce using Flink and have a question about how to count the elements in a DataSet efficiently. What I have so far is this: DataSet<MyClass> ds = ...; long num = ds.count(); When executing this, my Flink log says 12/03/2016 19:47:27 DataSink (count())(1/1) switched to RUNNING So only one CPU is used (I have four, and other operations such as reduce use all of them). I think count() internally collects the DataSet from all four CPUs and counts it sequentially instead of
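The usual way to parallelize this in Flink's DataSet API is a mapPartition that emits one local count per partition, followed by a sum, rather than ds.count(). A plain-Java sketch of that two-phase idea, using parallel streams to stand in for partition-parallel tasks (names are illustrative):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of partition-wise counting: each partition counts its own elements
// in parallel, then the per-partition counts are summed. In Flink's DataSet
// API this corresponds to ds.mapPartition(emit local count).reduce(sum)
// instead of the single-threaded sink behind ds.count().
public class PartitionCount {
    static long parallelCount(List<List<Integer>> partitions) {
        return partitions.parallelStream()  // one task per partition
                .mapToLong(List::size)      // local count, no data movement
                .sum();                     // combine the small partial counts
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Arrays.asList(6, 7, 8, 9));
        System.out.println(parallelCount(partitions)); // prints 9
    }
}
```

Only the tiny per-partition counts are shuffled to the final sum, so the expensive scan runs on all cores.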

Manage state with huge memory usage - querying from storage

主宰稳场 submitted on 2020-05-26 09:18:20
Question: Apologies if this sounds dumb! We are working with Flink to make async IO calls. A lot of the time, the IO calls are repeated (same set of parameters), and about 80% of the APIs that we call return the same response for the same parameters, so we would like to avoid making those calls again. We thought we could use state to store previous responses and reuse them. The issue is that although our responses can be reused, the number of such responses is huge and therefore requires a lot of
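When the full response set is too large for memory, one common compromise is to keep only a bounded, recently-used subset cached and fall back to the real async IO call (or disk-backed state such as RocksDB) on a miss. A minimal LRU cache sketch built on LinkedHashMap; class and method names are illustrative, not from Flink:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal bounded LRU cache: keeps only the hottest maxEntries responses in
// memory; anything evicted must be re-fetched via the real IO call.
public class ResponseCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public ResponseCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives LRU eviction order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict least-recently-used beyond the cap
    }

    public static void main(String[] args) {
        ResponseCache<String, String> cache = new ResponseCache<>(2);
        cache.put("a", "resp-a");
        cache.put("b", "resp-b");
        cache.get("a");           // touch "a", so "b" is now least recent
        cache.put("c", "resp-c"); // exceeds capacity, evicts "b"
        System.out.println(cache.containsKey("b")); // prints false
        System.out.println(cache.containsKey("a")); // prints true
    }
}
```

With an 80% repeat rate, even a modest cache capacity can absorb most of the repeated calls while keeping memory bounded.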

How do I run Beam Python pipelines using Flink deployed on Kubernetes?

家住魔仙堡 submitted on 2020-05-26 06:44:26
Question: Does anybody know how to run Beam Python pipelines with Flink when Flink is running as pods in Kubernetes? I have successfully managed to run a Beam Python pipeline using the portable runner and a job service pointing to a local Flink server running in Docker containers. I was able to achieve that by mounting the Docker socket in my Flink containers and running Flink as the root process, so the DockerEnvironmentFactory class can create the Python harness container. Unfortunately, I can't use the
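A commonly suggested way to avoid DockerEnvironmentFactory entirely on Kubernetes is Beam's EXTERNAL environment, where the Python SDK harness runs as a worker-pool sidecar container in the TaskManager pod instead of being launched via Docker-in-Docker. A hedged sketch of the pipeline options as a config fragment (the module name and endpoints are placeholders, not real values from this setup):

```shell
python -m my_pipeline \
  --runner=PortableRunner \
  --job_endpoint=flink-jobservice:8099 \
  --environment_type=EXTERNAL \
  --environment_config=localhost:50000   # address of the SDK worker-pool sidecar
```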

Dynamic flink window creation by reading the details from kafka

a 夏天 submitted on 2020-05-24 03:33:20
Question: Let's say a Kafka message contains the Flink window size configuration. I want to read the message from Kafka and create a global window in Flink. Problem statement: Can we handle the above scenario using BroadcastStream? Or is there any other approach that supports this case? Answer 1: Flink's window API does not support dynamically changing window sizes. What you'll need to do is implement your own windowing using a process function, in this case a KeyedBroadcastProcessFunction, where the window
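The core of that custom windowing can be sketched in plain Java: the broadcast side updates the window size, and the keyed side assigns each event to a tumbling bucket derived from the current size. Class and method names here are illustrative, not Flink API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of what a KeyedBroadcastProcessFunction would do for dynamic
// windows: processBroadcastElement updates the window size, and
// processElement buckets each event by the size currently in effect.
public class DynamicWindows {
    private long windowSizeMs = 1000;                 // default window size
    private final Map<String, Map<Long, List<Long>>> buffers = new HashMap<>();

    // Mirrors processBroadcastElement: a Kafka config message changes the size.
    public void onConfig(long newWindowSizeMs) {
        windowSizeMs = newWindowSizeMs;
    }

    // Mirrors processElement: assign the event to its tumbling window bucket.
    public long onEvent(String key, long timestampMs) {
        long windowStart = timestampMs - (timestampMs % windowSizeMs);
        buffers.computeIfAbsent(key, k -> new HashMap<>())
               .computeIfAbsent(windowStart, w -> new ArrayList<>())
               .add(timestampMs);
        return windowStart; // in Flink, also register an event-time timer
                            // for windowStart + windowSizeMs to fire the window
    }

    public static void main(String[] args) {
        DynamicWindows dw = new DynamicWindows();
        System.out.println(dw.onEvent("user-1", 2500)); // prints 2000 (1s windows)
        dw.onConfig(5000);                              // broadcast: 5s windows
        System.out.println(dw.onEvent("user-1", 7500)); // prints 5000
    }
}
```

In the real function, the buffered events live in keyed state and the window size in broadcast state, and an event-time timer flushes each bucket when its window ends.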
