flink-streaming

Having an equivalent to HOP_START inside an aggregation primitive in Flink

Submitted by 让人想犯罪 __ on 2019-12-14 03:09:00
Question: I'm trying to compute an exponentially decaying moving average over a hopping window in Flink SQL. I need access to one of the borders of the window, the HOP_START, in the following: SELECT lb_index one_key, -- I have access to this one: HOP_START(proctime, INTERVAL '0.05' SECOND, INTERVAL '5' SECOND) start_time, -- Aggregation primitive: SUM( Y * EXP(TIMESTAMPDIFF( SECOND, proctime, -- This one throws: HOP_START(proctime, INTERVAL '0.05' SECOND, INTERVAL '5' SECOND) ))) FROM write
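The aggregation the question is after can be sketched outside SQL. A minimal plain-Java sketch of the decayed sum (class and method names are hypothetical; in Flink SQL the window start would come from HOP_START, which is exactly the value that is not available inside the aggregate):

```java
import java.util.Arrays;

public class DecayingSum {
    // Mirrors SUM(y * EXP(TIMESTAMPDIFF(SECOND, proctime, windowStart))):
    // each value is weighted by exp(windowStart - eventTime), in seconds,
    // so values further from the window start contribute less.
    static double decayingSum(long windowStartSec, long[] eventTsSec, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < eventTsSec.length; i++) {
            sum += y[i] * Math.exp((double) (windowStartSec - eventTsSec[i]));
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] ts = {0, 1, 2};
        double[] y = {1.0, 1.0, 1.0};
        System.out.println(decayingSum(0, ts, y));
    }
}
```

One common workaround in this situation is to compute the weight in a scalar UDF (or in the DataStream API with a ProcessWindowFunction, where the window start is available via the window context) rather than inside the SQL aggregate.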

Flink Complex Event Processing

Submitted by 走远了吗. on 2019-12-14 02:34:44
Question: I have Flink CEP code that reads from a socket and detects a pattern. Let's say the pattern (word) is 'alert'. If the word 'alert' occurs five times or more, an alert should be created. But I am getting an input mismatch error. The Flink version is 1.3.0. Thanks in advance! package pattern; import org.apache.flink.cep.CEP; import org.apache.flink.cep.PatternStream; import org.apache.flink.cep.pattern.Pattern; import org.apache.flink.cep.pattern.conditions.IterativeCondition; import org.apache
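In Flink CEP this kind of rule is normally expressed with a quantifier such as `Pattern.begin(...).where(...).timesOrMore(5)`. The plain-Java sketch below (names hypothetical) only models the intended counting semantics, independent of the CEP API:

```java
import java.util.ArrayList;
import java.util.List;

public class AlertCounter {
    // Models the intended rule: emit one alert as soon as the word
    // "alert" has been seen five times in the input stream.
    static List<String> process(List<String> words) {
        List<String> alerts = new ArrayList<>();
        int count = 0;
        for (String w : words) {
            if ("alert".equals(w)) {
                count++;
                if (count == 5) {
                    alerts.add("ALERT: 'alert' seen 5 times");
                }
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        System.out.println(process(List.of("alert", "x", "alert", "alert", "alert", "alert")));
    }
}
```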

Does Flink SQL support Java Map types?

Submitted by 懵懂的女人 on 2019-12-14 02:32:39
Question: I'm trying to access a key from a map using Flink's SQL API. It fails with the error Exception in thread "main" org.apache.flink.table.api.TableException: Type is not supported: ANY Please advise how I can fix it. Here is my event class: public class EventHolder { private Map<String,String> event; public Map<String, String> getEvent() { return event; } public void setEvent(Map<String, String> event) { this.event = event; } } Here is the main class which submits the Flink job: public class
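When the table planner falls back to the ANY type for a field, one common workaround is to extract the needed keys from the map into plainly typed columns before the data reaches the Table API. A minimal sketch of that flattening step (names hypothetical; in a real job this would run inside a MapFunction before `fromDataStream`):

```java
import java.util.Map;

public class EventFlattener {
    // Extracts selected keys from the event map into a typed row, so
    // downstream SQL sees plain String columns instead of a Map field.
    static String[] flatten(Map<String, String> event, String... keys) {
        String[] row = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            row[i] = event.getOrDefault(keys[i], null);
        }
        return row;
    }

    public static void main(String[] args) {
        String[] row = flatten(Map.of("user", "alice", "action", "click"), "user", "action");
        System.out.println(row[0] + " / " + row[1]);
    }
}
```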

Flink Trigger when State expires

Submitted by 核能气质少年 on 2019-12-13 20:31:24
Question: I have an interesting use case which I want to test with Flink. I have an incoming stream of Message which is either PASS or FAIL. Now if the message is of type FAIL, I have a downstream ProcessFunction which saves the Message state and then sends pause commands to everything that depends on it. When I receive a PASS message associated with the FAIL I received earlier (keying by message id), I send resume commands to everything I paused earlier. Now I plan on using State
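The described pause/resume logic can be modeled as a small keyed state machine. A plain-Java sketch (class and method names hypothetical; in Flink the `paused` map would instead be per-key `ValueState<Boolean>` inside a `KeyedProcessFunction`, with state TTL or timers handling expiry):

```java
import java.util.HashMap;
import java.util.Map;

public class PauseTracker {
    // Keyed by message id: a FAIL pauses the id, a matching PASS resumes it.
    private final Map<String, Boolean> paused = new HashMap<>();

    String onMessage(String id, String type) {
        if ("FAIL".equals(type)) {
            paused.put(id, true);
            return "PAUSE " + id;
        }
        if ("PASS".equals(type) && Boolean.TRUE.equals(paused.remove(id))) {
            return "RESUME " + id;
        }
        // PASS without a prior FAIL (or an unknown type) does nothing.
        return "NOOP";
    }

    public static void main(String[] args) {
        PauseTracker t = new PauseTracker();
        System.out.println(t.onMessage("m1", "FAIL"));
        System.out.println(t.onMessage("m1", "PASS"));
    }
}
```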

Apache Flink 1.4.2 akka.actor.ActorNotFound

Submitted by ℡╲_俬逩灬. on 2019-12-13 20:25:22
Question: After upgrading to Apache Flink 1.4.2, we get the following error every few seconds on one TaskManager out of three. 2018-06-27 17:33:46.632 [jobmanager-future-thread-2] DEBUG o.a.flink.runtime.rest.handler.legacy.metrics.MetricFetcher - Could not retrieve QueryServiceGateway. java.util.concurrent.CompletionException: akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://flink@tm03-dev:6124/), Path(/user/MetricQueryService_64bde0e9e6f3f0a906a30e88c261c9d7)] at java.util

Deploy stream processing topology on runtime?

Submitted by 亡梦爱人 on 2019-12-13 17:33:09
Question: Hi all, I have a requirement wherein I need to re-ingest some of my older data. We have a multi-staged pipeline, the source of which is a Kafka topic. Once a record is fed into that, it runs through a series of steps (about 10). Each step massages the original JSON object pushed to the source topic and pushes it to a destination topic. Now, sometimes, we need to re-ingest the older data and apply a subset of the steps described above. We intend to push these re-ingested records to a different
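One way to apply only a subset of the pipeline steps to re-ingested records is to carry a per-record flag set (e.g. in a header field) saying which steps apply, and have each step skip records that don't select it. A minimal plain-Java sketch of that idea (all names hypothetical):

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class Pipeline {
    // Runs a record through only the steps enabled for it; in the real
    // pipeline each step would check the record's header before processing.
    static String run(String record, List<UnaryOperator<String>> steps, boolean[] enabled) {
        for (int i = 0; i < steps.size(); i++) {
            if (enabled[i]) {
                record = steps.get(i).apply(record);
            }
        }
        return record;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> steps = List.of(s -> s.toUpperCase(), s -> s + "!");
        // Re-ingested record: only step 1 applies.
        System.out.println(run("payload", steps, new boolean[]{true, false}));
    }
}
```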

Why doesn't the Flink SocketTextStreamWordCount work?

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-13 15:48:12
Question: I've set up the example project and built it. I'm able to run the WordCount program as expected. But when I run the SocketTextStreamWordCount, I'm not getting any results printed out. I send data in through nc (localhost:9999 on both sides). In the web console for the running job, I can see that there are messages being sent/received, but I never see the counts.print() output printed anywhere, even after killing the nc session. EDIT: when I change it around to print results to a text file, no

Effect of increasing parallelism on throughput

Submitted by 百般思念 on 2019-12-13 03:53:57
Question: I ran a job first with parallelism 1 and then with parallelism 3. With parallelism=1, the Kafka source was reading records at a rate of ~500 records per second. With parallelism=3, the throughput got divided among the three parallel subtasks, each reading ~150 records per second. Note that the source is publishing records at a much higher rate (~1000 records per second). Is this expected? I would imagine the throughput to increase with parallelism, but it remains the same. I checked
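One thing worth checking in this situation is the number of Kafka partitions: a Kafka consumer's parallelism is capped by the partition count, because partitions are distributed across source subtasks and a subtask with no partition sits idle. A sketch of a typical round-robin-style assignment (hypothetical helper, not the actual Flink connector code):

```java
public class PartitionAssignment {
    // Counts how many partitions a given source subtask would read under a
    // simple modulo assignment. With fewer partitions than subtasks, some
    // subtasks get zero partitions and total throughput cannot scale up.
    static int partitionsFor(int subtask, int parallelism, int partitions) {
        int count = 0;
        for (int p = 0; p < partitions; p++) {
            if (p % parallelism == subtask) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 1 partition, 3 subtasks: only subtask 0 does any reading.
        for (int s = 0; s < 3; s++) {
            System.out.println("subtask " + s + ": " + partitionsFor(s, 3, 1) + " partition(s)");
        }
    }
}
```

If the partition count is not the limit, a downstream operator applying backpressure is the other usual suspect for throughput that stays flat as parallelism grows.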

How to make Flink flush the last line to the sink when the producer (Kafka) does not produce a new line

Submitted by ≯℡__Kan透↙ on 2019-12-13 03:49:51
Question: When my Flink program is in event-time mode, the sink will not get the last line (say, line A). If I feed a new line (line B) to Flink, I will get line A, but I still can't get line B. val env = StreamExecutionEnvironment.getExecutionEnvironment env.setParallelism(1) env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE) env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) val properties = new Properties() properties.setProperty("bootstrap.servers", "localhost:9092") properties
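This "one line behind" behavior is characteristic of event-time watermarks: a bounded-out-of-orderness watermark is derived from the maximum timestamp seen so far, so it only advances when a newer event arrives. Line A's window can't fire until line B pushes the watermark past it. A plain-Java sketch of that watermark rule (simplified model, not Flink's actual generator class):

```java
public class BoundedOutOfOrdernessWatermark {
    // Watermark = maxSeenTimestamp - allowed lateness. With no new events,
    // the watermark stalls and the last window never fires.
    private long maxTs = Long.MIN_VALUE;
    private final long latenessMs;

    BoundedOutOfOrdernessWatermark(long latenessMs) {
        this.latenessMs = latenessMs;
    }

    long onEvent(long eventTsMs) {
        maxTs = Math.max(maxTs, eventTsMs);
        return maxTs - latenessMs;
    }

    public static void main(String[] args) {
        BoundedOutOfOrdernessWatermark w = new BoundedOutOfOrdernessWatermark(1000);
        System.out.println(w.onEvent(5000)); // watermark trails the max timestamp
        System.out.println(w.onEvent(3000)); // late event does not move it back
    }
}
```

Typical remedies are an idle-source watermark strategy or a processing-time fallback that advances the watermark when no data arrives for a while.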

Flink session window with onEventTime trigger?

Submitted by 只谈情不闲聊 on 2019-12-13 02:57:55
Question: I want to create an event-time based session window in Flink, such that it triggers when the event time of a new message is more than 180 seconds greater than the event time of the message that created the window. For example: t1 (0 seconds): msg1 <-- This is the first message, which causes the session window to be created t2 (13 seconds): msg2 t3 (39 seconds): msg3 . . . . t7 (190 seconds): msg7 <-- The event time (t7) is more than 180 seconds greater than t1 (t7 - t1 = 190), so the window should be
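The firing condition described above can be modeled as a tiny state machine keyed on the first event's timestamp. In Flink this would be a custom `Trigger` registering an event-time timer at firstTimestamp + gap; the plain-Java sketch below (names hypothetical) only models the decision logic from the example:

```java
public class SessionGapTrigger {
    // Fires when a new event's time exceeds the window-creating (first)
    // event's time by more than GAP_SEC; the firing event starts a new window.
    private static final long GAP_SEC = 180;
    private Long firstTs = null;

    boolean onEvent(long eventTsSec) {
        if (firstTs == null) {
            firstTs = eventTsSec; // first message creates the window
            return false;
        }
        if (eventTsSec - firstTs > GAP_SEC) {
            firstTs = eventTsSec; // fire, and open the next window
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SessionGapTrigger trigger = new SessionGapTrigger();
        long[] times = {0, 13, 39, 190}; // t1, t2, t3, t7 from the example
        for (long t : times) {
            System.out.println("t=" + t + "s fires=" + trigger.onEvent(t));
        }
    }
}
```

Note this differs from Flink's built-in `EventTimeSessionWindows`, whose gap is measured from the most recent event, not from the first one; that difference is exactly why a custom trigger is needed here.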