apache-flink

I want to write ORC files using Flink's Streaming File Sink, but it doesn't write files correctly

Submitted by 谁说我不能喝 on 2020-07-20 03:48:09
Question: I am reading data from Kafka and trying to write it to the HDFS file system in ORC format. I used the reference below from the official website, but Flink writes the exact same content for all data and produces many files, all of them 103 KB: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html#orc-format Please find my code below.

    object BeaconBatchIngest extends StreamingBase {
      val env: StreamExecutionEnvironment = …
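
For comparison, here is a minimal sketch of the ORC bulk sink from the linked docs page, transcribed to Scala. Person, PersonVectorizer, and the schema string follow the docs' example and the output path is a placeholder, not the asker's real types; flink-orc must be on the classpath. Note that bulk-encoded formats roll part files on every checkpoint, so checkpointing must be enabled for part files to be finalized.

    import java.nio.charset.StandardCharsets

    import org.apache.flink.core.fs.Path
    import org.apache.flink.orc.vector.Vectorizer
    import org.apache.flink.orc.writer.OrcBulkWriterFactory
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
    import org.apache.hadoop.hive.ql.exec.vector.{BytesColumnVector, LongColumnVector, VectorizedRowBatch}

    case class Person(name: String, age: Int)

    // Copies one Person into the current ORC row batch, one column per field.
    class PersonVectorizer(schema: String) extends Vectorizer[Person](schema) {
      override def vectorize(element: Person, batch: VectorizedRowBatch): Unit = {
        val row = batch.size
        batch.size += 1
        batch.cols(0).asInstanceOf[BytesColumnVector]
          .setVal(row, element.name.getBytes(StandardCharsets.UTF_8))
        batch.cols(1).asInstanceOf[LongColumnVector].vector(row) = element.age
      }
    }

    val writerFactory =
      new OrcBulkWriterFactory(new PersonVectorizer("struct<_col0:string,_col1:int>"))
    val sink: StreamingFileSink[Person] = StreamingFileSink
      .forBulkFormat(new Path("hdfs:///tmp/orc-out"), writerFactory) // placeholder path
      .build()
    // personStream.addSink(sink) // attach to the Kafka-sourced stream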

Flink stream not finishing

Submitted by 江枫思渺然 on 2020-07-09 07:42:40
Question: I am setting up a Flink stream processor using Kafka and Elasticsearch. I want to replay my data, but when I set the parallelism to more than 1, the program does not finish. I believe this is because only one of the parallel Kafka consumers sees the message that marks the end of the stream.

    public CustomSchema(Date _endTime) {
        endTime = _endTime;
    }

    @Override
    public boolean isEndOfStream(CustomTopicWrapper nextElement) {
        if (this.endTime != null && nextElement.messageTime.getTime() >= …
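
The diagnosis above matches how Flink's Kafka consumer behaves: each parallel subtask evaluates isEndOfStream on its own records only, so a single end-marker message lands in one partition and stops just the subtask reading it. A sketch of the same idea in Scala, comparing every record's own timestamp against the cutoff so each subtask can stop independently; the wrapper type and the "<epochMillis>|<payload>" wire format are made-up stand-ins for the question's real encoding:

    import java.util.Date

    import org.apache.flink.api.common.serialization.DeserializationSchema
    import org.apache.flink.api.common.typeinfo.TypeInformation

    case class CustomTopicWrapper(messageTime: Date, body: String)

    // Every parallel consumer applies the same cutoff, so each subtask can
    // terminate on its own once it sees a record at or past endTime.
    class CustomSchema(endTime: Date) extends DeserializationSchema[CustomTopicWrapper] {

      override def deserialize(message: Array[Byte]): CustomTopicWrapper = {
        val Array(ts, body) = new String(message, "UTF-8").split("\\|", 2)
        CustomTopicWrapper(new Date(ts.toLong), body)
      }

      override def isEndOfStream(nextElement: CustomTopicWrapper): Boolean =
        endTime != null && nextElement.messageTime.getTime >= endTime.getTime

      override def getProducedType: TypeInformation[CustomTopicWrapper] =
        TypeInformation.of(classOf[CustomTopicWrapper])
    }

A subtask whose partitions never receive a record at or past endTime will still run forever, which is exactly the symptom described in the question.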

Consume from two Flink DataStreams based on priority or in a round-robin way

Submitted by 怎甘沉沦 on 2020-06-28 08:39:49
Question: I have two Flink DataStreams, say dataStream1 and dataStream2. I want to union both streams into one so that I can process them with the same process functions, since the DAG of both streams is the same. As of now, I need equal priority of consumption for either stream. The producer of dataStream2 produces 10 messages per minute, while the producer of dataStream1 produces 1000 messages per second. The data types are the same for both streams. DataStream2 is more of a …
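
Since both streams share a type and a DAG, union is the direct way to merge them. A minimal sketch with in-memory stand-ins in place of the question's Kafka-backed sources:

    import org.apache.flink.streaming.api.scala._

    object UnionExample {
      case class Event(source: String, payload: String)

      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Stand-ins for the question's two streams of the same type.
        val dataStream1: DataStream[Event] = env.fromElements(Event("s1", "a"), Event("s1", "b"))
        val dataStream2: DataStream[Event] = env.fromElements(Event("s2", "x"))

        // union merges the streams with no priority or ordering guarantee
        // between inputs; both feed the same downstream operators.
        dataStream1.union(dataStream2)
          .map(e => s"${e.source}: ${e.payload}")
          .print()

        env.execute("union example")
      }
    }

union offers no priority between its inputs; actual prioritized consumption would need something like connect with a custom CoProcessFunction that buffers the low-priority side.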

How do I join two streams in Apache Flink?

Submitted by 自闭症网瘾萝莉.ら on 2020-06-28 02:11:01
Question: I am getting started with Flink and having a look at one of the official tutorials. To my understanding, the goal of this exercise is to join the two streams on the time attribute. Task: The result of this exercise is a data stream of Tuple2 records, one for each distinct rideId. You should ignore the END events, and only join the event for the START of each ride with its corresponding fare data. The resulting stream should be printed to standard out. Question: How is the EnrichmentFunction …
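
For reference, the pattern this training exercise drives at is a keyed connect with a RichCoFlatMapFunction that parks whichever event arrives first in keyed state until its partner shows up. A sketch with simplified stand-in types (the real exercise uses TaxiRide and TaxiFare):

    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
    import org.apache.flink.util.Collector

    case class Ride(rideId: Long, info: String)
    case class Fare(rideId: Long, amount: Double)

    // Holds the first event to arrive in keyed state until its partner
    // arrives, then emits the joined pair and clears the state.
    class EnrichmentFunction extends RichCoFlatMapFunction[Ride, Fare, (Ride, Fare)] {
      private var rideState: ValueState[Ride] = _
      private var fareState: ValueState[Fare] = _

      override def open(parameters: Configuration): Unit = {
        rideState = getRuntimeContext.getState(new ValueStateDescriptor("saved ride", classOf[Ride]))
        fareState = getRuntimeContext.getState(new ValueStateDescriptor("saved fare", classOf[Fare]))
      }

      override def flatMap1(ride: Ride, out: Collector[(Ride, Fare)]): Unit = {
        val fare = fareState.value()
        if (fare != null) { fareState.clear(); out.collect((ride, fare)) }
        else rideState.update(ride)
      }

      override def flatMap2(fare: Fare, out: Collector[(Ride, Fare)]): Unit = {
        val ride = rideState.value()
        if (ride != null) { rideState.clear(); out.collect((ride, fare)) }
        else fareState.update(fare)
      }
    }

    // Wiring: key both streams by rideId, connect them, apply the function:
    //   rides.keyBy(_.rideId).connect(fares.keyBy(_.rideId)).flatMap(new EnrichmentFunction).print()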

Apache Flink: Cannot find compatible factory for specified execution.target (=local)

Submitted by 喜你入骨 on 2020-06-27 20:47:51
Question: I've decided to experiment with Apache Flink a bit. I decided to use the Scala console (or more precisely http://ammonite.io/) to read some stuff from a CSV file and print it locally... just to debug and experiment.

    import $ivy.`org.apache.flink:flink-csv:1.10.0`
    import $ivy.`org.apache.flink::flink-scala:1.10.0`

    import org.apache.flink.api.scala._
    import org.apache.flink.api.scala.extensions._

    val env = ExecutionEnvironment.createLocalEnvironment()
    val lines = env.readCsvFile[(String, String, …
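
A likely cause, based on Flink 1.10's module layout, is that the executor factories which resolve execution.target live in flink-clients, and that artifact is simply not on the Ammonite session's classpath. A sketch of the same session with it added (the fix is an assumption, not confirmed by the excerpt):

    import $ivy.`org.apache.flink:flink-csv:1.10.0`
    import $ivy.`org.apache.flink::flink-scala:1.10.0`
    // Executor factories (including the local one) are discovered from
    // flink-clients; without it there is no factory for execution.target=local.
    import $ivy.`org.apache.flink::flink-clients:1.10.0`

    import org.apache.flink.api.scala._

    val env = ExecutionEnvironment.createLocalEnvironment()
    env.fromElements((1, "a"), (2, "b")).print()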

Simple hello world example for Flink

Submitted by 非 Y 不嫁゛ on 2020-06-27 06:06:13
Question: I am looking for the simplest possible example of a hello-world experience with Apache Flink. Assume I have just installed Flink on a clean box; what is the bare minimum I would need to do to 'make it do something'? I realize this is quite vague, so here are some examples. Three Python examples from the terminal:

    python -c "print('hello world')"
    python hello_world.py
    python -c "print(1+1)"

Of course a streaming application is a bit more complicated, but here is something similar that I …
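
For comparison, a minimal sketch of the Flink counterpart: a self-contained job running in a local, in-JVM environment, roughly the analogue of python -c "print('hello world')":

    import org.apache.flink.streaming.api.scala._

    // Smallest useful pipeline: source -> transformation -> sink.
    object HelloFlink {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.fromElements("hello", "world")
          .map(_.toUpperCase)
          .print()
        env.execute("hello flink")
      }
    }

Packaged into a jar, the same program can be submitted unchanged to a real cluster with flink run.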

Apache Flink: Could not extract key from ObjectNode::get

Submitted by 给你一囗甜甜゛ on 2020-06-17 15:50:26
Question: I'm using Flink to process data coming from some data source (such as Kafka, Pravega, etc.). In my case, the data source is Pravega, which provides a Flink connector. My data source is sending me some JSON data, as below:

    {"device":"rand-numeric","id":"b4728895-741f-466a-b87b-79c7590893b4","origin":"1591095418904441036","readings":[{"origin":"1591095418904328442","valueType":"Int64","name":"int","device":"rand-numeric","value":"0"}]}

Here is my piece of code:

    import org.apache.flink …
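
One way around this class of error is to key by a concrete value extracted from the JSON rather than by the JsonNode that ObjectNode::get returns: get yields null for a missing field, and Flink wraps any exception thrown inside the key selector in the "Could not extract key" error. A sketch assuming plain Jackson on the classpath and the sample payload above:

    import com.fasterxml.jackson.databind.ObjectMapper
    import org.apache.flink.streaming.api.scala._

    object KeyByJsonField {
      case class Reading(device: String, id: String)

      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Stand-in for the Pravega source: one record shaped like the sample.
        val sample =
          """{"device":"rand-numeric","id":"b4728895-741f-466a-b87b-79c7590893b4"}"""

        // Parse once into a plain case class and key by a String field; a
        // selector that dereferences a possibly-null JsonNode is a classic
        // source of the "Could not extract key" failure.
        val readings = env.fromElements(sample).map { s =>
          val node = new ObjectMapper().readTree(s)
          Reading(node.get("device").asText(), node.get("id").asText())
        }

        readings.keyBy(_.id).print()
        env.execute("key by json field")
      }
    }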