flink-streaming

Illegal reflective access by org.apache.flink.api.java.ClosureCleaner

杀马特。学长 韩版系。学妹 submitted on 2020-08-08 06:25:22
Question: When I run the SocketWindowWordCount program in Apache Flink, it prints the following warnings about illegal reflective access by org.apache.flink.api.java.ClosureCleaner:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/home/aman.srivastava/Downloads/flink-1.10.0/lib/flink-dist_2.11-1.10.0.jar) to field java.lang.String.value
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java
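
These warnings come from Flink's ClosureCleaner using reflection on Java 9 and later; they are warnings only and do not affect the job. A minimal sketch of a commonly suggested way to silence them, assuming you control conf/flink-conf.yaml (the exact set of --add-opens packages needed may vary):

```yaml
# Sketch only: env.java.opts passes extra JVM options to Flink's processes.
# Opening java.lang to unnamed modules allows the reflective access that
# ClosureCleaner performs, so the JDK no longer emits the warning.
env.java.opts: "--add-opens java.base/java.lang=ALL-UNNAMED"
```

Running the job on Java 8 also avoids the warning entirely.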

I want to write ORC files using Flink's Streaming File Sink, but it doesn’t write the files correctly

谁说我不能喝 submitted on 2020-07-20 03:48:09
Question: I am reading data from Kafka and trying to write it to the HDFS file system in ORC format. I followed the reference below from the official website, but Flink writes the exact same content for all data and produces many files, all of about 103 KB: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html#orc-format Please find my code below. object BeaconBatchIngest extends StreamingBase { val env: StreamExecutionEnvironment =
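
For reference, a minimal Scala sketch of the ORC bulk-writer wiring that the linked page describes (it needs the flink-orc dependency); Beacon, the schema and the output path are hypothetical stand-ins for the question's types. A frequent cause of every file holding the same rows is a Vectorizer that always writes into row 0 instead of the row at batch.size:

```scala
import java.nio.charset.StandardCharsets

import org.apache.flink.core.fs.Path
import org.apache.flink.orc.vector.Vectorizer
import org.apache.flink.orc.writer.OrcBulkWriterFactory
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
import org.apache.hadoop.hive.ql.exec.vector.{BytesColumnVector, VectorizedRowBatch}

// Hypothetical record type standing in for the question's Kafka payload.
case class Beacon(id: String)

// Each element must go into a fresh row (batch.size); writing every element into
// row 0 and never advancing the batch is a common cause of files that all contain
// the same content.
class BeaconVectorizer(schema: String) extends Vectorizer[Beacon](schema) {
  override def vectorize(element: Beacon, batch: VectorizedRowBatch): Unit = {
    val row = batch.size
    batch.size = row + 1
    batch.cols(0).asInstanceOf[BytesColumnVector]
      .setVal(row, element.id.getBytes(StandardCharsets.UTF_8))
  }
}

object OrcSinkSketch {
  def buildSink(): StreamingFileSink[Beacon] = {
    val writerFactory = new OrcBulkWriterFactory[Beacon](new BeaconVectorizer("struct<id:string>"))
    StreamingFileSink
      .forBulkFormat(new Path("hdfs:///tmp/beacons"), writerFactory) // hypothetical output path
      .build()
    // attach with: beaconStream.addSink(OrcSinkSketch.buildSink())
  }
}
```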

Consume from two Flink DataStreams based on priority or in a round-robin way

怎甘沉沦 submitted on 2020-06-28 08:39:49
Question: I have two Flink DataStreams, for example dataStream1 and dataStream2. I want to union both streams into one stream so that I can process them using the same process functions, since the DAG of both data streams is the same. As of now, I need equal priority of consumption of messages from either stream. The producer of dataStream2 produces 10 messages per minute, while the producer of dataStream1 produces 1000 messages per second. Also, the data types are the same for both DataStreams. DataStream2 is more of a
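
Because both streams carry the same element type, union is the usual way to feed them through one processing path; note that Flink does not let you impose a priority or a strict round-robin between the two inputs. A minimal sketch with a hypothetical Event type:

```scala
import org.apache.flink.streaming.api.scala._

// Hypothetical element type shared by both streams.
case class Event(key: String, payload: String)

object UnionSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Stand-ins for the real Kafka-backed sources in the question.
    val dataStream1: DataStream[Event] = env.fromElements(Event("a", "fast"))
    val dataStream2: DataStream[Event] = env.fromElements(Event("b", "slow"))

    // union merges the streams; downstream operators see one DataStream[Event],
    // but the interleaving of the two inputs is not configurable.
    val merged: DataStream[Event] = dataStream1.union(dataStream2)

    merged
      .keyBy(_.key)
      .map(e => e.payload.length) // same processing path for both inputs
      .print()

    env.execute("union sketch")
  }
}
```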

Apache Flink: Could not extract key from ObjectNode::get

给你一囗甜甜゛ submitted on 2020-06-17 15:50:26
Question: I'm using Flink to process data coming from some data source (such as Kafka, Pravega, etc.). In my case, the data source is Pravega, which provides a Flink connector. My data source is sending me some JSON data as below: {"device":"rand-numeric","id":"b4728895-741f-466a-b87b-79c7590893b4","origin":"1591095418904441036","readings":[{"origin":"1591095418904328442","valueType":"Int64","name":"int","device":"rand-numeric","value":"0"}]} Here is my piece of code: import org.apache.flink
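
The "Could not extract key" failure typically means the key selector hands Flink something it cannot key by, such as a possibly-null JsonNode returned by ObjectNode::get. A minimal sketch that keys by a plain String instead (the "id" field comes from the sample JSON; everything else is a hypothetical stand-in for the Pravega source):

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.node.ObjectNode
import org.apache.flink.streaming.api.scala._

object KeyByIdSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Stand-in for the Pravega source: raw JSON strings parsed into ObjectNode.
    val mapper = new ObjectMapper()
    val raw: DataStream[String] = env.fromElements(
      """{"device":"rand-numeric","id":"b4728895-741f-466a-b87b-79c7590893b4"}""")
    val json: DataStream[ObjectNode] = raw.map(s => mapper.readTree(s).asInstanceOf[ObjectNode])

    // Key by a plain String; guarding against a missing "id" avoids null keys,
    // which is one common trigger of "Could not extract key".
    json
      .keyBy(node => Option(node.get("id")).map(_.asText).getOrElse("missing-id"))
      .map(node => node.get("device").asText)
      .print()

    env.execute("keyBy sketch")
  }
}
```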

The implementation of the provided ElasticsearchSinkFunction is not serializable (flink-connector-elasticsearch6_2.11)

◇◆丶佛笑我妖孽 submitted on 2020-06-17 14:08:20
问题 "non-serializable" error occurs when I follow flink document to write data via flink streaming. I use flink1.6,Elastic-Search-6.4 and flink-connector-elasticsearch6. My code is like @Test public void testStringInsert() throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); env.enableCheckpointing(100); // DataStreamSource<String> input = env.fromCollection(Collections.singleton(

How to implement a Group Window Function as an “Over Partition By” in Flink SQL?

别等时光非礼了梦想. submitted on 2020-06-17 13:25:10
Question: I'm trying to use time windows with Flink SQL. It has been hard for me to get familiar with the framework, but I have already defined my StreamExecutionEnvironment, StreamTableEnvironment and FlinkKafkaConsumer, and then I apply a SQL query and group by time windows as follows. val stream = env.addSource(new FlinkKafkaConsumer[String]("flink", new SimpleStringSchema(), properties) ) val parsed: DataStream[Order] = stream.map(x=> .... //then I register a DataStream as a table, (Flink Version: 9.3) tEnv
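
For comparison, a sketch of a group-window query, continuing the tEnv from the question and assuming a hypothetical registered table Orders with columns user_id, order_id and a rowtime attribute ts; GROUP BY with TUMBLE is the group-window counterpart to a time-bounded OVER (PARTITION BY ...) aggregate:

```scala
// Assumes the StreamTableEnvironment `tEnv` from the question, and a registered table
// "Orders" with hypothetical columns (user_id, order_id) and rowtime attribute `ts`.
val windowed = tEnv.sqlQuery(
  """
    |SELECT
    |  user_id,
    |  COUNT(order_id) AS cnt,
    |  TUMBLE_END(ts, INTERVAL '1' MINUTE) AS window_end
    |FROM Orders
    |GROUP BY user_id, TUMBLE(ts, INTERVAL '1' MINUTE)
    |""".stripMargin)
```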

How to store checkpoint into remote RocksDB in Apache Flink

人走茶凉 submitted on 2020-06-17 09:07:07
Question: I know that there are three kinds of state backends in Apache Flink: MemoryStateBackend, FsStateBackend and RocksDBStateBackend. MemoryStateBackend stores the checkpoints in local RAM, FsStateBackend stores the checkpoints in the local file system, and RocksDBStateBackend stores the checkpoints in RocksDB. I have some questions about the RocksDBStateBackend. To my understanding, the mechanism of RocksDBStateBackend is embedded in Apache Flink. RocksDB is a kind of key-value DB.
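
One point worth making explicit: with RocksDBStateBackend the RocksDB instance is always local and embedded in each TaskManager; only the checkpoint location passed to the backend can be remote. A minimal sketch (requires the flink-statebackend-rocksdb dependency; the HDFS URI is hypothetical):

```scala
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

object RocksDbCheckpointSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Working state lives in the embedded, local RocksDB instance on each TaskManager;
    // checkpoints of that state are written to the (possibly remote) URI given here.
    // The second argument enables incremental checkpoints.
    val backend = new RocksDBStateBackend("hdfs://namenode:8020/flink/checkpoints", true)
    env.setStateBackend(backend)
    env.enableCheckpointing(60000)

    // ... add sources, transformations and sinks here before env.execute()
  }
}
```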

Apache Flink: ProcessWindowFunction KeyBy() multiple values

荒凉一梦 submitted on 2020-06-07 06:08:26
Question: I'm trying to use a WindowFunction with a DataStream; my goal is to have a query like the following: SELECT *, count(id) OVER(PARTITION BY country) AS c_country, count(id) OVER(PARTITION BY city) AS c_city, count(id) OVER(PARTITION BY city) AS c_addrs FROM fm ORDER BY country have helped me for the aggregation by the country field, but I need to do the aggregation by two fields in the same time window. I don't know if it is possible to have two or more keys in keyBy() for this case. val parsed =
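
keyBy can take a composite key, so a single window can be keyed by two fields at once; counts per individual field, as in the SQL above, would instead need separate keyBy/window pipelines. A minimal sketch with hypothetical field names:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

// Hypothetical record type for the parsed stream in the question.
case class Fm(id: String, country: String, city: String)

// Counts the records per (country, city) key inside each window.
class CountPerKey extends ProcessWindowFunction[Fm, (String, String, Long), (String, String), TimeWindow] {
  override def process(key: (String, String), context: Context,
                       elements: Iterable[Fm], out: Collector[(String, String, Long)]): Unit =
    out.collect((key._1, key._2, elements.size.toLong))
}

object CompositeKeySketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val parsed: DataStream[Fm] = env.fromElements(Fm("1", "AR", "Buenos Aires"))

    // A tuple key lets one windowed aggregation group by country and city together.
    parsed
      .keyBy(r => (r.country, r.city))
      .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
      .process(new CountPerKey)
      .print()

    env.execute("composite key sketch")
  }
}
```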