flink-streaming

Illegal reflective access by org.apache.flink.api.java.ClosureCleaner

杀马特。学长 韩版系。学妹 submitted on 2020-08-08 06:25:22
Question: When I run the SocketWindowWordCount program in Apache Flink, it prints the following warnings about illegal reflective access by org.apache.flink.api.java.ClosureCleaner:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/home/aman.srivastava/Downloads/flink-1.10.0/lib/flink-dist_2.11-1.10.0.jar) to field java.lang.String.value
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java
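
These warnings come from Flink's ClosureCleaner using reflection on Java 9 and later; they are warnings only and do not affect the job. A minimal sketch of a commonly suggested way to silence them, assuming you control conf/flink-conf.yaml (the exact set of --add-opens packages needed may vary):

```yaml
# Sketch only: env.java.opts passes extra JVM options to Flink's processes.
# Opening java.lang to unnamed modules allows the reflective access that
# ClosureCleaner performs, so the JDK no longer emits the warning.
env.java.opts: "--add-opens java.base/java.lang=ALL-UNNAMED"
```

Running the job on Java 8 also avoids the warning entirely.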

I want to write ORC files using Flink's Streaming File Sink, but it doesn’t write the files correctly

谁说我不能喝 submitted on 2020-07-20 03:48:09
Question: I am reading data from Kafka and trying to write it to the HDFS file system in ORC format. I followed the reference below from the official website, but Flink writes the exact same content for all data and produces many files, all of about 103 KB: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html#orc-format Please find my code below. object BeaconBatchIngest extends StreamingBase { val env: StreamExecutionEnvironment =
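
For reference, a minimal Scala sketch of the ORC bulk-writer wiring that the linked page describes (it needs the flink-orc dependency); Beacon, the schema and the output path are hypothetical stand-ins for the question's types. A frequent cause of every file holding the same rows is a Vectorizer that always writes into row 0 instead of the row at batch.size:

```scala
import java.nio.charset.StandardCharsets

import org.apache.flink.core.fs.Path
import org.apache.flink.orc.vector.Vectorizer
import org.apache.flink.orc.writer.OrcBulkWriterFactory
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
import org.apache.hadoop.hive.ql.exec.vector.{BytesColumnVector, VectorizedRowBatch}

// Hypothetical record type standing in for the question's Kafka payload.
case class Beacon(id: String)

// Each element must go into a fresh row (batch.size); writing every element into
// row 0 and never advancing the batch is a common cause of files that all contain
// the same content.
class BeaconVectorizer(schema: String) extends Vectorizer[Beacon](schema) {
  override def vectorize(element: Beacon, batch: VectorizedRowBatch): Unit = {
    val row = batch.size
    batch.size = row + 1
    batch.cols(0).asInstanceOf[BytesColumnVector]
      .setVal(row, element.id.getBytes(StandardCharsets.UTF_8))
  }
}

object OrcSinkSketch {
  def buildSink(): StreamingFileSink[Beacon] = {
    val writerFactory = new OrcBulkWriterFactory[Beacon](new BeaconVectorizer("struct<id:string>"))
    StreamingFileSink
      .forBulkFormat(new Path("hdfs:///tmp/beacons"), writerFactory) // hypothetical output path
      .build()
    // attach with: beaconStream.addSink(OrcSinkSketch.buildSink())
  }
}
```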

Consume from two Flink DataStreams based on priority or in a round-robin way

怎甘沉沦 submitted on 2020-06-28 08:39:49
Question: I have two Flink DataStreams, for example dataStream1 and dataStream2. I want to union both streams into one stream so that I can process them using the same process functions, since the DAG of both data streams is the same. As of now, I need equal priority of consumption of messages from either stream. The producer of dataStream2 produces 10 messages per minute, while the producer of dataStream1 produces 1000 messages per second. Also, the data types are the same for both DataStreams. DataStream2 is more of a
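
Because both streams carry the same element type, union is the usual way to feed them through one processing path; note that Flink does not let you impose a priority or a strict round-robin between the two inputs. A minimal sketch with a hypothetical Event type:

```scala
import org.apache.flink.streaming.api.scala._

// Hypothetical element type shared by both streams.
case class Event(key: String, payload: String)

object UnionSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Stand-ins for the real Kafka-backed sources in the question.
    val dataStream1: DataStream[Event] = env.fromElements(Event("a", "fast"))
    val dataStream2: DataStream[Event] = env.fromElements(Event("b", "slow"))

    // union merges the streams; downstream operators see one DataStream[Event],
    // but the interleaving of the two inputs is not configurable.
    val merged: DataStream[Event] = dataStream1.union(dataStream2)

    merged
      .keyBy(_.key)
      .map(e => e.payload.length) // same processing path for both inputs
      .print()

    env.execute("union sketch")
  }
}
```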

Apache Flink: Could not extract key from ObjectNode::get

给你一囗甜甜゛ submitted on 2020-06-17 15:50:26
Question: I'm using Flink to process data coming from some data source (such as Kafka, Pravega, etc.). In my case, the data source is Pravega, which provides a Flink connector. My data source is sending me some JSON data as below: {"device":"rand-numeric","id":"b4728895-741f-466a-b87b-79c7590893b4","origin":"1591095418904441036","readings":[{"origin":"1591095418904328442","valueType":"Int64","name":"int","device":"rand-numeric","value":"0"}]} Here is my piece of code: import org.apache.flink
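
The "Could not extract key" failure typically means the key selector hands Flink something it cannot key by, such as a possibly-null JsonNode returned by ObjectNode::get. A minimal sketch that keys by a plain String instead (the "id" field comes from the sample JSON; everything else is a hypothetical stand-in for the Pravega source):

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.node.ObjectNode
import org.apache.flink.streaming.api.scala._

object KeyByIdSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Stand-in for the Pravega source: raw JSON strings parsed into ObjectNode.
    val mapper = new ObjectMapper()
    val raw: DataStream[String] = env.fromElements(
      """{"device":"rand-numeric","id":"b4728895-741f-466a-b87b-79c7590893b4"}""")
    val json: DataStream[ObjectNode] = raw.map(s => mapper.readTree(s).asInstanceOf[ObjectNode])

    // Key by a plain String; guarding against a missing "id" avoids null keys,
    // which is one common trigger of "Could not extract key".
    json
      .keyBy(node => Option(node.get("id")).map(_.asText).getOrElse("missing-id"))
      .map(node => node.get("device").asText)
      .print()

    env.execute("keyBy sketch")
  }
}
```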

The implementation of the provided ElasticsearchSinkFunction is not serializable (flink-connector-elasticsearch6_2.11)

◇◆丶佛笑我妖孽 submitted on 2020-06-17 14:08:20
问题 "non-serializable" error occurs when I follow flink document to write data via flink streaming. I use flink1.6,Elastic-Search-6.4 and flink-connector-elasticsearch6. My code is like @Test public void testStringInsert() throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); env.enableCheckpointing(100); // DataStreamSource<String> input = env.fromCollection(Collections.singleton(

How to implement a Group Window Function as an “Over Partition By” in Flink SQL?

别等时光非礼了梦想. submitted on 2020-06-17 13:25:10
Question: I'm trying to use time windows with Flink SQL. It has been hard for me to get familiar with the framework, but I have already defined my StreamExecutionEnvironment, StreamTableEnvironment and FlinkKafkaConsumer, and then I apply a SQL query and group by time windows as follows. val stream = env.addSource(new FlinkKafkaConsumer[String]("flink", new SimpleStringSchema(), properties) ) val parsed: DataStream[Order] = stream.map(x=> .... //then I register a DataStream as a table, (Flink Version: 9.3) tEnv
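
For comparison, a sketch of a group-window query, continuing the tEnv from the question and assuming a hypothetical registered table Orders with columns user_id, order_id and a rowtime attribute ts; GROUP BY with TUMBLE is the group-window counterpart to a time-bounded OVER (PARTITION BY ...) aggregate:

```scala
// Assumes the StreamTableEnvironment `tEnv` from the question, and a registered table
// "Orders" with hypothetical columns (user_id, order_id) and rowtime attribute `ts`.
val windowed = tEnv.sqlQuery(
  """
    |SELECT
    |  user_id,
    |  COUNT(order_id) AS cnt,
    |  TUMBLE_END(ts, INTERVAL '1' MINUTE) AS window_end
    |FROM Orders
    |GROUP BY user_id, TUMBLE(ts, INTERVAL '1' MINUTE)
    |""".stripMargin)
```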

How to store checkpoint into remote RocksDB in Apache Flink

人走茶凉 submitted on 2020-06-17 09:07:07
Question: I know that there are three kinds of state backends in Apache Flink: MemoryStateBackend, FsStateBackend and RocksDBStateBackend. MemoryStateBackend stores the checkpoints in local RAM, FsStateBackend stores the checkpoints in the local file system, and RocksDBStateBackend stores the checkpoints in RocksDB. I have some questions about the RocksDBStateBackend. To my understanding, the mechanism of RocksDBStateBackend is embedded in Apache Flink. RocksDB is a kind of key-value DB.
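
One point worth making explicit: with RocksDBStateBackend the RocksDB instance is always local and embedded in each TaskManager; only the checkpoint location passed to the backend can be remote. A minimal sketch (requires the flink-statebackend-rocksdb dependency; the HDFS URI is hypothetical):

```scala
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

object RocksDbCheckpointSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Working state lives in the embedded, local RocksDB instance on each TaskManager;
    // checkpoints of that state are written to the (possibly remote) URI given here.
    // The second argument enables incremental checkpoints.
    val backend = new RocksDBStateBackend("hdfs://namenode:8020/flink/checkpoints", true)
    env.setStateBackend(backend)
    env.enableCheckpointing(60000)

    // ... add sources, transformations and sinks here before env.execute()
  }
}
```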

Apache Flink: ProcessWindowFunction KeyBy() multiple values

荒凉一梦 submitted on 2020-06-07 06:08:26
Question: I'm trying to use a WindowFunction with a DataStream; my goal is to have a query like the following: SELECT *, count(id) OVER(PARTITION BY country) AS c_country, count(id) OVER(PARTITION BY city) AS c_city, count(id) OVER(PARTITION BY city) AS c_addrs FROM fm ORDER BY country have helped me for the aggregation by the country field, but I need to do the aggregation by two fields in the same time window. I don't know if it is possible to have two or more keys in keyBy() for this case. val parsed =
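
keyBy can take a composite key, so a single window can be keyed by two fields at once; counts per individual field, as in the SQL above, would instead need separate keyBy/window pipelines. A minimal sketch with hypothetical field names:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

// Hypothetical record type for the parsed stream in the question.
case class Fm(id: String, country: String, city: String)

// Counts the records per (country, city) key inside each window.
class CountPerKey extends ProcessWindowFunction[Fm, (String, String, Long), (String, String), TimeWindow] {
  override def process(key: (String, String), context: Context,
                       elements: Iterable[Fm], out: Collector[(String, String, Long)]): Unit =
    out.collect((key._1, key._2, elements.size.toLong))
}

object CompositeKeySketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val parsed: DataStream[Fm] = env.fromElements(Fm("1", "AR", "Buenos Aires"))

    // A tuple key lets one windowed aggregation group by country and city together.
    parsed
      .keyBy(r => (r.country, r.city))
      .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
      .process(new CountPerKey)
      .print()

    env.execute("composite key sketch")
  }
}
```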