apache-flink

The implementation of the provided ElasticsearchSinkFunction is not serializable (flink-connector-elasticsearch6_2.11)

◇◆丶佛笑我妖孽 submitted on 2020-06-17 14:08:20

Question: A "non-serializable" error occurs when I follow the Flink documentation to write data via Flink streaming. I use Flink 1.6, Elasticsearch 6.4, and flink-connector-elasticsearch6. My code looks like this:

    @Test
    public void testStringInsert() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        env.enableCheckpointing(100);
        // DataStreamSource<String> input = env.fromCollection(Collections.singleton(…

How to implement a Group Window Function for an "OVER PARTITION BY" in Flink SQL?

别等时光非礼了梦想. submitted on 2020-06-17 13:25:10

Question: I'm trying to use time windows with Flink SQL. It has been hard for me to get familiar with the framework, but I have already defined my StreamExecutionEnvironment, StreamTableEnvironment, and FlinkKafkaConsumer, and then apply the SQL query and group by time windows as follows:

    val stream = env.addSource(
      new FlinkKafkaConsumer[String]("flink", new SimpleStringSchema(), properties)
    )
    val parsed: DataStream[Order] = stream.map(x => ....
    // then I register a DataStream as a table, (Flink Version: 9.3)
    tEnv…
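
For reference, a group window behaves differently from OVER (PARTITION BY ...): it emits one row per key and window rather than one row per input row. Here is a minimal Java sketch of a tumbling group window in the pre-Blink SQL dialect, assuming a registered table Orders with an event-time attribute rowtime and a user_id column (all of these names are hypothetical):

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    public class GroupWindowExample {
        // tEnv is the StreamTableEnvironment from the question; the Orders
        // table, its rowtime attribute, and user_id are assumed names.
        public static Table countPerUserPerMinute(StreamTableEnvironment tEnv) {
            // One result row per user_id per one-minute window.
            return tEnv.sqlQuery(
                "SELECT user_id, COUNT(*) AS cnt, " +
                "       TUMBLE_START(rowtime, INTERVAL '1' MINUTE) AS w_start " +
                "FROM Orders " +
                "GROUP BY user_id, TUMBLE(rowtime, INTERVAL '1' MINUTE)");
        }
    }

HOP and SESSION windows follow the same GROUP BY pattern.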

Flink start Scala shell - NumberFormatException

醉酒当歌 submitted on 2020-06-17 09:09:21

Question: How can I start a Flink interactive (Scala) shell? Preferably using Scala 2.12; however, it looks like only 2.11 is working for now. Anyway, when using 2.11, i.e. downloading https://www.apache.org/dyn/closer.lua/flink/flink-1.10.1/flink-1.10.1-bin-scala_2.11.tgz, unzipping, and executing ./bin/start-scala-shell.sh local, I get the following error:

    [ERROR] Failed to construct terminal; falling back to unsupported
    java.lang.NumberFormatException: For input string: "0x100"
        at java.lang…

How to store checkpoints in a remote RocksDB in Apache Flink

人走茶凉 submitted on 2020-06-17 09:07:07

Question: I know that there are three kinds of state backends in Apache Flink: MemoryStateBackend, FsStateBackend, and RocksDBStateBackend. MemoryStateBackend stores checkpoints in local RAM, FsStateBackend stores checkpoints in the local file system, and RocksDBStateBackend stores checkpoints in RocksDB. I have some questions about the RocksDBStateBackend. As I understand it, the mechanism of the RocksDBStateBackend is embedded into Apache Flink. RocksDB is a kind of key-value DB. …
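
For context: with the RocksDBStateBackend, RocksDB itself is embedded and local. Each TaskManager keeps its working state in RocksDB instances on its own disk, and only checkpoint snapshots are written to the remote, durable path you configure. A minimal sketch, with a hypothetical HDFS URI:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RocksDbCheckpointJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // State lives in embedded RocksDB instances on each TaskManager's
            // local disk; checkpoints of that state go to the path given here.
            RocksDBStateBackend backend =
                new RocksDBStateBackend("hdfs://namenode:8020/flink/checkpoints", true); // incremental
            env.setStateBackend(backend);
            env.enableCheckpointing(60_000);
            // ... build and execute the job
        }
    }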

Is one TaskManager with three slots the same as three TaskManagers with one slot in Apache Flink

旧城冷巷雨未停 submitted on 2020-06-17 02:53:25

Question: In Flink, as I understand it, the JobManager can assign a job to multiple TaskManagers with multiple slots if necessary. For example, one job can be assigned to three TaskManagers, using five slots. Now, say that I run one TaskManager (TM) with three slots, assigned 3 GB of RAM and one CPU. Is this exactly the same as running three TaskManagers that share one CPU, each assigned 1 GB of RAM?

    case 1
    ---------------
    | 3G RAM      |
    | one CPU     |
    | three slots |
    | TM          |
    -------------…

Create FlinkSQL UDF with generic return type

假装没事ソ submitted on 2020-06-16 03:53:39

Question: I would like to define a function MAX_BY that takes a value of type T and an ordering parameter of type Number, and returns the max element of the window according to the ordering (of type T). I've tried:

    public class MaxBy<T> extends AggregateFunction<T, Tuple2<T, Number>> {
        @Override
        public T getValue(Tuple2<T, Number> tuple) {
            return tuple.f0;
        }

        @Override
        public Tuple2<T, Number> createAccumulator() {
            return Tuple2.of(null, 0L);
        }

        public void accumulate(Tuple2<T, Number> acc, T value, Number order) {
            if…
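
The usual obstacle here is that Flink's type extraction cannot infer a generic result type such as T for a table AggregateFunction. A common workaround is to supply the type information explicitly by overriding getResultType() and getAccumulatorType(). The sketch below assumes the caller passes in the TypeInformation for T at construction time; the constructor parameter is an addition for illustration, not part of the original code.

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.table.functions.AggregateFunction;

    public class MaxBy<T> extends AggregateFunction<T, Tuple2<T, Number>> {

        private final TypeInformation<T> valueType; // supplied by the caller

        public MaxBy(TypeInformation<T> valueType) {
            this.valueType = valueType;
        }

        @Override
        public T getValue(Tuple2<T, Number> acc) {
            return acc.f0;
        }

        @Override
        public Tuple2<T, Number> createAccumulator() {
            return Tuple2.of(null, null);
        }

        public void accumulate(Tuple2<T, Number> acc, T value, Number order) {
            // keep the value with the largest ordering seen so far
            if (acc.f1 == null || order.doubleValue() > acc.f1.doubleValue()) {
                acc.f0 = value;
                acc.f1 = order;
            }
        }

        @Override
        public TypeInformation<T> getResultType() {
            return valueType;
        }

        @Override
        public TypeInformation<Tuple2<T, Number>> getAccumulatorType() {
            return Types.TUPLE(valueType, TypeInformation.of(Number.class));
        }
    }

An instance would then be registered per concrete type, e.g. new MaxBy<>(Types.STRING).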

Apache Flink: java.lang.NoClassDefFoundError

ⅰ亾dé卋堺 submitted on 2020-06-15 04:25:07

Question: I'm trying to follow this example, but when I try to compile it, I get this error:

    Error: Unable to initialize main class com.amazonaws.services.kinesisanalytics.aws
    Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction

The error is due to this code:

    private static DataStream<String> createSourceFromStaticConfig(StreamExecutionEnvironment env) {
        Properties inputProperties = new Properties();
        inputProperties.setProperty(ConsumerConfigConstants…

MongoDB as datasource to Flink

冷暖自知 submitted on 2020-06-11 08:43:05

Question: Can MongoDB be used as a data source for Apache Flink to process streaming data? What is the native way in Apache Flink to use a NoSQL database as a data source?

Answer 1: Currently, Flink does not have a dedicated connector to read from MongoDB. What you can do is the following: use StreamExecutionEnvironment.createInput and provide a Hadoop input format for MongoDB using Flink's wrapper input format, or implement your own MongoDB source by implementing SourceFunction/…
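
As a rough illustration of the second option, here is a minimal, non-parallel source that emits each document of a collection as a JSON string and then finishes. It assumes the MongoDB sync driver is on the classpath; the connection URI, database, and collection names are hypothetical, and a production version would need checkpointing and fault tolerance.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
    import org.bson.Document;

    // Emits every document of one collection as JSON, then finishes.
    public class MongoSource extends RichSourceFunction<String> {

        private transient MongoClient client;  // not serializable: create in open()
        private volatile boolean running = true;

        @Override
        public void open(Configuration parameters) {
            client = MongoClients.create("mongodb://localhost:27017"); // hypothetical URI
        }

        @Override
        public void run(SourceContext<String> ctx) {
            for (Document doc : client.getDatabase("mydb")      // hypothetical db
                                      .getCollection("events")  // hypothetical collection
                                      .find()) {
                if (!running) {
                    break;
                }
                ctx.collect(doc.toJson());
            }
        }

        @Override
        public void cancel() {
            running = false;
        }

        @Override
        public void close() {
            if (client != null) {
                client.close();
            }
        }
    }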

Apache Flink: ProcessWindowFunction KeyBy() multiple values

荒凉一梦 submitted on 2020-06-07 06:08:26

Question: I'm trying to use a WindowFunction with a DataStream; my goal is to have a query like the following:

    SELECT *,
           count(id) OVER(PARTITION BY country) AS c_country,
           count(id) OVER(PARTITION BY city) AS c_city,
           count(id) OVER(PARTITION BY city) AS c_addrs
    FROM fm
    ORDER BY country

I have already been helped with the aggregation by the country field, but I need to do the aggregation by two fields in the same time window. I don't know if it is possible to have two or more keys in keyBy() for this case.

    val parsed =…
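
keyBy() takes a single key, but that key may be composite: a KeySelector can return a tuple of several fields so that records are partitioned by the combination. A minimal Java sketch, with a hypothetical Event type standing in for the asker's parsed records:

    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.KeyedStream;

    public class CompositeKeyExample {
        // Stand-in for the asker's parsed record type.
        public static class Event {
            public String country;
            public String city;
        }

        // Partition by the (country, city) combination with a single keyBy().
        public static KeyedStream<Event, Tuple2<String, String>> keyByCountryAndCity(
                DataStream<Event> parsed) {
            return parsed.keyBy(new KeySelector<Event, Tuple2<String, String>>() {
                @Override
                public Tuple2<String, String> getKey(Event e) {
                    return Tuple2.of(e.country, e.city);
                }
            });
        }
    }

Note that a composite key aggregates per (country, city) pair; independent per-country and per-city counts, as in the quoted query, would require separate keyed streams or a window function that computes the nested aggregates itself.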