flink-sql

flink count distinct issue

怎甘沉沦 submitted on 2021-02-11 14:26:39
Question: We currently use a tumbling window to count distinct values. The issue is that if we extend the tumbling window from a day to a month, we can no longer get the distinct count as of the current moment: with a 1-month tumbling window, the number is only produced for each full month starting on the 1st. How can I get the current distinct count as of now (now is Mar 9)?

    package flink.trigger;
    import org.apache.flink.api.common.state.ReducingState;
    import org.apache.flink.api.common.state.ReducingStateDescriptor;
    import org
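One way to get an up-to-date count before a long tumbling window closes is to attach an early-firing trigger to the window. Below is a minimal sketch, not the asker's truncated code, assuming events arrive as Tuple2<key, value> with timestamps and watermarks already assigned upstream; class and field names are placeholders.

    import java.util.HashSet;
    import java.util.Set;
    import org.apache.flink.api.common.functions.AggregateFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.triggers.ContinuousProcessingTimeTrigger;

    public class EarlyFiringDistinctCount {

        // Distinct count per key: collect values into a set (fine for moderate cardinality).
        public static class DistinctCount
                implements AggregateFunction<Tuple2<String, String>, Set<String>, Long> {
            @Override public Set<String> createAccumulator() { return new HashSet<>(); }
            @Override public Set<String> add(Tuple2<String, String> value, Set<String> acc) {
                acc.add(value.f1);
                return acc;
            }
            @Override public Long getResult(Set<String> acc) { return (long) acc.size(); }
            @Override public Set<String> merge(Set<String> a, Set<String> b) { a.addAll(b); return a; }
        }

        public static DataStream<Long> distinctCountPerMonth(DataStream<Tuple2<String, String>> events) {
            return events
                .keyBy(e -> e.f0)                                    // e.g. a user or tenant id
                .window(TumblingEventTimeWindows.of(Time.days(30)))  // long "monthly" window (approximation)
                // emit an intermediate result every minute instead of waiting for the window to close
                .trigger(ContinuousProcessingTimeTrigger.of(Time.minutes(1)))
                .aggregate(new DistinctCount());
        }
    }

Each early firing reports the distinct count accumulated so far in the current window, while the firing at the end of the window still produces the full monthly count.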

How to check DataStream in flink is empty or having data

╄→гoц情女王★ submitted on 2021-01-29 10:33:01
Question: I am new to Apache Flink. I have a DataStream that goes through a process function: if certain conditions are met the record is valid, and if not, I write it to a side output. I am able to print the DataStream; is it possible to check whether the DataStream is empty or null? I tried the datastream.equals(null) method but it is not working. Please suggest how to know whether a DataStream is empty or not. Answer 1: By "empty", I assume you mean that no data is flowing. What are
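As a sketch of the setup described above, assuming plain string records and a placeholder validation rule: invalid records go to a side output, and a processing-time timer reports when nothing has arrived for a while, which is one way to observe that a stream is currently "empty" in the sense of no data flowing.

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    public class ValidationJob {

        static final OutputTag<String> INVALID = new OutputTag<String>("invalid") {};

        public static class Validate extends KeyedProcessFunction<Integer, String, String> {
            private static final long IDLE_MS = 60_000;       // report if nothing arrives for 1 minute
            private transient ValueState<Long> pendingTimer;   // the currently armed idleness timer

            @Override
            public void open(Configuration parameters) {
                pendingTimer = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("pendingTimer", Long.class));
            }

            @Override
            public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
                if (!value.isEmpty()) {                        // placeholder validation rule
                    out.collect(value);                        // valid records continue downstream
                } else {
                    ctx.output(INVALID, value);                // invalid records go to the side output
                }
                // re-arm the idleness timer: remove the previous one, schedule a new one IDLE_MS from now
                Long previous = pendingTimer.value();
                if (previous != null) {
                    ctx.timerService().deleteProcessingTimeTimer(previous);
                }
                long next = ctx.timerService().currentProcessingTime() + IDLE_MS;
                ctx.timerService().registerProcessingTimeTimer(next);
                pendingTimer.update(next);
            }

            @Override
            public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
                // fires only if no element re-armed the timer for IDLE_MS: no data is flowing right now
                System.out.println("No data has arrived in the last minute");
            }
        }

        public static void wire(DataStream<String> input) {
            // constant key so a single timer watches the whole stream (this operator runs with parallelism 1)
            SingleOutputStreamOperator<String> valid = input.keyBy(v -> 0).process(new Validate());
            DataStream<String> invalid = valid.getSideOutput(INVALID);  // the rejected records
            invalid.print();
        }
    }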

Flink Table API & SQL and map types (Scala)

怎甘沉沦 submitted on 2021-01-28 12:42:23
Question: I am using Flink's Table API and/or Flink's SQL support (Flink 1.3.1, Scala 2.11) in a streaming environment. I'm starting with a DataStream[Person], and Person is a case class that looks like:

    Person(name: String, age: Int, attributes: Map[String, String])

All is working as expected until I start to bring attributes into the picture. For example:

    val result = streamTableEnvironment.sql(
      """
        |SELECT
        |name,
        |attributes['foo'],
        |TUMBLE_START(rowtime, INTERVAL '1' MINUTE)
        |FROM myTable
        |GROUP
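For illustration, here is a minimal, self-contained sketch of reading individual entries out of a MAP column with the ['key'] accessor in SQL. It uses a newer Java Table API than the Flink 1.3.1 setup in the question and leaves out the windowing; the table and column values are assumptions.

    import static org.apache.flink.table.api.Expressions.map;
    import static org.apache.flink.table.api.Expressions.row;

    import org.apache.flink.table.api.DataTypes;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;

    public class MapColumnQuery {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // In-memory rows with a MAP<STRING, STRING> column standing in for Person.attributes
            Table people = tEnv.fromValues(
                DataTypes.ROW(
                    DataTypes.FIELD("name", DataTypes.STRING()),
                    DataTypes.FIELD("attributes", DataTypes.MAP(DataTypes.STRING(), DataTypes.STRING()))),
                row("alice", map("foo", "1", "bar", "2")),
                row("bob", map("foo", "3")));
            tEnv.createTemporaryView("people", people);

            // Pick a single entry out of the map column with the ['key'] accessor
            Table result = tEnv.sqlQuery("SELECT name, attributes['foo'] AS foo FROM people");
            result.execute().print();
        }
    }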

Flink SQL Client connect to non local cluster

拜拜、爱过 submitted on 2021-01-28 11:51:04
Question: Is it possible to connect the Flink SQL Client to a remote cluster? I assume the client uses some configuration to determine the job manager address, but I don't see it mentioned in the docs. Answer 1: Yes, that's possible. You can configure the connection to a remote cluster in the conf/flink-conf.yaml file:

    jobmanager.rpc.address: localhost
    jobmanager.rpc.port: 6123

Source: https://stackoverflow.com/questions/61623836/flink-sql-client-connect-to-non-local-cluster
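For a non-local cluster, the same two keys point at the remote job manager instead of localhost; the hostname below is only an example value:

    # conf/flink-conf.yaml picked up by the SQL Client (example host)
    jobmanager.rpc.address: flink-jobmanager.example.com
    jobmanager.rpc.port: 6123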

Create FlinkSQL UDF with generic return type

假装没事ソ submitted on 2020-06-16 03:53:39
Question: I would like to define a function MAX_BY that takes a value of type T and an ordering parameter of type Number, and returns the max element of the window according to the ordering (of type T). I've tried:

    public class MaxBy<T> extends AggregateFunction<T, Tuple2<T, Number>> {
        @Override
        public T getValue(Tuple2<T, Number> tuple) {
            return tuple.f0;
        }

        @Override
        public Tuple2<T, Number> createAccumulator() {
            return Tuple2.of(null, 0L);
        }

        public void accumulate(Tuple2<T, Number> acc, T value, Number order) {
            if
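The snippet above is cut off, so the following is only a hedged completion sketch, not the asker's actual code or the accepted fix: it fills in one plausible accumulate() body and pins the generic parameter in a concrete subclass that reports its types through getResultType()/getAccumulatorType(), a common workaround when the planner cannot resolve a generic result type.

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.table.functions.AggregateFunction;

    public class MaxBy<T> extends AggregateFunction<T, Tuple2<T, Number>> {

        @Override
        public T getValue(Tuple2<T, Number> acc) {
            return acc.f0;
        }

        @Override
        public Tuple2<T, Number> createAccumulator() {
            return Tuple2.of(null, null);   // null order means "nothing seen yet"
        }

        public void accumulate(Tuple2<T, Number> acc, T value, Number order) {
            // keep the value with the largest ordering seen so far
            if (acc.f1 == null || order.doubleValue() > acc.f1.doubleValue()) {
                acc.f0 = value;
                acc.f1 = order;
            }
        }
    }

    // Concrete subclass: fixing T sidesteps the unresolvable generic during type extraction.
    class StringMaxBy extends MaxBy<String> {
        @Override
        public TypeInformation<String> getResultType() {
            return Types.STRING;
        }

        @Override
        public TypeInformation<Tuple2<String, Number>> getAccumulatorType() {
            return Types.TUPLE(Types.STRING, TypeInformation.of(Number.class));
        }
    }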