Create FlinkSQL UDF with generic return type

假装没事ソ submitted on 2020-06-16 03:53:39

Question


I would like to define a function MAX_BY that takes a value of type T and an ordering parameter of type Number, and returns the element (of type T) from the window that has the largest ordering value. I've tried:

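import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.table.functions.AggregateFunction;

// Accumulator is a Tuple2 of (best value so far, its ordering key).
public class MaxBy<T> extends AggregateFunction<T, Tuple2<T, Number>> {

    @Override
    public T getValue(Tuple2<T, Number> tuple) {
        return tuple.f0;
    }

    @Override
    public Tuple2<T, Number> createAccumulator() {
        // Note: starting at 0 means a value whose order is <= 0 is never selected.
        return Tuple2.of(null, 0L);
    }

    // Keep the value whose ordering key is strictly greater than the current best.
    public void accumulate(Tuple2<T, Number> acc, T value, Number order) {
        if (order.doubleValue() > acc.f1.doubleValue()) {
            acc.f0 = value;
            acc.f1 = order;
        }
    }
}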

but I cannot register such a function using TableEnvironment.registerFunction. Underneath, Flink uses TypeInformation to match types within the SQL query, and with this definition it cannot determine the types (at least that's what I suppose). I saw that it is possible to provide several accumulate methods, but I think the return type must still be the same for each overload.

Built-in aggregation functions work similarly to what I want to achieve: MAX can take an arbitrary column type and return the same type. That's why I suppose I should be able to do it as well.
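
The supposition about type determination can be illustrated in plain Java, without Flink. In the minimal sketch below, Holder and StringHolder are hypothetical stand-ins (they are not Flink classes): an instance of the generic class carries no record of T at runtime, while a concrete subclass that fixes T keeps it in its class metadata. This only illustrates Java type erasure; it is not Flink's actual type extraction code.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ErasureDemo {
    // Hypothetical stand-in for a generic aggregate function; not a Flink class.
    public static class Holder<T> {
        public T value;
    }

    // A concrete subclass fixes T, so the type argument is recorded in class metadata.
    public static class StringHolder extends Holder<String> {
    }

    // Returns the first type argument of the object's generic superclass,
    // or null when the superclass is not parameterized.
    public static Type resolvedTypeArgument(Object o) {
        Type superType = o.getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null;
    }

    public static void main(String[] args) {
        // The generic instance has lost T: prints "null".
        System.out.println(resolvedTypeArgument(new Holder<String>()));
        // The concrete subclass still knows T: prints "class java.lang.String".
        System.out.println(resolvedTypeArgument(new StringHolder()));
    }
}
```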


Answer 1:


Unfortunately, Flink doesn't support aggregation functions with flexible return types. For the MAX function, the internal implementation defines the core logic independently of the type and then creates an implementation for every supported type (see code).

Internally, MAX is then mapped to the right implementation, depending on the type.

I don't think that's possible if you define and register a function as a user-defined aggregation function.
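
A rough sketch of the per-type pattern described above, written in plain Java so that it runs without the Flink dependency. All names here (MaxByPerType, Acc, MaxByBase, MaxByString, MaxByLong) are hypothetical. It assumes that in a real job the base class would extend Flink's AggregateFunction and that each concrete subclass would be registered separately; whether Flink can then extract the types without further hints is not confirmed here.

```java
public class MaxByPerType {
    // Mutable accumulator: the best value seen so far and its ordering key.
    public static class Acc<T> {
        public T value;
        public Double order; // null until the first row is accumulated
    }

    // Type-independent core logic. Hypothetical stand-in: in a Flink job this
    // class would extend AggregateFunction instead of being a plain class.
    public abstract static class MaxByBase<T> {
        public Acc<T> createAccumulator() {
            return new Acc<>();
        }

        // Keep the value whose ordering key is strictly greater than the best so far.
        public void accumulate(Acc<T> acc, T value, Number order) {
            if (order == null) {
                return; // ignore rows without an ordering key
            }
            double o = order.doubleValue();
            if (acc.order == null || o > acc.order) {
                acc.value = value;
                acc.order = o;
            }
        }

        public T getValue(Acc<T> acc) {
            return acc.value;
        }
    }

    // One concrete class per supported type; each fixes T at compile time.
    public static class MaxByString extends MaxByBase<String> {
    }

    public static class MaxByLong extends MaxByBase<Long> {
    }

    public static void main(String[] args) {
        MaxByString f = new MaxByString();
        Acc<String> acc = f.createAccumulator();
        f.accumulate(acc, "a", -3);
        f.accumulate(acc, "b", -1.5);
        f.accumulate(acc, "c", -2L);
        System.out.println(f.getValue(acc)); // prints "b"
    }
}
```

Because the accumulator starts with no ordering key rather than 0, negative ordering values are handled as well. The cost of this approach is one small class per value type, under a distinct name, instead of a single generic MAX_BY.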



Source: https://stackoverflow.com/questions/62027612/create-flinksql-udf-with-generic-return-type
