Update KTable based on partial data attributes

依然范特西╮ 提交于 2020-01-05 17:57:51

问题


I am trying to update a KTable with partial data of an object. Eg. User object is {"id":1, "name":"Joe", "age":28} The object is being streamed into a topic and grouped by key into KTable. Now the user object is updated partially as follows {"id":1, "age":33} and streamed into table. But the updated table looks as follows {"id":1, "name":null, "age":28}. The expected output is {"id":1, "name":"Joe", "age":33}. How can I use Kafka streams and spring cloud streams to achieve the expected output. Any suggestions would be appreciated. Thanks.

Here is the code

 @Bean
        public Function<KStream<String, User>, KStream<String, User>> process() {
            return input -> input.map((key, user) -> new KeyValue<String, User>(user.getId(), user))
                    .groupByKey(Grouped.with(Serdes.String(), new JsonSerde<>(User.class))).reduce((user1, user2) -> {
                        user1.merge(user2);
                        return user1;
                    }, Materialized.as("allusers")).toStream();
        }

and modified the User object with below code:

    public void merge(Object newObject) {
        assert this.getClass().getName().equals(newObject.getClass().getName());
        for (Field field : this.getClass().getDeclaredFields()) {
            for (Field newField : newObject.getClass().getDeclaredFields()) {
                if (field.getName().equals(newField.getName())) {
                    try {
                        field.set(this, newField.get(newObject) == null ? field.get(this) : newField.get(newObject));
                    } catch (IllegalAccessException ignore) {
                    }
                }
            }
        }
    }

Is this the right approach or any other approach in KStreams?


回答1:


I've tested your merge code, and it seems to be working as expected. But since your result after the reduce is {"id":1, "name":null, "age":28}, I can think of two things:

  • Your state isn't being updated at all, since no attribute has changed.
  • Maybe you have a serialization problem, since the string attribute is null, but the other int attributes are fine.

My guess is that, because you are mutating the original object and return the same value, kafka streams doesn't detect that as a change and won't store the new state. Actually, you shouldn't mutate your object, since it could lead to non-determinism depending on your pipeline.

Try to change your merge function to create a new User object, and see if the behavior changes.




回答2:


So here is the recommended generic approach for merging the 2 objects, please feel to comment here. For this to work the the object being merged should have an empty constructor.

     public <T> T mergeObjects(T first, T second) {
        Class<?> clazz = first.getClass();
        Field[] fields = clazz.getDeclaredFields();
        Object newObject = null;
        try {
            newObject = clazz.getDeclaredConstructor().newInstance();
            for (Field field : fields) {
                field.setAccessible(true);
                Object value1 = field.get(first);
                Object value2 = field.get(second);
                Object value = (value2 == null) ? value1 : value2;
                field.set(newObject, value);
            }
        } catch (InstantiationException | IllegalAccessException | IllegalArgumentException
                | InvocationTargetException | NoSuchMethodException | SecurityException e) {

            e.printStackTrace();
        }
        return (T) newObject;
    }


来源:https://stackoverflow.com/questions/58960806/update-ktable-based-on-partial-data-attributes

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!