Global state store doesn't create a changelog topic: what is the workaround if the input topic to the global store has null keys?

て烟熏妆下的殇ゞ posted on 2020-05-17 06:22:25

Question


I have read a lot about the global state store: it does not create a changelog topic for restore; instead, it uses the source topic to restore.

I create a custom key and store the data in the global state store, but after a restart the data is gone, because on restore the global store reads directly from the source topic and bypasses the processor.

My input topic contains data like the following.

{
      "id": "user-12345",
      "user_client": [
        "clientid-1",
        "clientid-2"
      ]
} 

I am maintaining two kinds of state store entries, as follows:

  1. id -> record (the record is the JSON above)
  2. clientid -> list of user ids, for example:
       clientid-1: ["user-12345"]
       clientid-2: ["user-12345"]

The workaround I have seen is to create a custom changelog topic and send the keyed data to it; that topic then acts as the source topic for the global state store.

But in my scenario I have to write two kinds of records to the state store. What is the best way to do that?

Example Scenario:

Record1: {
      "id": "user-1",
      "user_client": [
        "clientid-1",
        "clientid-2"
      ]
}

Record2: {
      "id": "user-2",
      "user_client": [
        "clientid-1",
        "clientid-3"
      ]
}

The global state store should contain:

id -> JSON record

clientid-1: ["user-1", "user-2"]
clientid-2: ["user-1"]
clientid-3: ["user-2"]

How can I handle the restore case for the above scenario in the global state store?


Answer 1:


One approach is to maintain a changelog topic (with cleanup.policy=compact) for the GlobalKTable; let's call it user_client_global_ktable_changelog. For simplicity, let's say we deserialize your messages into Java classes (you could also just use a HashMap or JsonNode):

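import java.util.HashSet;
import java.util.Set;

// Initial message format (value of the un-keyed input topic)
public class UserClients {
    String id;
    Set<String> userClient = new HashSet<>();

    public String getId() { return id; }
    public Set<String> getUserClient() { return userClient; }
}

// Message format when the key is a client id
public class ClientUsers {
    String clientId;
    // Initialized so the aggregator can add to it right after ClientUsers::new
    Set<String> userIds = new HashSet<>();

    public Set<String> getUserIds() { return userIds; }
}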
// Your initial (un-keyed) topic
KStream<String, UserClients> userClientKStream = streamsBuilder.stream("un_keyed_topic");
  1. Re-keying the record by user_id is easy: just re-key the KStream, then send it to the output topic.
// Re-map the initial message to user_id -> {initial_message_payload}
userClientKStream
        .map((defaultNullKey, userClients) -> KeyValue.pair(userClients.getId(), userClients))
        .to("user_client_global_ktable_changelog"); // please provide appropriate serdes
  2. To aggregate the user_ids for a particular client, we can use a local state (KTable) that keeps the current list of user_ids for each client_id:
userClientKStream
        // Changing the key causes a re-partition before the aggregation (an internal -repartition topic is created)
        .flatMap((defaultNullKey, userClients) -> userClients.getUserClient().stream()
                .map(clientId -> KeyValue.pair(clientId, userClients.getId()))
                .collect(Collectors.toList()))
        // We have to maintain the current aggregated set of user_ids for each client_id
        .groupByKey()
        .aggregate(ClientUsers::new, (clientId, userId, clientUsers) -> {
            clientUsers.getUserIds().add(userId);
            return clientUsers;
        }, Materialized.as("client_with_aggregated_user_ids"))
        .toStream()
        .to("user_client_global_ktable_changelog"); // please provide appropriate serdes

For example, when aggregating user_ids in the local state:

// Re-keyed, client-based message
clientid-1:user-1
// Current aggregate for `clientid-1`
"clientid-1"
{
    "user_id": ["user-1"]
}

// Re-keyed, client-based message
clientid-1:user-2
// Current aggregate for `clientid-1`
"clientid-1"
{
    "user_id": ["user-1", "user-2"]
}
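The aggregation walk-through above can be reproduced without Kafka. The following is only a minimal plain-Java sketch of what the flatMap + groupByKey + aggregate steps compute for the two example records from the question; the class name ClientIndexSketch and the in-memory input map are made up for illustration and are not part of Kafka Streams.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java stand-in for the flatMap + groupByKey + aggregate steps; no Kafka involved.
final class ClientIndexSketch {

    // Input: user id -> client ids (hypothetical in-memory form of the topic records).
    // Output: client id -> user ids seen for that client.
    static Map<String, Set<String>> aggregate(Map<String, List<String>> clientsByUser) {
        Map<String, Set<String>> usersByClient = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> record : clientsByUser.entrySet()) {
            for (String clientId : record.getValue()) {
                // Re-key by client id and add the user id to that client's aggregate.
                usersByClient.computeIfAbsent(clientId, k -> new TreeSet<>()).add(record.getKey());
            }
        }
        return usersByClient;
    }

    public static void main(String[] args) {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("user-1", Arrays.asList("clientid-1", "clientid-2"));
        input.put("user-2", Arrays.asList("clientid-1", "clientid-3"));
        // Prints:
        // clientid-1 -> [user-1, user-2]
        // clientid-2 -> [user-1]
        // clientid-3 -> [user-2]
        aggregate(input).forEach((clientId, userIds) -> System.out.println(clientId + " -> " + userIds));
    }
}
```

Note that, like the aggregator in the answer, this sketch only ever adds user ids; removing a client from a user would need extra handling.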

Actually, with some changes we could use the changelog topic of the local state directly as the source topic for the GlobalKTable. That topic is your_application-client_with_aggregated_user_ids-changelog (the pattern is <application.id>-<store name>-changelog). To do so, adjust the state so that it keeps the payloads of both the user-keyed and the client-keyed messages.
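Whichever topic ends up backing the global store, the reason a keyed topic survives a restart is that restore only needs the latest value per key. Below is a minimal plain-Java sketch of that idea, assuming restore simply replays the topic in offset order and lets later records overwrite earlier ones; the class name RestoreReplaySketch and the string payloads are hypothetical placeholders, not Kafka APIs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for rebuilding a store from a keyed changelog topic; no Kafka involved.
final class RestoreReplaySketch {

    // Replays {key, value} pairs in offset order; the last value written for a key wins.
    static Map<String, String> restore(List<String[]> log) {
        Map<String, String> store = new LinkedHashMap<>();
        for (String[] keyValue : log) {
            store.put(keyValue[0], keyValue[1]);
        }
        return store;
    }

    public static void main(String[] args) {
        // User-keyed and client-keyed records share one topic, as in the answer above.
        List<String[]> log = Arrays.asList(
                new String[] {"user-1", "{id: user-1, user_client: [clientid-1, clientid-2]}"},
                new String[] {"clientid-1", "[user-1]"},
                new String[] {"clientid-2", "[user-1]"},
                new String[] {"user-2", "{id: user-2, user_client: [clientid-1, clientid-3]}"},
                new String[] {"clientid-1", "[user-1, user-2]"},
                new String[] {"clientid-3", "[user-2]"});
        Map<String, String> store = restore(log);
        System.out.println(store.size());            // 5 distinct keys
        System.out.println(store.get("clientid-1")); // [user-1, user-2]
    }
}
```

Because every record already carries its final key, nothing has to be recomputed during the replay, which is why it does not matter that the processor is bypassed on restore.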



Source: https://stackoverflow.com/questions/60613596/global-state-store-dont-create-change-log-topic-what-is-the-workaround-if-input
