Kafka Streams API: Session Window exception

Submitted by 笑着哭i on 2020-06-17 12:57:36

Question


I am trying to create a Kafka topology and break it down into more readable pieces. I have a stream that I group by key, and then I am trying to window it like so:

SessionWindowedKStream<byte[], byte[]> windowedTable =
        groupedStream.windowedBy(SessionWindows.with(Duration.ofSeconds(config.joinWindowSeconds)).grace(Duration.ZERO));

KTable<Windowed<byte[]>, byte[]> mergedTable = windowedTable
        .reduce((aggregateValue, newValue) -> {
          try {
            // Parse both JSON payloads; keep the newest entry when a key appears in both
            Map<String, String> recentMap = MAPPER.readValue(new String(newValue), HashMap.class);
            Map<String, String> aggregateMap = MAPPER.readValue(new String(aggregateValue), HashMap.class);
            aggregateMap.forEach(recentMap::putIfAbsent);
            newValue = MAPPER.writeValueAsString(recentMap).getBytes();
          } catch (Exception e) {
            LOG.warn("Couldn't aggregate key grouped stream\n", e);
          }
          return newValue;
        }, Materialized.with(Serdes.ByteArray(), Serdes.ByteArray()));

mergedTable.toStream()
        .foreach((externalId, eventIncidentByteMap) -> {
          ...
        });

Unfortunately, the following exception is thrown:

00:40:11.344 [main] ERROR o.a.k.s.p.i.ProcessorStateManager - stream-thread [main] task [0_0] Failed to flush state store KSTREAM-REDUCE-STATE-STORE-0000000020: 
org.apache.kafka.streams.errors.ProcessorStateException: Error opening store KSTREAM-REDUCE-STATE-STORE-0000000020.1589846400000 at location /tmp/kafka-streams/test-consumer/0_0/KSTREAM-REDUCE-STATE-STORE-0000000020/KSTREAM-REDUCE-STATE-STORE-0000000020.1589846400000
    at org.apache.kafka.streams.state.internals.RocksDBStore.openRocksDB(RocksDBStore.java:220)
    at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:191)
    at org.apache.kafka.streams.state.internals.KeyValueSegment.openDB(KeyValueSegment.java:49)
    at org.apache.kafka.streams.state.internals.KeyValueSegments.getOrCreateSegment(KeyValueSegments.java:50)
    at org.apache.kafka.streams.state.internals.KeyValueSegments.getOrCreateSegment(KeyValueSegments.java:25)
    at org.apache.kafka.streams.state.internals.AbstractSegments.getOrCreateSegmentIfLive(AbstractSegments.java:84)
    at org.apache.kafka.streams.state.internals.AbstractRocksDBSegmentedBytesStore.put(AbstractRocksDBSegmentedBytesStore.java:146)
    at org.apache.kafka.streams.state.internals.RocksDBSessionStore.put(RocksDBSessionStore.java:81)
    at org.apache.kafka.streams.state.internals.RocksDBSessionStore.put(RocksDBSessionStore.java:25)
    at org.apache.kafka.streams.state.internals.ChangeLoggingSessionBytesStore.put(ChangeLoggingSessionBytesStore.java:74)
    at org.apache.kafka.streams.state.internals.ChangeLoggingSessionBytesStore.put(ChangeLoggingSessionBytesStore.java:33)
    at org.apache.kafka.streams.state.internals.CachingSessionStore.putAndMaybeForward(CachingSessionStore.java:90)
    at org.apache.kafka.streams.state.internals.CachingSessionStore.lambda$initInternal$0(CachingSessionStore.java:73)
    at org.apache.kafka.streams.state.internals.NamedCache.flush(NamedCache.java:151)
    at org.apache.kafka.streams.state.internals.NamedCache.flush(NamedCache.java:109)
    at org.apache.kafka.streams.state.internals.ThreadCache.flush(ThreadCache.java:124)
    at org.apache.kafka.streams.state.internals.CachingSessionStore.flush(CachingSessionStore.java:230)
    at org.apache.kafka.streams.state.internals.WrappedStateStore.flush(WrappedStateStore.java:84)
    at org.apache.kafka.streams.state.internals.MeteredSessionStore.lambda$flush$5(MeteredSessionStore.java:227)
    at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:806)
    at org.apache.kafka.streams.state.internals.MeteredSessionStore.flush(MeteredSessionStore.java:227)
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:282)
    at org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:177)
    at org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:554)
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:490)
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:478)
    at org.apache.kafka.streams.TopologyTestDriver.completeAllProcessableWork(TopologyTestDriver.java:517)
    at org.apache.kafka.streams.TopologyTestDriver.pipeRecord(TopologyTestDriver.java:472)
    at org.apache.kafka.streams.TopologyTestDriver.pipeRecord(TopologyTestDriver.java:806)
    at org.apache.kafka.streams.TestInputTopic.pipeInput(TestInputTopic.java:115)
    at org.apache.kafka.streams.TestInputTopic.pipeInput(TestInputTopic.java:137)
    at com.ro.revelon.pub.api.dp.EventConsumerTest.testEventWithIncident(EventConsumerTest.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: org.rocksdb.RocksDBException: You have to open all column families. Column families not opened: keyValueWithTimestamp
    at org.rocksdb.RocksDB.open(Native Method)
    at org.rocksdb.RocksDB.open(RocksDB.java:286)
    at org.apache.kafka.streams.state.internals.RocksDBStore.openRocksDB(RocksDBStore.java:217)
    ... 53 common frames omitted

I am not sure whether the issue is a Serde that isn't specified somewhere. I did use .groupByKey(Grouped.with(Serdes.ByteArray(), Serdes.ByteArray())) when grouping by key, but I suspect that I haven't mapped something properly along the way.
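
For reference, the grouped stream is built roughly like this (a minimal sketch; the builder variable and topic name are placeholders, not the exact code):

KStream<byte[], byte[]> events = builder.stream(
        "events-topic", Consumed.with(Serdes.ByteArray(), Serdes.ByteArray()));

KGroupedStream<byte[], byte[]> groupedStream =
        events.groupByKey(Grouped.with(Serdes.ByteArray(), Serdes.ByteArray()));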

The root cause, "org.rocksdb.RocksDBException: You have to open all column families. Column families not opened: keyValueWithTimestamp", is also suspicious and mysterious to me. Either way, I am not sure how to tackle the problem.

I know that the following code does work:


KTable<byte[], byte[]> mergedTable = groupedStream
        .reduce((aggregateValue, newValue) -> {
          try {
            // Parse both JSON payloads; keep the newest entry when a key appears in both
            Map<String, String> recentMap = MAPPER.readValue(new String(newValue), HashMap.class);
            Map<String, String> aggregateMap = MAPPER.readValue(new String(aggregateValue), HashMap.class);
            aggregateMap.forEach(recentMap::putIfAbsent);
            newValue = MAPPER.writeValueAsString(recentMap).getBytes();
          } catch (Exception e) {
            LOG.warn("Couldn't aggregate key grouped stream\n", e);
          }
          return newValue;
        }, Materialized.with(Serdes.ByteArray(), Serdes.ByteArray()));

mergedTable.toStream()
        .foreach((externalId, eventIncidentByteMap) -> {
          ...
        });

How do I break the topology down without tripping over the RocksDB store exception?


Answer 1:


Did you downgrade your Kafka Streams library? In 2.3.0, the storage format was changed, and the new storage format is not compatible with older Kafka Streams versions.

If you want to downgrade from version 2.3.0 (or higher) to version 2.2.x (or lower), you need to wipe out your local state first (e.g., by manually deleting the application's state directory or by calling KafkaStreams#cleanUp()). On restart, the state will be rebuilt from the changelog topics using the old storage format.
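
For example, a minimal sketch of triggering the cleanup programmatically before starting the application (the broker address is a placeholder; the application id and state directory are taken from the path in the error above):

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-consumer");      // app id, as seen in the state path above
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder broker address
props.put(StreamsConfig.STATE_DIR_CONFIG, "/tmp/kafka-streams");      // state dir from the error above

KafkaStreams streams = new KafkaStreams(topology, props);
streams.cleanUp();  // deletes the local state directory for this application id
streams.start();    // state is rebuilt from the changelog topics using the old storage format

Since your stack trace shows a TopologyTestDriver test, deleting the stale /tmp/kafka-streams/test-consumer directory before running the test has the same effect.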



Source: https://stackoverflow.com/questions/61883482/kafka-streams-api-session-window-exception
