confluent

“log4j.properties was unexpected at this time” while trying to start Zookeeper in windows

試著忘記壹切 · submitted on 2019-12-06 02:58:32
I am using the Kafka Streams download from Confluent ( http://www.confluent.io/product/kafka-streams/ ). I am following the instructions to run ZooKeeper and Kafka on Windows. But when I try to start ZooKeeper using the command D:\Softwares\confluent-3.0.1\bin\windows>zookeeper-server-start.bat ./etc/kafka/zookeeper.properties , I get the error D:\Softwares\confluent-3.0.1\bin\windows../../etc/kafka/log4j.properties was unexpected at this time. If I check the "zookeeper-server-start.bat" file, the commands look OK and are as shown below. There also exists a log4j.properties file under the directory confluent-3…

Restarting Kafka Connect S3 Sink Task Loses Position, Completely Rewrites everything

不问归期 · submitted on 2019-12-06 00:26:16
Question: After restarting a Kafka Connect S3 sink task, it started writing all the way from the beginning of the topic and wrote duplicate copies of older records. In other words, Kafka Connect seemed to lose its place. So I imagine that Kafka Connect stores its current offset position in the internal connect-offsets topic. That topic is empty, which I presume is part of the problem. The other two internal topics, connect-statuses and connect-configs, are not empty; connect-statuses has 52…
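One thing worth checking (not a definitive diagnosis): sink connectors generally track consumed offsets in an ordinary consumer group named connect-<connector name>, not in connect-offsets, which only holds source-connector offsets. A minimal sketch for inspecting that group with the Java AdminClient; the broker address and connector name are placeholders, and it assumes a reasonably recent kafka-clients library:

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SinkOffsetCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        try (AdminClient admin = AdminClient.create(props)) {
            // Sink connectors commit offsets as a regular consumer group named "connect-<connector name>".
            Map<TopicPartition, OffsetAndMetadata> offsets =
                admin.listConsumerGroupOffsets("connect-my-s3-sink") // hypothetical connector name
                     .partitionsToOffsetAndMetadata()
                     .get();
            offsets.forEach((tp, om) ->
                System.out.printf("%s -> committed offset %d%n", tp, om.offset()));
        }
    }
}
```

If that group has no committed offsets (or has been expired), a restarted sink task will fall back to its auto.offset.reset behaviour, which would match the "starts from the beginning" symptom described above.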

How to populate the cache in CachedSchemaRegistryClient without making a call to register a new schema?

最后都变了- · submitted on 2019-12-05 12:56:34
We have a Spark Streaming application that integrates with Kafka, and I'm trying to optimize it because it makes excessive calls to the Schema Registry to download schemas. The Avro schema for our data rarely changes, yet currently our application calls the Schema Registry whenever a record comes in, which is far too often. I ran into CachedSchemaRegistryClient from Confluent, and it looked promising. But after looking into its implementation, I'm not sure how to use its built-in cache to reduce the REST calls to the Schema Registry. The above link will bring you to the source code; here I'm pasting the…
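For what it's worth, the cache in CachedSchemaRegistryClient is keyed by schema id (and by subject for register), so a lookup by id only goes over REST once per client instance. A minimal sketch assuming the Confluent Java client; the URL, capacity, and schema id are placeholders, and the exact method name varies across client versions (getByID / getById / getSchemaById):

```java
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import org.apache.avro.Schema;

public class SchemaCacheExample {
    public static void main(String[] args) throws Exception {
        // Second argument is the identity-map capacity: how many schemas the client will cache.
        SchemaRegistryClient client =
            new CachedSchemaRegistryClient("http://schema-registry:8081", 1000); // placeholder URL

        // The first lookup for a given id goes over REST; the result is cached,
        // so later calls with the same id are served from memory.
        Schema schema = client.getById(42); // hypothetical schema id
        Schema again  = client.getById(42); // expected to hit the in-memory cache

        System.out.println(schema.equals(again));
    }
}
```

The key design point is to create the client (or the Avro deserializer that wraps it) once per executor/JVM and reuse it, rather than instantiating it per record, otherwise the cache is rebuilt on every call.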

Kafka Streams with lookup data on HDFS

╄→гoц情女王★ · submitted on 2019-12-05 08:21:39
I'm writing an application with Kafka Streams (v0.10.0.1) and would like to enrich the records I'm processing with lookup data. This data (a timestamped file) is written into an HDFS directory on a daily basis (or 2-3 times a day). How can I load it in the Kafka Streams application and join it to the actual KStream? What would be the best practice for rereading the data from HDFS when a new file arrives there? Or would it be better to switch to Kafka Connect and write the RDBMS table content to a Kafka topic that can be consumed by all the Kafka Streams application instances? Update: As suggested…
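One common pattern (assuming the lookup data can be published to a compacted Kafka topic, for example via Connect) is to join the stream against a GlobalKTable. A minimal sketch; note that it uses the StreamsBuilder/GlobalKTable API from newer Kafka Streams releases rather than the 0.10.0.1 API mentioned above, and all topic names and addresses are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

public class LookupJoinExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrichment-app");      // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Lookup data published to a (ideally compacted) topic, e.g. by a Connect source connector.
        GlobalKTable<String, String> lookup = builder.globalTable("lookup-topic"); // placeholder topic
        KStream<String, String> records = builder.stream("input-topic");           // placeholder topic

        records
            .join(lookup,
                  (key, value) -> key,                                 // map each record to its lookup key
                  (value, lookupValue) -> value + "|" + lookupValue)   // combine record with lookup row
            .to("enriched-topic");                                     // placeholder output topic

        new KafkaStreams(builder.build(), props).start();
    }
}
```

A GlobalKTable is fully replicated to every application instance and is updated continuously as new lookup records arrive, which sidesteps the "when do I reread the HDFS file" question entirely.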

Push Data from Kafka Topic to PostgreSQL in JSON

牧云@^-^@ · submitted on 2019-12-04 20:03:01
Error after updates:
[2019-07-29 12:52:23,301] INFO Initializing writer using SQL dialect: PostgreSqlDatabaseDialect (io.confluent.connect.jdbc.sink.JdbcSinkTask:57)
[2019-07-29 12:52:23,303] INFO WorkerSinkTask{id=sink-postgres-0} Sink task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:301)
[2019-07-29 12:52:23,367] WARN [Consumer clientId=consumer-1, groupId=connect-sink-postgres] Error while fetching metadata with correlation id 2 : {kafkadad=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient:1023)
[2019-07-29 12:52:23,368] INFO Cluster ID: …
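For reference, a JDBC sink of the kind shown in this log is normally created by posting its configuration to the Connect REST API. A minimal sketch using the Java 11+ HttpClient (text blocks need Java 15+); the topic name is taken from the log above, while the Connect URL and every connection detail are placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterJdbcSink {
    public static void main(String[] args) throws Exception {
        // Connector config as JSON; connection details are placeholders.
        String body = """
            {
              "name": "sink-postgres",
              "config": {
                "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
                "topics": "kafkadad",
                "connection.url": "jdbc:postgresql://localhost:5432/mydb",
                "connection.user": "postgres",
                "connection.password": "secret",
                "auto.create": "true",
                "insert.mode": "insert"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8083/connectors")) // placeholder Connect REST endpoint
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

The LEADER_NOT_AVAILABLE warning in the log is often transient: it typically appears while a freshly auto-created topic is electing a leader, and the consumer retries on its own.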

Confluent's Kafka REST Proxy vs Kafka Client

时光总嘲笑我的痴心妄想 · submitted on 2019-12-04 18:11:52
I am curious about the advantages and disadvantages of Confluent's Kafka REST Proxy compared with a producer/consumer implemented with the official Kafka client library. I know that Confluent's Kafka REST Proxy is used for administrative tasks and for languages not supported by the Kafka client. So, what are the advantages of the Kafka client? One advantage of a native client is raw performance: direct TCP to the brokers rather than a round trip of HTTP serialization plus JVM serialization taking place within the REST Proxy. A disadvantage of the above could be maintaining security policies for…
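To make the comparison concrete, a native producer speaks the Kafka binary protocol directly to the brokers, with client-side batching and no intermediate HTTP hop. A minimal sketch with the official Java client; the broker address and topic are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class NativeProducerExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The client batches records and sends them over TCP straight to the partition leaders.
            RecordMetadata md =
                producer.send(new ProducerRecord<>("demo-topic", "key", "value")).get(); // placeholder topic
            System.out.printf("wrote to %s-%d@%d%n", md.topic(), md.partition(), md.offset());
        }
    }
}
```

With the REST Proxy, the same write would instead be an HTTP POST to the proxy, which then produces to Kafka on the caller's behalf; that is simpler for unsupported languages but adds serialization and a network hop.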

Kafka sink connector: No tasks assigned, even after restart

梦想与她 · submitted on 2019-12-04 13:16:28
I am using Confluent 3.2 in a set of Docker containers, one of which is running a kafka-connect worker. For reasons yet unclear to me, two of my four connectors (specifically, instances of hpgraphsl's MongoDB sink connector) stopped working. I was able to identify the main problem: the connectors did not have any tasks assigned, as could be seen by calling GET /connectors/{my_connector}/status . The other two connectors (of the same type) were not affected and were happily producing output. I tried three different methods to get my connectors running again via the REST API: pausing and resuming the…
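For reference, the checks and restarts described here go through the Connect REST API. A minimal sketch using the Java 11+ HttpClient; the worker URL and connector name are placeholders, and the endpoints shown are the standard /status and /restart resources:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectorRestart {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        String base = "http://localhost:8083/connectors/my_connector"; // placeholder worker URL and name

        // 1. Inspect the connector: the "tasks" array in the response is empty when no tasks are assigned.
        HttpRequest status = HttpRequest.newBuilder().uri(URI.create(base + "/status")).GET().build();
        System.out.println(http.send(status, HttpResponse.BodyHandlers.ofString()).body());

        // 2. Ask the worker to restart the connector. On older Connect versions this does not
        //    recreate missing tasks by itself; deleting and re-creating the connector (or bumping
        //    tasks.max in its config) may be needed to force a new task assignment.
        HttpRequest restart = HttpRequest.newBuilder()
            .uri(URI.create(base + "/restart"))
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
        System.out.println(http.send(restart, HttpResponse.BodyHandlers.ofString()).statusCode());
    }
}
```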

Kafka connect cluster setup or launching connect workers

心不动则不痛 · submitted on 2019-12-04 07:23:09
I am going through Kafka Connect and trying to get the concepts straight. Let's say I have a Kafka cluster (nodes k1, k2 and k3) set up and running, and now I want to run Kafka Connect workers on different nodes, say c1 and c2, in distributed mode. A few questions. 1) To run or launch Kafka Connect in distributed mode I need to use the command ../bin/connect-distributed.sh, which is available on the Kafka cluster nodes. So do I need to launch Kafka Connect from one of the Kafka cluster nodes? Or does any node from which I launch Kafka Connect need to have the Kafka binaries so that I will be able to use ../bin…
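As a point of reference for the questions above: a distributed worker is just a separate JVM process that needs the Connect runtime (shipped with the Kafka/Confluent distribution, so c1 and c2 do need those binaries installed) plus a worker properties file pointing at the existing brokers; it does not have to run on a broker node. A sketch of the key worker settings, written from Java only to stay consistent with the other examples here; all hostnames and topic names are placeholders:

```java
import java.io.FileOutputStream;
import java.util.Properties;

public class WorkerConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties worker = new Properties();
        // Point the worker at the existing brokers; the worker itself can run on any host.
        worker.put("bootstrap.servers", "k1:9092,k2:9092,k3:9092");   // placeholder hostnames
        // Workers sharing the same group.id form one distributed Connect cluster (c1, c2, ...).
        worker.put("group.id", "connect-cluster");
        worker.put("key.converter", "org.apache.kafka.connect.json.JsonConverter");
        worker.put("value.converter", "org.apache.kafka.connect.json.JsonConverter");
        // Internal topics where the workers share connector configs, offsets, and statuses.
        worker.put("config.storage.topic", "connect-configs");
        worker.put("offset.storage.topic", "connect-offsets");
        worker.put("status.storage.topic", "connect-statuses");

        try (FileOutputStream out = new FileOutputStream("connect-distributed.properties")) {
            worker.store(out, "worker config for bin/connect-distributed.sh");
        }
    }
}
```

Each worker is then started with bin/connect-distributed.sh pointing at that properties file, and connectors are submitted to any one worker over its REST API.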