I am trying to load a data file in a loop (to check stats) instead of using standard input in Kafka. After downloading Kafka, I performed the following steps:
Started ZooKeeper:
If there is always a single file, you can just use the tail command and pipe its output to the Kafka console producer.
But if a new file is created whenever certain conditions are met, you may need to use org.apache.commons.io.monitor to detect the newly created file, then repeat the step above for it.
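As a rough illustration of the "new file appears, repeat the above" idea, here is a minimal polling sketch in shell. It is a stand-in for a real file monitor, not the commons-io approach itself. The helper name send_new_files, the *.txt pattern, the broker address, and the topic are all assumptions made for the example.

```shell
# Hypothetical helper: send every file in a directory that has not been sent
# yet. Call it repeatedly to pick up files that are created later.

# Assumed defaults; override PRODUCER to match your installation.
PRODUCER="${PRODUCER:-kafka-console-producer.sh --broker-list localhost:9092 --topic test}"

send_new_files() {
  dir="$1"     # directory to scan
  seen="$2"    # text file listing the paths that were already sent
  touch "$seen"
  for f in "$dir"/*.txt; do
    [ -f "$f" ] || continue                    # the glob matched nothing
    grep -qxF -- "$f" "$seen" && continue      # already sent, skip it
    # $PRODUCER is left unquoted on purpose so it splits into command + args.
    $PRODUCER < "$f" && echo "$f" >> "$seen"
  done
}

# Usage: poll every 5 seconds until interrupted.
#   while true; do send_new_files /data/incoming /tmp/sent.list; sleep 5; done
```

Note that this sends a file as soon as it is seen, so a file that is still being written may be sent only partially. Keep the "seen" list outside the scanned directory.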
Kafka ships with a built-in FileStream connector for piping the contents of a file into a topic (file source) or writing a topic's contents out to a file (file sink).
You can use bin/connect-standalone.sh to read from a file; it is configured through config/connect-file-source.properties and config/connect-standalone.properties.
So the command will be:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties
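For reference, a file-source config might look roughly like the sketch below. The keys mirror the sample connect-file-source.properties that ships with Kafka; the output file name my-file-source.properties, the input path, and the topic name are placeholders I made up for the example.

```shell
# Write a minimal file-source connector config into the current directory.
cat > my-file-source.properties <<'EOF'
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
# File to read from (placeholder path)
file=/tmp/messages.txt
# Topic that each line of the file is published to (placeholder name)
topic=test
EOF

# Then start a standalone worker with it (run from the Kafka directory):
#   bin/connect-standalone.sh config/connect-standalone.properties my-file-source.properties
```

Depending on your Kafka version, the FileStream connector classes may also need to be made available through plugin.path in the worker config.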
The easiest way if you are using Linux or Mac is:
kafka-console-producer --broker-list localhost:9092 --topic test < messages.txt
Reference: https://github.com/Landoop/kafka-cheat-sheet
You can read the data file with cat and pipe it to kafka-console-producer.sh.
cat ${datafile} | ${kafka_home}/bin/kafka-console-producer.sh --broker-list ${brokerlist} --topic test
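Since the question asks about loading the file in a loop, the same pipe can be fed by a loop. This is only a sketch: the function name replay_file, the broker address, and the topic are assumptions, and it expects kafka-console-producer.sh to be on your PATH.

```shell
# Replay a data file a fixed number of times into one producer process.

# Assumed defaults; override PRODUCER to match your installation.
PRODUCER="${PRODUCER:-kafka-console-producer.sh --broker-list localhost:9092 --topic test}"

replay_file() {
  datafile="$1"   # file with one message per line
  rounds="$2"     # how many times to send the whole file
  i=0
  # The whole loop writes into a single pipe, so the producer starts only once.
  while [ "$i" -lt "$rounds" ]; do
    cat "$datafile"
    i=$((i + 1))
  done | $PRODUCER   # unquoted on purpose: splits into command + args
}

# Usage: send messages.txt ten times.
#   replay_file messages.txt 10
```

Starting the producer once for the whole loop avoids paying the JVM startup cost on every pass.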
You can probably try the kafkacat utility as well. The README on GitHub provides examples.
It would be great if you could share which tool worked the best for you :)
Details from the kafkacat README:
Read messages from stdin, produce to 'syslog' topic with snappy compression
$ tail -f /var/log/syslog | kafkacat -b mybroker -t syslog -z snappy
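To send a plain data file rather than a live log, the same producer mode can read the file from stdin. A minimal sketch follows; the broker and topic are the placeholders from the README example, and the function name produce_file and the file name are my own.

```shell
# Produce each line of a file as one message with kafkacat.
# -P selects producer mode, -b names the broker, -t names the topic.

# Assumed binary name; override KAFKACAT if yours is installed elsewhere.
KAFKACAT="${KAFKACAT:-kafkacat}"

produce_file() {
  # Redirect the file to stdin; kafkacat sends one message per line by default.
  "$KAFKACAT" -P -b mybroker -t syslog < "$1"
}

# Usage:
#   produce_file messages.txt
```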
The command below is, of course, the easiest way to do that.
kafka-console-producer --broker-list localhost:9092 --topic test < message.txt
But sometimes it is not able to find the file. For example:
C:\kafka_2.11-2.4.0\bin\windows>kafka-console-producer.bat --broker-list localhost:9092 --topic jason-input < C:\data\message.txt
You gave the absolute path, but the command cannot find C: from the current location, so it fails with a "file not found" error. You might expect that, since you gave the full path, it would start from the root; instead, it looks for C: (the root) relative to the current directory.
The solution is to use ..\ in the path to move up to the parent folder. For example, run the command like this:
C:\kafka_2.11-2.4.0\bin\windows>kafka-console-producer.bat --broker-list localhost:9092 --topic jason-input < ..\..\..\data\message.txt
Right now I am in the windows folder. The first ..\ moves up to the bin folder, the second ..\ moves up to the kafka... folder, and the third ..\ moves up to C:\. The path then continues from there: data, and then message.txt.