Real-time log processing using Apache Spark Streaming

情书的邮戳 2021-02-06 02:05

I want to create a system that reads logs in real time and uses Apache Spark to process them. I am unsure whether I should use something like Kafka or Flume to pass the logs to

3 answers
  •  灰色年华
    2021-02-06 02:42

    You can use Apache Kafka as a queue for your logs. The system that generates the logs, e.g. a web server, sends them to Apache Kafka. You can then use Apache Storm or the Spark Streaming library to read from the Kafka topic and process the logs in real time.

    You need to create a stream of logs, which you can do with Apache Kafka. Kafka integrations are available for both Storm and Apache Spark; each has its pros and cons.
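To make the "logs into Kafka" step concrete, here is a minimal sketch of a shipper that reads a log file and publishes each line to a Kafka topic. It assumes the third-party `kafka-python` package; the topic name `weblogs`, the broker address `localhost:9092`, the host label `web-1`, and the helper names are placeholders of mine, not anything from the question. It reads the file once rather than tailing it, so treat it as an illustration only.

```python
def encode_log_line(line):
    """Strip the trailing newline and encode one log line as UTF-8 bytes."""
    return line.rstrip("\r\n").encode("utf-8")


def ship_log_file(path, topic="weblogs", servers="localhost:9092", host="web-1"):
    """Publish every non-empty line of a log file to a Kafka topic.

    All default values are hypothetical placeholders.
    """
    # Imported lazily so the helper above works without Kafka installed.
    from kafka import KafkaProducer  # third-party: pip install kafka-python

    producer = KafkaProducer(bootstrap_servers=servers)
    try:
        with open(path, "r", encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    # The host name is used as the message key, so lines
                    # from one host land in the same partition.
                    producer.send(topic, key=host.encode("utf-8"),
                                  value=encode_log_line(line))
        # Block until all buffered messages have been sent.
        producer.flush()
    finally:
        producer.close()
```

In practice a log shipper such as Flume or a tailing agent would replace the one-shot file read; the point is only that each log line becomes one Kafka message.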

    For Storm Kafka Integration look here

    For Apache Spark Kafka Integration take a look here
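As a rough illustration of the Spark side, the sketch below reads log lines from a Kafka topic and keeps a running count per HTTP status code. Note that it uses Spark's Structured Streaming API rather than the older DStream-based Spark Streaming library the answer mentions. It assumes each Kafka message value is one raw line in Common Log Format, that the Kafka source package (`spark-sql-kafka-0-10`) is supplied when the job is submitted, and that the topic `weblogs` and broker `localhost:9092` exist; all of those names are placeholders.

```python
import re

# Common Log Format, for example:
# 127.0.0.1 - - [06/Feb/2021:02:05:00 +0000] "GET /index.html HTTP/1.1" 200 1043
LOG_PATTERN = r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)'


def parse_log_line(line):
    """Parse one log line in plain Python; returns None if it does not match."""
    match = re.match(LOG_PATTERN, line)
    if match is None:
        return None
    ip, timestamp, method, path, status, _size = match.groups()
    return {"ip": ip, "timestamp": timestamp, "method": method,
            "path": path, "status": int(status)}


def start_status_counts(servers="localhost:9092", topic="weblogs"):
    """Start a streaming query that counts log lines per HTTP status."""
    # Imported lazily so parse_log_line works without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, regexp_extract

    spark = SparkSession.builder.appName("log-status-counts").getOrCreate()

    # Each Kafka record's value arrives as bytes; cast it to a string.
    lines = (spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", servers)
             .option("subscribe", topic)
             .load()
             .select(col("value").cast("string").alias("line")))

    # Group 5 of the pattern is the status code; lines that do not
    # match yield an empty string and are counted under "".
    counts = (lines
              .select(regexp_extract("line", LOG_PATTERN, 5).alias("status"))
              .groupBy("status")
              .count())

    # Print the full, updated counts table to the console on each trigger.
    return counts.writeStream.outputMode("complete").format("console").start()
```

Calling `start_status_counts().awaitTermination()` would keep the job running; `parse_log_line` is included only so the pattern can be checked locally without a cluster.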
