Is it possible to integrate AWS Lambda with Apache Kafka? I want to put a consumer in a Lambda function, so that the function executes whenever the consumer receives a message.
Yes, it is very much possible to have a Kafka consumer in an AWS Lambda function.
However, note that you would not be able to invoke the Lambda through some sort of notification. You will instead have to poll the Kafka topic, and the easiest way to do that may be a scheduled Lambda.
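A minimal sketch of the polling approach, assuming the kafka-python client is packaged with the function and the Lambda is invoked on a schedule. The topic, broker address, group id, and the handle_message helper are placeholders for illustration, not part of the answer above.

```python
import os


def handle_message(value):
    """Hypothetical processing step; replace with real logic."""
    print(value)


def poll_once(consumer, max_records=100, timeout_ms=5000):
    """Poll one bounded batch, process it, then commit offsets.

    `consumer` is any object with kafka-python's poll()/commit() shape.
    Returns the number of records processed.
    """
    batches = consumer.poll(timeout_ms=timeout_ms, max_records=max_records)
    count = 0
    for records in batches.values():  # one list of records per partition
        for record in records:
            handle_message(record.value)
            count += 1
    if count:
        consumer.commit()  # commit only after the batch was processed
    return count


def handler(event, context):
    # Imported lazily so the module loads without the dependency installed.
    from kafka import KafkaConsumer  # assumption: kafka-python is bundled

    consumer = KafkaConsumer(
        os.environ.get("KAFKA_TOPIC", "my-topic"),  # placeholder topic
        bootstrap_servers=os.environ.get("KAFKA_BROKERS", "localhost:9092"),
        group_id="lambda-poller",  # placeholder consumer group
        enable_auto_commit=False,  # offsets are committed manually above
    )
    try:
        return {"processed": poll_once(consumer)}
    finally:
        consumer.close()
```

Keeping the poll logic in its own function makes it easy to exercise with a fake consumer before wiring it to a real cluster.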
If you are using managed Apache Kafka in AWS (MSK):
Since August 2020 you can connect Amazon Managed Streaming for Apache Kafka (MSK) as an event source. This does not cover a Kafka cluster you installed yourself, but if you already use AWS managed Kafka it could be useful.
More in the announcement: https://aws.amazon.com/about-aws/whats-new/2020/08/aws-lambda-now-supports-amazon-managed-streaming-for-apache-kafka-as-an-event-source/
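With a Kafka event source, Lambda hands the function a batch of records rather than a consumer object. The sketch below shows one way a handler might unpack such a batch, assuming the documented event shape: a "records" map keyed by topic and partition, with base64-encoded values. The decode_records helper and the "orders" topic in the usage note are illustrative names only.

```python
import base64


def decode_records(event):
    """Flatten a Kafka trigger event into (topic, partition, offset, value) tuples."""
    decoded = []
    # "records" maps "<topic>-<partition>" keys to lists of record dicts.
    for records in event.get("records", {}).values():
        for rec in records:
            raw = rec.get("value")
            # Record values arrive base64-encoded; a missing value stays None.
            value = base64.b64decode(raw) if raw is not None else None
            decoded.append((rec["topic"], rec["partition"], rec["offset"], value))
    return decoded


def handler(event, context):
    messages = decode_records(event)
    for topic, partition, offset, value in messages:
        print(f"{topic}[{partition}]@{offset}: {value!r}")
    return {"count": len(messages)}
```

For a quick local check, pass a hand-built event such as {"records": {"orders-0": [{"topic": "orders", "partition": 0, "offset": 15, "value": "aGVsbG8="}]}} to handler with None as the context.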
AWS now supports "self-hosted Apache Kafka as an event source for AWS Lambda".
When you create a new Lambda, go to the "Configuration" tab and click "Add trigger"; you can now select and configure your self-hosted Apache Kafka there.
Feel free to read more here:
https://aws.amazon.com/blogs/compute/using-self-hosted-apache-kafka-as-an-event-source-for-aws-lambda/
https://docs.aws.amazon.com/lambda/latest/dg/kafka-smaa.html
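The trigger can also be created programmatically instead of through the console. The following is a rough sketch using boto3's create_event_source_mapping, assuming a cluster that is reachable without extra settings; clusters that need authentication or VPC access also require SourceAccessConfigurations entries, which are omitted here. The function name, broker list, and topic in the usage note are placeholders.

```python
def self_managed_kafka_mapping(function_name, brokers, topic, batch_size=100):
    """Build keyword arguments for Lambda's create_event_source_mapping call."""
    return {
        "FunctionName": function_name,
        "Topics": [topic],
        "SelfManagedEventSource": {
            # Bootstrap servers are given as "host:port" strings.
            "Endpoints": {"KAFKA_BOOTSTRAP_SERVERS": list(brokers)}
        },
        "StartingPosition": "LATEST",  # or "TRIM_HORIZON" to read from the start
        "BatchSize": batch_size,
    }


def create_mapping(**kwargs):
    # Imported lazily so the builder above can be used without boto3 installed.
    import boto3

    return boto3.client("lambda").create_event_source_mapping(**kwargs)
```

A call might look like create_mapping(**self_managed_kafka_mapping("my-function", ["broker1:9092"], "my-topic")), with all three values replaced by your own.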
Continuing the point by Arafat: we have successfully built an infrastructure to consume from Kafka using AWS Lambdas. Here are some gotchas:
[...] context object in the Lambda and give yourself some wiggle room to do something with the buffer you populated in your consumer, which might not be read to a file unless you call close().
We are using Apache Airflow for scheduling. I hear CloudWatch can do that too.
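One possible reading of the point about the context object and wiggle room is to stop polling while there is still time left in the invocation, then close the consumer explicitly. The sketch below assumes a consumer with kafka-python's poll()/commit()/close() shape; drain_until_deadline, process, and the 10-second margin are illustrative choices, not the author's actual code.

```python
def drain_until_deadline(consumer, context, process, safety_margin_ms=10_000):
    """Poll until the remaining Lambda time drops below the margin, then close."""
    processed = 0
    try:
        # get_remaining_time_in_millis() is provided by the Lambda context object.
        while context.get_remaining_time_in_millis() > safety_margin_ms:
            batches = consumer.poll(timeout_ms=1000, max_records=100)
            for records in batches.values():
                for record in records:
                    process(record.value)
                    processed += 1
            if batches:
                consumer.commit()  # commit each batch once it is processed
    finally:
        # Always close, so buffered work is flushed before the function exits.
        consumer.close()
    return processed
```

The margin should be larger than the poll timeout plus the time needed to commit and close, otherwise the invocation can still be cut off mid-batch.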
Here is an AWS article on scheduled Lambdas.
Given that your Kafka installation will be running in a VPC, best practice is to configure your Lambda to run within the VPC as well; this will simplify the security group configuration for the EC2 instances running Kafka.
Here is the AWS blog article on configuring Lambdas to run in a VPC.
There is a community-provided Kafka Connector for AWS Lambda. This solution would require you to run the connector somewhere such as EC2 or ECS.