I am creating a streaming analytics application in which each analytics function will be implemented as a microservice so that it can be reused in different projects later.
I am using Lagom to create the microservices. I am new to Lagom, which is why I have some doubts.
I don't understand what the best approach is for posting my stream of data (coming from multiple sensors) to a microservice, which then publishes the data to a Kafka topic.
Is Lagom's streamed-message feature in the service description, i.e. ServiceCall[Source[String, NotUsed], Source[String, NotUsed]], the right approach for posting streams of data (big data) from hundreds of WiFi sensors? Is it capable of dealing with this amount of data arriving in near real time (~5 sec)?
Secondly, when publishing data to Kafka topics, why do I have to implement a Persistent Entity (as recommended by Lagom)? Kafka itself guarantees at-least-once delivery of a message.
My application is not a CRUD application; it only needs to process streaming data.
Lagom's streaming calls use WebSockets. They are built on Play's WebSocket support, which can scale to millions of connected clients. I wouldn't call hundreds of WiFi sensors a huge amount of data; Lagom should easily handle it. Lagom can also easily be scaled horizontally, so if the processing you're doing is heavy, you can spread that processing across many nodes.
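For reference, a minimal sketch of such a streaming call in Lagom's Scala API might look like the following (the service and method names here are made up for illustration):

    import akka.NotUsed
    import akka.stream.scaladsl.Source
    import com.lightbend.lagom.scaladsl.api.{Descriptor, Service, ServiceCall}

    trait SensorIngestService extends Service {

      // Streamed request and streamed response; Lagom serves this call over a WebSocket.
      def ingest: ServiceCall[Source[String, NotUsed], Source[String, NotUsed]]

      override final def descriptor: Descriptor = {
        import Service._
        named("sensor-ingest").withCalls(
          namedCall("ingest", ingest)
        )
      }
    }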
Publishing an incoming WebSocket stream to Kafka is not currently supported in Lagom. While Kafka guarantees at-least-once delivery once a message has been published to Kafka, there is no such guarantee when getting that message into Kafka in the first place. For example, if you perform a side effect, such as updating a database, and then publish a message, there is no guarantee that the message will eventually be published to Kafka if the application crashes between the database update and the publish (in fact it won't be; that message will be lost). This is why Lagom encourages publishing only database event streams to Kafka: publishing the event log in that way guarantees that any database operation that also needs to be sent to Kafka happens at least once. However, if you're not doing side effects, which it sounds like you're not, then this may not be relevant to you. In that case I would recommend using akka-streams-kafka (what Lagom's Kafka support is built on) directly.
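As a rough illustration, a minimal sketch of wiring an incoming stream into Kafka with akka-streams-kafka could look like this; the topic name, bootstrap server address, and method name are assumptions for illustration, not part of Lagom's API:

    import akka.{Done, NotUsed}
    import akka.actor.ActorSystem
    import akka.kafka.ProducerSettings
    import akka.kafka.scaladsl.Producer
    import akka.stream.Materializer
    import akka.stream.scaladsl.Source
    import org.apache.kafka.clients.producer.ProducerRecord
    import org.apache.kafka.common.serialization.StringSerializer
    import scala.concurrent.Future

    // Pipe each sensor reading straight into a Kafka topic, without a persistent entity in between.
    def publishToKafka(readings: Source[String, NotUsed])
                      (implicit system: ActorSystem, mat: Materializer): Future[Done] = {
      val producerSettings =
        ProducerSettings(system, new StringSerializer, new StringSerializer)
          .withBootstrapServers("localhost:9092") // assumed broker address

      readings
        .map(reading => new ProducerRecord[String, String]("sensor-readings", reading)) // assumed topic
        .runWith(Producer.plainSink(producerSettings))
    }

Note that this gives you no delivery guarantee for messages that are in flight when the service crashes, which is exactly the trade-off described above.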
I've raised an issue referencing your use case here.
Source: https://stackoverflow.com/questions/43255302/best-approach-to-ingest-streaming-data-in-lagom-microservice