Currently I have a setup where my clients (web apps, an iOS app, etc.) talk to my backend API, a .NET web app (Nancy), via REST calls. Nothing special.
I now have a requirement to break this backend up into a number of smaller services (microservices) that will need to communicate with each other as well as with the clients.
To start, I would recommend breaking up your monolithic REST service into multiple micro-services, each specific to a single data entity or use case. For example: sending emails would be one use case, and querying results for a search page would be another.
Then as you roll out each individual service, you could modify your monolithic service to be a proxy to the new micro-services. This way your existing apps could keep calling your monolithic service, and the new app updates could call the micro-services appropriately.
Basically, migrate from a monolithic service to micro-services incrementally, one at a time.
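To make that concrete, here's a rough sketch of what one of those proxy routes might look like in the existing Nancy app - the micro-service address and the route are hypothetical, but the shape is what matters:

```
using System;
using System.Net.Http;
using Nancy;

public class QuestionsProxyModule : NancyModule
{
    // Hypothetical base address of the new micro-service.
    private static readonly HttpClient MicroserviceClient = new HttpClient
    {
        BaseAddress = new Uri("http://question-api.myapp.com/")
    };

    public QuestionsProxyModule()
    {
        // Same public route the clients already call; the "true" flag selects
        // Nancy's async route overload.
        Get["/questions/{id}", true] = async (args, ct) =>
        {
            // Forward the call to the micro-service and hand its JSON straight back.
            var json = await MicroserviceClient.GetStringAsync("questions/" + (string)args.id);
            return Response.AsText(json, "application/json");
        };
    }
}
```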
The only approach that comes to mind is simple REST-based calls. Totally fine, but latency is the main issue here.
Using the monolithic web service as a proxy to the micro-services will add some latency, but the individual micro-services themselves won't add any, and you can scale them out to as many hosted instances behind an Azure Load Balancer as needed.
If you do see query latency, then you could implement something like a Redis cache for any micro-services that exhibit performance problems with queries.
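A minimal sketch of that cache-aside approach with StackExchange.Redis - the connection string, key naming, expiry and the Question type are all placeholders:

```
using System;
using System.Threading.Tasks;
using Newtonsoft.Json;
using StackExchange.Redis;

public class CachedQuestionReader
{
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("my-cache.redis.cache.windows.net:6380,password=<key>,ssl=true");

    private readonly Func<int, Task<Question>> _loadFromDatabase;

    public CachedQuestionReader(Func<int, Task<Question>> loadFromDatabase)
    {
        _loadFromDatabase = loadFromDatabase;
    }

    public async Task<Question> GetQuestionAsync(int id)
    {
        var db = Redis.GetDatabase();
        var key = "question:" + id;

        // Cache-aside: try the cache first, fall back to the database and repopulate.
        var cached = await db.StringGetAsync(key);
        if (cached.HasValue)
            return JsonConvert.DeserializeObject<Question>(cached);

        var question = await _loadFromDatabase(id);
        await db.StringSetAsync(key, JsonConvert.SerializeObject(question), TimeSpan.FromMinutes(5));
        return question;
    }
}

public class Question
{
    public int Id { get; set; }
    public string Title { get; set; }
}
```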
If you do see write latency that becomes a problem, then I would recommend moving those problem writes to use some kind of Messaging Queue to allow for them to be handled asynchronously if possible.
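The producing side of such a queue is only a few lines - here sketched with an Azure Storage queue (queue name, payload and connection string are made up for the example); a WebJob or worker role would drain it in the background:

```
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;
using Newtonsoft.Json;

public static class EmailQueue
{
    public static void EnqueueSendEmail(string to, string subject, string body)
    {
        // The connection string would normally come from configuration.
        var account = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=<name>;AccountKey=<key>");
        var queue = account.CreateCloudQueueClient().GetQueueReference("send-email");
        queue.CreateIfNotExists();

        // The web request returns immediately; a background worker handles the send later.
        var message = JsonConvert.SerializeObject(new { To = to, Subject = subject, Body = body });
        queue.AddMessage(new CloudQueueMessage(message));
    }
}
```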
Is there anything in Azure suited to this?
You may want to look into Azure API Apps, as they may offer some nice functionality for securing / authenticating your micro-services via Azure more easily than securing them individually.
What are the different ways I could communicate between my main API and the other microservice APIs?
It sounds like you need real-time communication between services, so a message queue is out of the question unless you have a task that can be performed in the background. You could use simple REST calls between services, or even something like SOAP if implemented with WCF. Since it sounds like you're currently using REST, though, it makes sense to keep it the same.
Also, make sure your micro-services aren't communicating with each other through a database, as that can get messy real quick. Use a simple REST endpoint for them to communicate, similar to the REST endpoints used by the client apps.
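If you stay with REST between services, the calling side can be as simple as a shared HttpClient and a DTO or two - something like this (the URL and types are invented for illustration):

```
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;

public class SearchServiceClient
{
    private static readonly HttpClient Client = new HttpClient
    {
        BaseAddress = new Uri("http://search-api.myapp.com/")
    };

    public async Task<SearchResult[]> SearchAsync(string term)
    {
        // A plain REST call, exactly like the ones the client apps already make.
        var json = await Client.GetStringAsync("search?q=" + Uri.EscapeDataString(term));
        return JsonConvert.DeserializeObject<SearchResult[]>(json);
    }
}

public class SearchResult
{
    public int Id { get; set; }
    public string Title { get; set; }
}
```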
I like to follow the KISS (Keep It Simple Stupid) principle, so I would recommend you try not to over architect the solution and just keep it as simple as possible.
Apache Thrift was designed for efficient communication between services. It was developed by Facebook (and is now open source), and an implementation is available for C#.
Google developed the highly efficient Protocol Buffers serialization mechanism. Protocol Buffers does not include RPC (remote procedure calls) but can be used to send data over a variety of transport mechanisms. It is also available for C# as protobuf-net (a project by SO's own Marc Gravell).
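Marking up a DTO for protobuf-net and serializing it looks roughly like this (the Question type is just an example):

```
using System.IO;
using ProtoBuf;

[ProtoContract]
public class Question
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Title { get; set; }
}

public static class QuestionSerializer
{
    // Serialize to a compact binary payload that can go over HTTP, a queue, etc.
    public static byte[] ToBytes(Question q)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, q);
            return ms.ToArray();
        }
    }

    public static Question FromBytes(byte[] payload)
    {
        using (var ms = new MemoryStream(payload))
        {
            return Serializer.Deserialize<Question>(ms);
        }
    }
}
```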
Is there anything in Azure suited to this?
There is another solution in Azure for the microservice approach, one that I think will get a lot of traction, called Service Fabric.
The concern here is that you have an existing application, and it might be a lot harder to adapt it to work with Service Fabric, but it is certainly worth a look.
Here you can find a Service Fabric example of a web service.
Hope this helps!
Best of luck!
First - clarify your distinctions between "real-time", "synchronous/asynchronous" and "one-way/two-way". The things you rule out (queues and pub/sub) can certainly be used for two-way request/response, but they are asynchronous.
Second - clarify "efficiency" - efficient by what metric? Bandwidth? Latency? Development time? Client support?
Third - realize that (one of) the costs of microservices is latency. If that's an issue for you at your first integration, you're likely in for a long road.
What are the different ways I could communicate between my main API and the other microservice APIs? Pros/cons of each approach?
Off the top of my head:

- plain HTTP/REST calls between the services
- RPC-style calls (WCF/SOAP, Apache Thrift, protocol buffers over some transport)
- message queues
- pub/sub topics
- a shared database (generally a bad idea)
You'll note that this is the same list we'd use when tying multiple applications together... because that's what you're doing. Just because you made the applications smaller doesn't really change much, except that it makes your system even more distributed. Expect to solve all the same problems "normal" distributed systems have, plus a few extra ones related to deployment and versioning.
Consider an idempotent GET request from a user like "Get me question 1". That client expects a JSON response of question 1. Simple. In my expected architecture, the client would hit api.myapp.com, which would then proxy a call via REST to question-api.myapp.com (microservice) to get the data, then return to user. How could we use pub/sub here? Who is the publisher, who is the subscriber? There's no event here to raise. My understanding of queues: one publisher, one consumer. Pub/sub topic: one publisher, many consumers. Who is who here?
Ok - so first, if we're talking about microservices and latency, we're going to need a more representative example. Let's say our client is the Netflix mobile app, and to display the opening screen it needs five separate pieces of information.
Each piece is provided by a different microservice (we'll call them M1-M5). Each call from client -> datacenter has 100ms expected latency; calls between services have 20ms latency.
Let's compare some approaches:
1. Client -> monolithic service: a single client -> datacenter call, so ~100ms. As expected, that's the lowest latency option - but it requires everything in a monolithic service, which we've decided we don't want because of operational concerns.
2. Client -> each microservice, sequentially: 5 calls x 100ms = 500ms. Using a proxy w/this isn't going to help - it'll just add 20ms latency to each request (making it 600ms). We have a dependency between M1 + M2 and M4, and between M3 and M5, but we can do some of the calls asynchronously. Let's see how that helps.
3. Client -> each microservice, async: the client fires M1, M2 and M3 in parallel (100ms), then M4 and M5 once their dependencies have returned (another 100ms). We're down to 200ms; not bad - but our client needs to know about our microservice architecture. If we abstract that with our proxy, then we have:
4. Client -> proxy -> microservices, async: one client -> proxy call (100ms), plus two batches of parallel intra-service calls (20ms + 20ms). Down to 140ms, since we're leveraging the decreased intra-service latency.
Great - when things are working smoothly, we've only increased latency by 40% compared to monolithic (#1).
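The aggregation inside the proxy for #4 is essentially Task.WhenAll with the dependency ordering respected - a sketch, with the actual service calls stubbed out:

```
using System.Threading.Tasks;

public class OpeningScreenAggregator
{
    // Builds the opening-screen payload by fanning out to M1-M5 with the
    // dependency ordering described above (M4 needs M1 + M2, M5 needs M3).
    public async Task<string[]> BuildAsync(string userId)
    {
        // First batch: M1, M2 and M3 have no dependencies, so fire them in parallel (~20ms).
        var m1 = CallServiceAsync("M1", userId);
        var m2 = CallServiceAsync("M2", userId);
        var m3 = CallServiceAsync("M3", userId);
        await Task.WhenAll(m1, m2, m3);

        // Second batch: M4 and M5 only start once their inputs are available (~20ms more).
        var m4 = CallServiceAsync("M4", m1.Result + m2.Result);
        var m5 = CallServiceAsync("M5", m3.Result);
        await Task.WhenAll(m4, m5);

        // Total inside the proxy: ~40ms, plus the 100ms client -> proxy hop = ~140ms.
        return new[] { m1.Result, m2.Result, m3.Result, m4.Result, m5.Result };
    }

    // Stand-in for a 20ms intra-service HTTP call; in reality this would use HttpClient.
    private Task<string> CallServiceAsync(string service, string input)
    {
        return Task.FromResult(service + " result for [" + input + "]");
    }
}
```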
But, as with any distributed system, we also have to worry about when things aren't going smoothly.
What happens when M4's latency increases to 200ms? Well, in the client -> async microservice route (#3), then we have partial page results in 100ms (the first batch of requests), unavailable in 200ms and summaries in 400ms. In the proxy case (#4), we have nothing until 340ms. Similar considerations if a microservice is completely unavailable.
Queues are a way of abstracting producer/consumer in space and time. Let's see what happens if we introduce one:
5. Client -> pub/sub -> microservices: the client publishes a single request; M1, M2 and M3 subscribe to it and publish their results; M4 and M5 subscribe to those results and publish theirs; anything destined for the client goes to a response channel, P2, which the client subscribes to.
Our client, who is subscribed to P2, receives partial results w/a single request and is abstracted away from the workflow between M1 + M2 and M4, and M3 and M5. Our latency in the best case is 140ms, the same as #4, and in the worst case it is similar to the direct client route (#3) w/partial results.
We have a much more complicated internal routing system involved, but have gained flexibility w/microservices while minimizing the inevitable latency. Our client code is also more complex - since it has to deal with partial results - but is similar to the async microservice route. Our microservices are generally independent of each other - they can be scaled independently, and there is no central coordinating authority (like in the proxy case). We can add new services as needed by simply subscribing to the appropriate channels, and having the client know what to do with the response we generate (if we generate one for client consumption of course).
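If you wanted to sketch #5 on Azure Service Bus topics, the client/gateway side might look roughly like this - the topic names, subscription name and payloads are all invented for illustration:

```
using System;
using Microsoft.ServiceBus.Messaging;

public class OpeningScreenRequester
{
    private readonly TopicClient _requests;
    private readonly SubscriptionClient _responses;

    public OpeningScreenRequester(string connectionString)
    {
        // "requests" is the channel the micro-services subscribe to; "responses"
        // (P2 above) is where anything meant for the client/gateway gets published.
        _requests = TopicClient.CreateFromConnectionString(connectionString, "requests");
        _responses = SubscriptionClient.CreateFromConnectionString(connectionString, "responses", "gateway");
    }

    public void RequestOpeningScreen(string userId)
    {
        // One message out; M1-M3 react to it, and M4/M5 react to their results.
        _requests.Send(new BrokeredMessage(userId) { CorrelationId = userId });
    }

    public void ListenForPartialResults()
    {
        // Partial results arrive as they're produced; the message pump auto-completes
        // each message once this callback returns.
        _responses.OnMessage(msg =>
        {
            Console.WriteLine("Partial result: " + msg.GetBody<string>());
        });
    }
}
```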
You could do a variation of using a gateway to aggregate responses, while still using queues internally. It would look a lot like #4 externally, but #5 internally. The addition of a queue (and yes, I've been using queue, pub/sub, topic, etc. interchangeably) still decouples the gateway from the individual microservices, but abstracts out the partial result problem from the client (along w/its benefits, though).
The addition of a gateway, though, does allow you to handle the partial result problem centrally - useful if it's complex, ever changing, and/or reimplemented across multiple platforms.
For instance, let's say that, in the event that M4 (the summary service) is unavailable, we have an M4b that operates on cached data (so, e.g., the star rating is out of date). M4b can answer the requests M4 would normally handle (R4a and R4b) immediately, and our Gateway can then determine whether it should wait for M4 to answer or just go w/M4b, based on a timeout.
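A sketch of that timeout decision on the gateway side, with the M4/M4b calls stubbed out:

```
using System;
using System.Threading.Tasks;

public class SummaryWithFallback
{
    // Ask the live summary service (M4) and the cached one (M4b) at the same time;
    // if M4 hasn't answered within the timeout, go with M4b's (possibly stale) answer.
    public async Task<string> GetSummaryAsync(int titleId, TimeSpan timeout)
    {
        var live = CallM4Async(titleId);
        var cached = CallM4bAsync(titleId);

        var winner = await Task.WhenAny(live, Task.Delay(timeout));
        if (winner == live)
            return live.Result;   // M4 answered in time - use the fresh data.

        return await cached;      // fall back to the cached summary from M4b.
    }

    // Stand-ins for the real service calls.
    private async Task<string> CallM4Async(int titleId)
    {
        await Task.Delay(200);    // pretend M4 is slow today
        return "fresh summary for " + titleId;
    }

    private Task<string> CallM4bAsync(int titleId)
    {
        return Task.FromResult("cached summary for " + titleId);  // stale star rating, etc.
    }
}
```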
For further info on how Netflix actually solved this problem, take a look at the following resources: