Currently I'm developing a spatial data processing server. Here are requirements:
Just listening in on the conversation really, but am I right to think that MSMQ will actually help with the concurrency problem by buffering messages, so the server reading from the queue will never get flooded? That would change the problem on the component that is processing the messages from 'event based' concurrency (like on a web server) to a much simpler pull mechanism.
If you're still in a greenfield design stage you might also want to look at CCR & DSS, these could also help with the concurrency. Very impressive stuff, but then again if you only need to store the messages in a DB it's probably not going to help you much.
Thanks for your answer and for the links. I'm looking at MSMQ as a reliable transport for a huge number of messages: 1-2 million messages per day. It is also very important to guarantee that those messages won't be lost on the server side while the server is running. So I think I need durability, recoverable messaging, disconnected clients, etc.
As for load balancing: the main load will be on the data parsing and data publishing services. The idea is to push raw messages into one queue and have, for example, 3 instances of the data parsing service on different PCs parse messages from it and store them in a single database. Same with the data publishing service. Or do it like in StockTrader: cache channels to nodes and use WCF to send messages to them.
Also, I can't find info about the performance and scalability of the MSMQ WCF bindings. I tried StockTrader, but it is a bit overcomplicated for test purposes and quick analysis.
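To put the 1-2 million messages per day figure in perspective, here is a quick back-of-the-envelope calculation. It assumes traffic is spread evenly over 24 hours, which is almost certainly not true in practice, so real peaks will be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(messages_per_day):
    """Average messages per second, assuming a perfectly even spread over the day."""
    return messages_per_day / SECONDS_PER_DAY


for volume in (1_000_000, 2_000_000):
    print(f"{volume:>9,} msg/day ~ {average_rate(volume):.1f} msg/s on average")
```

That works out to roughly 12 to 23 messages per second on average, so the sizing question is mostly about peak bursts and per-message processing cost rather than the daily total.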
Thanks again for answers.
I am not really sure what specific question you have; it seems like more of a general design question, but as I love MSMQ I'll chime in here.
MSMQ does have some drawbacks, specifically around load balancing transactional messages, but on the whole it's pretty awesome.
However, none of your requirements mentioned any specific reason to use MSMQ (durability, recoverable messaging, disconnected clients, etc.), so I am assuming you have some of these requirements but that they are not explicitly called out.
Requirement #1 should be easy to meet/beat, especially if these are small messages and there is no apparent logic being performed on them (e.g. just vanilla inserts/updates). MSMQ also handles competing consumers very well.
Requirement #2: unless you're using transactional messaging with MSMQ, it's not impossible to load balance MSMQ to enable scaling, but it has some caveats. How are you load balancing MSMQ? See How to Load-Balancing MSMQ: A Brief Discussion for some details if you don't already have them.
Barring potential snafus with load-balancing MSMQ, none of which are insurmountable, there is nothing wrong with this approach.
Scaling MSMQ
MSMQ scales very well vertically (same machine) and moderately horizontally (many machines). However, it is difficult to make MSMQ truly highly available, which may or may not be a concern. See the links already in this answer for thoughts on making it highly available.
Scaling Vertically
When scaling MSMQ vertically, there are many instances of the queue reader(s) running on a single machine, reading from a single queue. MSMQ handles this very well. All of our queue data is in a temporal store on the local machine.
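The "many readers, one queue" shape can be illustrated with a small simulation. This is only a sketch of the competing-consumers pattern using Python's in-process `queue.Queue` and threads; it does not touch MSMQ at all, and `run_competing_consumers` is a made-up name for illustration.

```python
import queue
import threading


def run_competing_consumers(messages, reader_count=3):
    """Drain one shared queue with several readers.

    Returns {reader_id: [messages that reader handled]}.
    """
    work = queue.Queue()
    for message in messages:
        work.put(message)

    handled = {reader_id: [] for reader_id in range(reader_count)}

    def reader(reader_id):
        # Each reader pulls at its own pace; the queue hands every
        # message to exactly one reader.
        while True:
            try:
                message = work.get_nowait()
            except queue.Empty:
                return  # queue drained, reader stops
            handled[reader_id].append(message)  # stand-in for parse + store

    threads = [threading.Thread(target=reader, args=(i,)) for i in range(reader_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


if __name__ == "__main__":
    result = run_competing_consumers(list(range(20)))
    for reader_id, items in result.items():
        print(f"reader {reader_id} handled {len(items)} messages")
```

How the messages split between readers varies from run to run; the point is that no message is handled twice and adding readers needs no coordination beyond the queue itself.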
What happens if we lose the machine hosting the queue?
Clients can still send, and messages will stack up in each client's outgoing queue, but we can't receive them until the server comes back up.
What happens to the messages in the queue?
Without the introduction of some sort of highly available backed disk subsystem, they are likely gone. Even so, getting another queue 'hooked up' to that data file can be a challenge. In theory, you can get them back. In practice, it's likely easier to resend the message from the edge system.
Depending on transaction volumes, the queue may be empty the majority of the time so the risk of data loss needs to be weighed with the effort/cost of making it highly available.
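The outgoing-queue behaviour described above (clients keep sending while the server is down, and messages are delivered once it returns) is a store-and-forward pattern. MSMQ does this for you; the sketch below only mimics the idea in plain Python so the behaviour is easy to see. `StoreAndForwardClient` and `FakeServer` are hypothetical names, not MSMQ classes.

```python
from collections import deque


class FakeServer:
    """Stand-in for the machine hosting the queue."""

    def __init__(self):
        self.up = True
        self.received = []

    def deliver(self, message):
        if not self.up:
            raise ConnectionError("server down")
        self.received.append(message)


class StoreAndForwardClient:
    """Buffers messages locally and forwards them in order when possible."""

    def __init__(self, deliver):
        self._deliver = deliver  # callable(message); raises ConnectionError when unreachable
        self.outgoing = deque()  # local outgoing queue

    def send(self, message):
        # Sending always succeeds locally; delivery is attempted afterwards.
        self.outgoing.append(message)
        self.flush()

    def flush(self):
        """Deliver buffered messages in order; stop at the first failure.

        Returns the number of messages delivered by this call.
        """
        delivered = 0
        while self.outgoing:
            try:
                self._deliver(self.outgoing[0])
            except ConnectionError:
                break  # keep the message and retry on a later flush
            self.outgoing.popleft()
            delivered += 1
        return delivered
```

Note that this buffer lives in memory, so unlike a recoverable MSMQ message it would not survive a client restart; it only shows the ordering and retry behaviour.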
Scaling Horizontally
When scaling MSMQ horizontally, there is an instance of a queue on each processing machine. Each machine may have [n] readers for that queue running on the machine receiving messages from the queue. All of our queue data is in temporal stores on several machines.
You can use any of the methods described in the documentation for load-balancing MSMQ; however, my experience has almost always been with application load balancing, described there as software load-balancing: the client has a list of available MSMQ endpoints to deliver to. Each server hosting a queue is subject to the same availability problems as the single queue.
You might also want to check out Nine Tips to Enterprise-proof MSMQ for some additional MSMQ-related information.
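As a rough illustration of the application (software) load-balancing idea, here is a minimal sketch in which the client rotates through a list of endpoints and skips any that fail. Everything in it is an assumption made for illustration: `RoundRobinSender`, the endpoint labels, and the `send` callable are hypothetical, and no MSMQ API is used. In particular, a real MSMQ send to an unreachable machine typically succeeds locally and waits in the outgoing queue, so a real client would need some other health signal than an immediate send failure.

```python
import itertools


class RoundRobinSender:
    """Client-side load balancing over a fixed list of queue endpoints."""

    def __init__(self, endpoints, send):
        self._endpoints = list(endpoints)
        self._send = send  # callable(endpoint, message); raises ConnectionError on failure
        self._next_index = itertools.cycle(range(len(self._endpoints)))

    def send(self, message):
        """Send to the next endpoint in rotation, skipping unreachable ones.

        Returns the endpoint that accepted the message.
        """
        for _ in range(len(self._endpoints)):
            endpoint = self._endpoints[next(self._next_index)]
            try:
                self._send(endpoint, message)
                return endpoint
            except ConnectionError:
                continue  # try the next endpoint in the list
        raise ConnectionError("no queue endpoint reachable")


if __name__ == "__main__":
    down = {"node-b"}  # pretend this machine is offline

    def fake_send(endpoint, message):
        if endpoint in down:
            raise ConnectionError(endpoint)

    sender = RoundRobinSender(["node-a", "node-b", "node-c"], fake_send)
    for i in range(4):
        print(i, "->", sender.send(f"msg-{i}"))
```

The rotation spreads load across machines, but it does nothing for messages already sitting on a machine that goes down, which is the availability caveat mentioned above.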