When I first posted this question I had strong coupling between my web service and application controller where the controller needed to open multiple threads to the service and
This sounds like a perfect use case for Windows Workflow Foundation. You can easily create a workflow to get information from each supplier, then merge the results when ready. It's much cleaner, and WF will do all the async stuff for you.
The best way to achieve this in your scenario, with your technology, would be to pass some kind of token between your web app / library and your web service, and have the controller run a thread that checks whether there are new results. Note, however, that you will need to get the complete data back from your WS each time, because its merge can remove items that were in the initial response.
Alternatively, I still think it would be better to handle the threads from the controller, using WCF web services.
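To make the token idea concrete, here is a minimal, language-neutral sketch in Python (not WCF code). Every name in it (`TokenSearchService`, `start_search`, `poll`, `fetch_all`) is hypothetical, and the "keep the cheapest offer per product" merge rule is only an assumed example of a merge that can drop earlier items. Because of that, `poll` always returns the full merged snapshot rather than a delta.

```python
import threading
import time
import uuid


class TokenSearchService:
    """Hypothetical stand-in for the web service; not a real WCF API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}  # token -> {"best": {product: price}, "pending": int}

    def start_search(self, suppliers):
        """Start one worker per supplier and return a token immediately."""
        token = str(uuid.uuid4())
        with self._lock:
            self._jobs[token] = {"best": {}, "pending": len(suppliers)}
        for _name, delay, items in suppliers:
            threading.Thread(
                target=self._query, args=(token, delay, items), daemon=True
            ).start()
        return token

    def _query(self, token, delay, items):
        time.sleep(delay)  # simulated supplier latency
        with self._lock:
            job = self._jobs[token]
            # Assumed merge rule: keep only the cheapest offer per product,
            # so an offer returned earlier can disappear from the result.
            for product, price in items:
                best = job["best"].get(product)
                if best is None or price < best:
                    job["best"][product] = price
            job["pending"] -= 1

    def poll(self, token):
        """Return the complete merged snapshot and a completion flag."""
        with self._lock:
            job = self._jobs[token]
            return sorted(job["best"].items()), job["pending"] == 0


def fetch_all(service, suppliers, interval=0.01):
    """Controller side: poll with the token until every supplier answered."""
    token = service.start_search(suppliers)
    while True:
        results, done = service.poll(token)
        if done:
            return results
        time.sleep(interval)
```

In a real controller the loop in `fetch_all` would refresh the page on every poll instead of waiting for the final snapshot.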
Not sure if this solution fits your particular task, but anyway:
When your application asks for the first page, return it as soon as it's ready and put the whole bunch of collected/merged data to cache so when the next page is required you may use what is already prepared.
But note that you won't get the most up-to-date data, so configure the cache reload interval carefully.
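A rough sketch of that caching idea, in Python for brevity rather than ASP.NET. `MergedResultCache`, its `loader` argument, and the `ttl_seconds` reload interval are all hypothetical names chosen for illustration; in ASP.NET you would more likely rely on the framework's own cache.

```python
import time


class MergedResultCache:
    """Caches the whole merged result and serves pages from it."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # callable that collects and merges all data
        self._ttl = ttl_seconds    # reload interval: the staleness trade-off
        self._clock = clock        # injectable so the expiry can be tested
        self._data = None
        self._loaded_at = None

    def _is_fresh(self):
        return (
            self._loaded_at is not None
            and self._clock() - self._loaded_at < self._ttl
        )

    def page(self, number, size):
        """Return page `number` (0-based), reloading only when the cache expired."""
        if not self._is_fresh():
            self._data = self._loader()
            self._loaded_at = self._clock()
        start = number * size
        return self._data[start:start + size]
```

A short interval keeps the data fresher but hits the suppliers more often; a long one does the opposite.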
One of the ways to achieve this is by invoking your WS asynchronously (http://www.stardeveloper.com/articles/display.html?article=2001121901&page=1, http://www.ondotnet.com/pub/a/dotnet/2005/08/01/async_webservices.html), and then updating the GUI in the callback.
However, you could have timeout problems if querying the data takes too long. For example, if one of the suppliers' web sites is down or very slow, the whole query could fail. Maybe it would be better if your business logic on the client side did the merging instead of the WS.
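As an illustration of client-side merging with a time budget, here is a small Python sketch (not the ASMX/WCF async API from the linked articles). The function name `query_suppliers` and the shape of the `suppliers` argument are assumptions. A slow or failing supplier is reported separately instead of failing the whole query.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def query_suppliers(suppliers, timeout):
    """Query all suppliers in parallel and merge whatever arrives in time.

    `suppliers` maps a supplier name to a zero-argument callable that
    returns a list of items (a hypothetical stand-in for a service call).
    Returns (merged_items, names_of_suppliers_that_failed_or_timed_out).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(suppliers)))
    futures = {pool.submit(fetch): name for name, fetch in suppliers.items()}

    # Wait for at most `timeout` seconds for the whole batch.
    done, _not_done = wait(futures, timeout=timeout)

    merged, failed = [], []
    for future, name in futures.items():
        if future in done and future.exception() is None:
            merged.extend(future.result())
        else:
            failed.append(name)  # too slow, or raised an error

    # Do not block on stragglers; their results are simply discarded.
    pool.shutdown(wait=False)
    return sorted(merged), sorted(failed)
```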
I'm not so sure that duplex is needed here... IMO, a standard async call with a callback should be more than sufficient to get notification of data delivery.
What is the biggest problem? If you are talking about async etc, then usually we are talking about the time taken to get the data to the client. Is this due to sheer data volume, or to the complexity of generating the data at the server?
If it is the data volume, then I can think of a number of ways of significantly improving performance - although most of them involve using DTO objects (not DataSet/DataTable, which seemed to be implied in the question). For example, protobuf-net significantly reduces the data volume and processing required to transfer data.
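To give a feel for why a plain DTO with a compact binary encoding is smaller than a self-describing format, here is a rough Python sketch. It does not use protobuf-net or the Protocol Buffers wire format; the `OfferDto` type, its fields, and the fixed 10-byte record layout are assumptions made purely for illustration, with JSON standing in for a verbose format that repeats field names in every row.

```python
import json
import struct
from dataclasses import asdict, dataclass


@dataclass
class OfferDto:
    """Hypothetical DTO: only the fields the client actually needs."""
    supplier_id: int
    product_id: int
    price_cents: int


# Little-endian, no padding: uint16 + uint32 + int32 = 10 bytes per record.
_RECORD = struct.Struct("<HIi")


def to_verbose(offers):
    """Self-describing encoding: field names are repeated for every row."""
    return json.dumps([asdict(o) for o in offers]).encode("utf-8")


def to_compact(offers):
    """Schema-based encoding: only the values travel over the wire."""
    return b"".join(
        _RECORD.pack(o.supplier_id, o.product_id, o.price_cents)
        for o in offers
    )


def from_compact(payload):
    """Rebuild the DTOs from the packed bytes."""
    return [OfferDto(*fields) for fields in _RECORD.iter_unpack(payload)]
```

Both sides must agree on the schema up front, which is the same trade-off a contract-based serializer makes.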
OK, the solution to my problem came from WCF.
In addition to the classic request-reply operation of ASMX web services, WCF supports additional operation types such as one-way calls, duplex callbacks, and streaming.
Not too hard to guess, duplex callback was what I was looking for.
Duplex callbacks simply allow the service to call back to the client. A callback contract is defined on the server, and the client is required to provide the callback endpoint on every call. It is then up to the service to decide when, and how many times, to use the callback reference.
Only bidirectional-capable bindings support callback operations. WCF offers WSDualHttpBinding to support callbacks over HTTP. (NetNamedPipeBinding and NetTcpBinding also support callbacks, since the IPC and TCP protocols support duplex communication.)
One very important thing to note here is that duplex callbacks are a nonstandard, purely Microsoft feature. This is not a problem for my current task, as both my web service and my application run on Microsoft ASP.NET.
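The shape of the duplex callback pattern described above can be sketched in a few lines of Python. This is only an in-process illustration, not WCF: there is no binding, endpoint, or contract attribute here, and all names (`SearchCallback`, `DuplexSearchService`, `CollectingClient`) are hypothetical. The point is that the client hands over a callback object with each call and the service decides when and how often to invoke it.

```python
import threading
from typing import Callable, Dict, List, Protocol


class SearchCallback(Protocol):
    """The 'callback contract': operations the service may call on the client."""

    def on_partial_result(self, supplier: str, items: List[str]) -> None: ...

    def on_completed(self) -> None: ...


class DuplexSearchService:
    """Hypothetical service: takes the callback reference on every call."""

    def begin_search(
        self,
        suppliers: Dict[str, Callable[[], List[str]]],
        callback: SearchCallback,
    ) -> threading.Thread:
        def run() -> None:
            # The service calls back once per supplier, as results arrive...
            for name, fetch in suppliers.items():
                callback.on_partial_result(name, fetch())
            # ...and one final time to signal that nothing more will come.
            callback.on_completed()

        worker = threading.Thread(target=run)
        worker.start()
        return worker  # the caller is not blocked while suppliers are queried


class CollectingClient:
    """Client-side implementation of the callback contract."""

    def __init__(self) -> None:
        self.items: List[str] = []
        self.finished = threading.Event()

    def on_partial_result(self, supplier: str, items: List[str]) -> None:
        self.items.extend(items)  # e.g. refresh the page with the new rows

    def on_completed(self) -> None:
        self.finished.set()
```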
Programming WCF Services gave me a good jump start on WCF. At over 700 pages, it delves deep into all the WCF concepts and has a dedicated chapter on callbacks and other types of operations.
Some other good resources I found on the net are:
Windows Communication Foundation (WCF) Screencasts
MSDN Webcast: Windows Communication Foundation Top to Bottom
Web Service Software Factory
The Service Factory for WCF