I'm building a straightforward processing pipeline where an item is fetched as input, operated on by multiple processors in sequence, and finally emitted as output.
Merge provides an overload which takes a max concurrency. Its signature looks like:

IObservable<T> Merge<T>(this IObservable<IObservable<T>> source, int maxConcurrency);
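Outside Rx, the same idea, flattening a stream of inner sequences while capping how many are in flight at once, can be sketched with Python's asyncio (a minimal analogue for illustration; `merge` here is a made-up helper, not an Rx API):

```python
import asyncio

async def merge(coroutines, max_concurrency):
    """Await all coroutines, but run at most max_concurrency at a time,
    analogous to Merge(n) subscribing to at most n inner observables."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(coro):
        async with semaphore:
            return await coro

    # gather preserves the input order of results
    return await asyncio.gather(*(bounded(c) for c in coroutines))

async def work(i):
    await asyncio.sleep(0.01)
    return i * 2

results = asyncio.run(merge([work(i) for i in range(5)], max_concurrency=3))
print(results)  # [0, 2, 4, 6, 8]
```

The semaphore plays the role of Merge's internal bookkeeping: a new inner task is only started when one of the three slots frees up.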
Here is what it would look like with your example (I refactored some of the other code as well, which you can take or leave):
return Observable
    // The reactive While loop also takes care of OnCompleted for you
    .While(() => _provider.HasNext,
           Observable.FromAsync(_provider.GetNextAsync))
    // Defer returns observables that will only execute after subscription
    .Select(item => Observable.Defer(() => {
        return _processors.Aggregate(
            seed: Observable.Return(item),
            func: (current, processor) => current.SelectMany(processor.ProcessAsync));
    }))
    // Only allow 3 streams to execute in parallel
    .Merge(3);
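The Aggregate call above folds the list of processors into one sequential chain. Stripped of the Rx machinery, that fold is just a loop, sketched here in Python with hypothetical async processors:

```python
import asyncio

# Hypothetical async processors; each transforms the item in turn
async def add_one(x):
    return x + 1

async def double(x):
    return x * 2

processors = [add_one, double]

async def process(item):
    # The Aggregate(seed, (current, p) => current.SelectMany(p.ProcessAsync))
    # fold collapses to this loop: each processor only runs once the
    # previous one has completed
    result = item
    for processor in processors:
        result = await processor(result)
    return result

print(asyncio.run(process(3)))  # (3 + 1) * 2 = 8
```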
To break down what this does:

- While will check on each iteration whether _provider.HasNext is true; if so, it resubscribes to get the next value from _provider, otherwise it emits OnCompleted.
- Defer holds off the processing chain until subscription; the resulting IObservable<IObservable<T>> is passed to Merge, which subscribes to a max of 3 observables simultaneously.

Alternative 1
If you also need to control the number of parallel requests, you need to get a little trickier, since you will need to signal that your Observable is ready for new values:
return Observable.Create<T>(observer =>
{
    var subject = new Subject<Unit>();
    var disposable = new CompositeDisposable(subject);
    disposable.Add(subject
        // This will complete when the provider has run out of values
        .TakeWhile(_ => _provider.HasNext)
        .SelectMany(
            _ => _provider.GetNextAsync(),
            (_, item) =>
            {
                return _processors
                    .Aggregate(
                        seed: Observable.Return(item),
                        func: (current, processor) => current.SelectMany(processor.ProcessAsync))
                    // Could also use `Finally` here; this signals the chain
                    // to start on the next item.
                    .Do(dontCare => {}, () => subject.OnNext(Unit.Default));
            })
        .Merge(3)
        .Subscribe(observer));

    // Queue up 3 requests for the initial kickoff
    disposable.Add(Observable.Repeat(Unit.Default, 3).Subscribe(subject.OnNext));
    return disposable;
});
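The feedback loop here (seed 3 tokens, push a new token into the subject as each item finishes) behaves like a worker pool that pulls the next item whenever a slot frees up. A rough Python analogue, with a made-up in-memory provider standing in for _provider:

```python
import asyncio

items = list(range(6))  # stand-in for _provider's pending values

async def get_next():
    # Pop synchronously before yielding; asyncio is cooperative, so no
    # other worker can run between the emptiness check and this pop
    item = items.pop(0)
    await asyncio.sleep(0)  # simulate the async fetch
    return item

async def process(item):
    await asyncio.sleep(0.01)
    return item * 10

async def worker(results):
    # Each worker is one "token": finishing an item immediately pulls the
    # next one, mirroring subject.OnNext(Unit.Default) in the Rx version
    while items:
        item = await get_next()
        results.append(await process(item))

async def main():
    results = []
    # Seed 3 workers, like Observable.Repeat(Unit.Default, 3)
    await asyncio.gather(*(worker(results) for _ in range(3)))
    return results

out = asyncio.run(main())
print(sorted(out))  # [0, 10, 20, 30, 40, 50]
```

Completion order is nondeterministic, just as Merge's output order is, which is why the result is sorted before printing.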
You might need to rearrange the code you posted, but this would be one way to do it:
var eventLoopScheduler = new EventLoopScheduler();

(from semaphore in Observable.Return(new Semaphore(2, 2))
 from input in GetInputObs()
 from getAccess in Observable.Start(() => semaphore.WaitOne(), eventLoopScheduler)
 from output in ProcessInputOnPipeline(input)
     .SubscribeOn(Scheduler.Default)
     .Finally(() => semaphore.Release())
 select output)
.Subscribe(x => Console.WriteLine(x), ex => {});
I've modelled your pipeline as one Observable (which in reality would be composed of several smaller observables chained together).
The key thing is to make sure the semaphore gets released no matter how the pipeline terminates (Empty/Error), otherwise the stream might hang, so Finally() is used to call Release() on the semaphore. (It might also be worth adding a Timeout on the pipeline observable if it is liable to never OnComplete()/OnError().)
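The invariant being protected, release the semaphore on every exit path, is the same one try/finally gives you in ordinary imperative code. A minimal sketch with Python's threading.Semaphore (the pipeline body is a stand-in, for illustration only):

```python
import threading

semaphore = threading.Semaphore(2)  # at most 2 pipelines in flight

def run_pipeline(item):
    semaphore.acquire()
    try:
        if item < 0:
            raise ValueError("pipeline error")  # simulated OnError
        return item + 1  # simulated pipeline result
    finally:
        # Mirrors .Finally(() => semaphore.Release()): the slot is freed
        # whether the pipeline completes or errors
        semaphore.release()

print(run_pipeline(1))  # 2
```

If the release were placed after the return instead of in finally, an error would permanently consume a slot, which is exactly the hang described above.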
Edit: As per the comments below, I've added some scheduling around the semaphore access so that we don't block whoever is pushing these inputs into our stream. I've used an EventLoopScheduler so that all requests for semaphore access will queue up and execute on one thread.
Edit: I do prefer Paul's answer though - simple, less scheduling, less synchronisation (merge uses a queue internally).