How to merge multiple observables with order preservation and maximum concurrency?

Submitted by 旧街凉风 on 2021-02-05 06:39:25

Question


I searched for a duplicate and didn't find any. What I have is a nested observable IObservable<IObservable<T>>, and I want to flatten it to an IObservable<T>. I don't want to use the Concat operator because it delays the subscription to each inner observable until the previous one completes. This is a problem because the inner observables are cold, and I want them to start emitting T values immediately after they are emitted by the outer observable. I also don't want to use the Merge operator because it messes up the order of the emitted values. The marble diagram below shows the problematic (for my case) behavior of the Merge operator, as well as the desirable merging behavior.

Stream of observables: +--1---2---3--|
Observable-1         :    +-A----------B-----|
Observable-2         :        +--C--------D-|
Observable-3         :            +-E--------F----|
Merge                : +----A----C--E--B--D--F----|
Desirable merging    : +----A----------BC-DE-F----|

All values emitted by the Observable-1 should precede any value emitted by the Observable-2. The same should be true with the Observable-2 and Observable-3, and so on.

What I like about the Merge operator is that it allows configuring the maximum number of concurrent subscriptions to inner observables. I would like to preserve this functionality in the custom MergeOrdered operator I am trying to implement. Here is my under-construction method:

public static IObservable<T> MergeOrdered<T>(
    this IObservable<IObservable<T>> source,
    int maximumConcurrency = Int32.MaxValue)
{
    return source.Merge(maximumConcurrency); // How to make it ordered?
}

And here is a usage example:

var source = Observable
    .Interval(TimeSpan.FromMilliseconds(300))
    .Take(4)
    .Select(x => Observable
        .Interval(TimeSpan.FromMilliseconds(200))
        .Select(y => $"{x + 1}-{(char)(65 + y)}")
        .Take(3));

var results = await source.MergeOrdered(2).ToArray();
Console.WriteLine($"Results: {String.Join(", ", results)}");

Output (undesirable):

Results: 1-A, 1-B, 2-A, 1-C, 2-B, 3-A, 2-C, 3-B, 4-A, 3-C, 4-B, 4-C

The desirable output is:

Results: 1-A, 1-B, 1-C, 2-A, 2-B, 2-C, 3-A, 3-B, 3-C, 4-A, 4-B, 4-C

Clarification: Regarding the ordering of the values, the values themselves are irrelevant. What matters is the order of the inner sequence they originated from, and their position in that sequence. All values from the first inner sequence should be emitted first (in their original order), then all the values from the second inner sequence, then all the values from the third, etc.


Answer 1:


There's no way for this observable to know if the last value of any of the inner observables will be the first value that should be produced.

As an example, you could have this:

Stream of observables: +--1---2---3--|
Observable-1         :    +------------B--------A-|
Observable-2         :        +--C--------D-|
Observable-3         :            +-E--------F-|
Desirable merging    : +------------------------ABCDEF|

In this case, I'd do this:

IObservable<char> query =
    sources
        .ToObservable()
        .Merge()
        .ToArray()
        .SelectMany(xs => xs.OrderBy(x => x));
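
For anyone who wants to run the query above, here is a minimal sketch. The `sources` array and its contents are made up for illustration (the answer does not define them), and the inner sequences are simple synchronous ones. It shows that this approach emits nothing until every inner sequence has completed, and that it orders by value rather than by originating sequence, which is different from the ordering described in the question's clarification.

```csharp
using System;
using System.Linq;
using System.Reactive.Linq;

// Hypothetical input: two inner sequences, the first one emitting B before A.
var sources = new[]
{
    new[] { 'B', 'A' }.ToObservable(),
    new[] { 'C', 'D' }.ToObservable(),
};

IObservable<char> query =
    sources
        .ToObservable()
        .Merge()                                // subscribe to all inner sequences
        .ToArray()                              // buffer everything until completion
        .SelectMany(xs => xs.OrderBy(x => x));  // re-emit sorted by value

// Block until the query completes and collect what it emitted.
char[] results = query.ToArray().Wait();
Console.WriteLine(string.Join(", ", results)); // A, B, C, D
```

Ordering by originating sequence would have given B, A, C, D for this input.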



Answer 2:


I figured out a solution to this problem, by warming (publishing) the inner sequences in a controllable manner. This solution uses the Replay operator for controlling the temperature, and a SemaphoreSlim for controlling the concurrency. The final Concat operator ensures that the values of each inner sequence will be emitted in the desirable order (sequentially).

/// <summary>
/// Merges elements from all inner observable sequences into a single observable
/// sequence, preserving the order of the elements based on the order of their
/// originated sequence, limiting the number of concurrent subscriptions to inner
/// sequences.
/// </summary>
public static IObservable<T> MergeOrdered<T>(
    this IObservable<IObservable<T>> source,
    int maximumConcurrency = Int32.MaxValue)
{
    return Observable.Defer(() =>
    {
        var semaphore = new SemaphoreSlim(maximumConcurrency);
        return source.Select(inner =>
        {
            var published = inner.Replay();
            _ = semaphore.WaitAsync().ContinueWith(_ => published.Connect(),
                TaskScheduler.Default);
            return published.Finally(() => semaphore.Release());
        })
        .Concat();
    });
}
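
To see in isolation what "warming" a cold sequence with Replay means, here is a small sketch. It is not part of the operator above; the `log` list and the use of Observable.Range as the cold source are only for illustration. Connect subscribes to the cold source immediately, the connectable observable buffers whatever is produced, and a subscriber that arrives later still receives every value in order.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

var log = new List<string>();

// A cold sequence: nothing is produced until something subscribes to it.
var cold = Observable.Range(1, 3).Do(x => log.Add($"produced {x}"));

// Replay() returns a connectable observable that buffers all notifications.
var published = cold.Replay();

// Connect() subscribes to the cold source now, before any downstream
// subscriber exists. Range runs synchronously here, so it also completes now.
var connection = published.Connect();

// A late subscriber gets the buffered values replayed in their original order.
published.Subscribe(x => log.Add($"received {x}"));

Console.WriteLine(string.Join(", ", log));
// produced 1, produced 2, produced 3, received 1, received 2, received 3

connection.Dispose();
```

In MergeOrdered the late subscriber is the Concat operator, which reaches each inner sequence only after the previous one has completed.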

The Defer operator is used so that each subscription gets its own SemaphoreSlim. Sharing the same SemaphoreSlim between multiple subscriptions could be problematic.
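
The effect of Defer on per-subscription state can be illustrated with a plain counter standing in for the semaphore. This is only a sketch; the counter variables and the Range source are hypothetical. Without Defer, both subscriptions mutate the same captured variable. With Defer, the factory runs once per subscription and each one gets a fresh variable.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// State created once and captured by every subscription.
var sharedCounter = 0;
IObservable<int> shared = Observable.Range(1, 3).Select(_ => ++sharedCounter);

// State created inside Defer: the factory runs again for each subscription.
IObservable<int> deferred = Observable.Defer(() =>
{
    var counter = 0;
    return Observable.Range(1, 3).Select(_ => ++counter);
});

var sharedResults = new List<int>();
shared.Subscribe(x => sharedResults.Add(x));
shared.Subscribe(x => sharedResults.Add(x));

var deferredResults = new List<int>();
deferred.Subscribe(x => deferredResults.Add(x));
deferred.Subscribe(x => deferredResults.Add(x));

Console.WriteLine(string.Join(", ", sharedResults));   // 1, 2, 3, 4, 5, 6
Console.WriteLine(string.Join(", ", deferredResults)); // 1, 2, 3, 1, 2, 3
```

With a shared SemaphoreSlim the analogous effect would be that slots taken by one subscription reduce the concurrency available to another.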

This is not a perfect solution, because there is no reason for the inner sequence currently subscribed by the Concat to be published. Optimizing this inefficiency is not trivial though, so I'll leave it as is.



Source: https://stackoverflow.com/questions/64841312/how-to-merge-multiple-observables-with-order-preservation-and-maximum-concurrenc
