Question
Recently I stumbled upon an interesting statement by Enigmativity about the Publish and RefCount operators:
You're using the dangerous .Publish().RefCount() operator pair which creates a sequence that can't be subscribed to after it completes.
This statement seems to oppose Lee Campbell's assessment about these operators. Quoting from his book Intro to Rx:
The Publish/RefCount pair is extremely useful for taking a cold observable and sharing it as a hot observable sequence for subsequent observers.
Initially I didn't believe that Enigmativity's statement was correct, so I tried to refute it. My experiments revealed that Publish().RefCount() can indeed be inconsistent. Subscribing a second time to a published sequence may or may not cause a new subscription to the source sequence, depending on whether the source sequence completed while connected. If it completed, it won't be resubscribed. If it did not complete, it will be resubscribed. Here is a demonstration of this behavior:
var observable = Observable
    .Create<int>(o =>
    {
        o.OnNext(13);
        o.OnCompleted(); // Commenting this line alters the observed behavior
        return Disposable.Empty;
    })
    .Do(x => Console.WriteLine($"Producer generated: {x}"))
    .Finally(() => Console.WriteLine($"Producer finished"))
    .Publish()
    .RefCount()
    .Do(x => Console.WriteLine($"Consumer received #{x}"))
    .Finally(() => Console.WriteLine($"Consumer finished"));

observable.Subscribe().Dispose();
observable.Subscribe().Dispose();
In this example the observable is composed of three parts. First is the producing part that generates a single value and then completes. Then follows the publishing mechanism (Publish+RefCount). And finally comes the consuming part that observes the values emitted by the producer. The observable is subscribed twice. The expected behavior would be that each subscription receives one value. But this is not what happens! Here is the output:
Producer generated: 13
Consumer received #13
Producer finished
Consumer finished
Consumer finished
And here is the output if we comment out the o.OnCompleted(); line. This subtle change results in the behavior that is expected and desirable:
Producer generated: 13
Consumer received #13
Producer finished
Consumer finished
Producer generated: 13
Consumer received #13
Producer finished
Consumer finished
In the first case the cold producer (the part before the Publish().RefCount()) was subscribed only once. The first consumer received the emitted value, but the second consumer received nothing (except for an OnCompleted notification). In the second case the producer was subscribed twice. Each time it generated a value, and each consumer got one value.
My question is: how can we fix this? How can we modify either the Publish operator, or the RefCount, or both, in order to make them always behave consistently and desirably? Below are the specifications of the desirable behavior:
- The published sequence should propagate to its subscribers all notifications coming directly from the source sequence, and nothing else.
- The published sequence should subscribe to the source sequence when its current number of subscribers increases from zero to one.
- The published sequence should stay connected to the source as long as it has at least one subscriber.
- The published sequence should unsubscribe from the source when its current number of subscribers becomes zero.
I am asking for either a custom PublishRefCount operator that offers the functionality described above, or for a way to achieve the desirable functionality using the built-in operators.
Btw a similar question exists, that asks why this happens. My question is about how to fix it.
Update: In retrospect, the above specification results in unstable behavior that makes race conditions unavoidable. There is no guarantee that two subscriptions to the published sequence will result in a single subscription to the source sequence. The source sequence may complete between the two subscriptions, causing the unsubscription of the first subscriber, which causes the RefCount operator to unsubscribe, which in turn causes a new subscription to the source for the next subscriber. The behavior of the built-in .Publish().RefCount() prevents this from happening.
The moral of the story is that the .Publish().RefCount() sequence is not broken, but it's not reusable. It cannot be used reliably for multiple connect/disconnect sessions. If you want a second session, you should create a new .Publish().RefCount() sequence.
Answer 1:
Lee does a good job explaining IConnectableObservable, but Publish isn't explained that well. It's a pretty simple animal, just hard to explain. I'll assume you understand IConnectableObservable.
If we were to re-implement the zero-param Publish function simply and lazily, it would look something like this:
// For illustrative purposes only: don't use this code
public class PublishObservable<T> : IConnectableObservable<T>
{
    private readonly IObservable<T> _source;
    private readonly Subject<T> _proxy = new Subject<T>();
    private IDisposable _connection;

    public PublishObservable(IObservable<T> source)
    {
        _source = source;
    }

    public IDisposable Connect()
    {
        // Subscribe the proxy to the source only if not already connected.
        if (_connection == null)
            _connection = _source.Subscribe(_proxy);

        // Return the wrapper, so that disposing it also resets _connection
        // and a later Connect() call can subscribe to the source again.
        var disposable = Disposable.Create(() =>
        {
            _connection?.Dispose();
            _connection = null;
        });
        return disposable;
    }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        // Observers always subscribe to the proxy, never to the source directly.
        var subscription = _proxy.Subscribe(observer);
        return subscription;
    }
}

public static class X
{
    public static IConnectableObservable<T> Publish<T>(this IObservable<T> source)
    {
        return new PublishObservable<T>(source);
    }
}
Publish creates a single proxy Subject which subscribes to the source observable. The proxy can subscribe/unsubscribe to source based on the connection: call Connect, and the proxy subscribes to source. Call Dispose on the connection disposable, and the proxy unsubscribes from source. The important thing to take away from this is that there is a single Subject that proxies any connection to the source. You're not guaranteed only one subscription to source, but you are guaranteed one proxy and one concurrent connection. You can have multiple subscriptions via connecting/disconnecting.
RefCount handles the calling Connect part of things. Here's a simple re-implementation:
// For illustrative purposes only: don't use this code
public class RefCountObservable<T> : IObservable<T>
{
    private readonly IConnectableObservable<T> _source;
    private IDisposable _connection;
    private int _refCount = 0;

    public RefCountObservable(IConnectableObservable<T> source)
    {
        _source = source;
    }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        var subscription = _source.Subscribe(observer);
        var disposable = Disposable.Create(() =>
        {
            subscription.Dispose();
            DecrementCount();
        });
        if (++_refCount == 1)
            _connection = _source.Connect();
        return disposable;
    }

    private void DecrementCount()
    {
        if (--_refCount == 0)
            _connection.Dispose();
    }
}

public static class X
{
    public static IObservable<T> RefCount<T>(this IConnectableObservable<T> source)
    {
        return new RefCountObservable<T>(source);
    }
}
A bit more code, but still pretty simple: call Connect on the ConnectableObservable if the ref count goes up to 1, disconnect if it goes down to 0.
Put the two together, and you get a pair that guarantees that there will only be one concurrent subscription to a source observable, proxied through one persistent Subject. The Subject will only be subscribed to the source while there is at least one downstream subscription.
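To make the sharing guarantee above concrete, here is a minimal sketch that counts how often the source is subscribed. It assumes the System.Reactive NuGet package; the names SharedSubscriptionDemo, upstream and sourceSubscriptions are hypothetical and chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class SharedSubscriptionDemo
{
    public static (List<int> A, List<int> B, List<int> C, int AfterTwo, int AfterThird) Run()
    {
        var upstream = new Subject<int>();   // hot source that never completes here
        var sourceSubscriptions = 0;

        // Defer lets us count every subscription that reaches the source.
        var source = Observable.Defer(() =>
        {
            sourceSubscriptions++;
            return upstream.AsObservable();
        });

        var shared = source.Publish().RefCount();
        var a = new List<int>();
        var b = new List<int>();
        var c = new List<int>();

        var s1 = shared.Subscribe(a.Add);    // ref count 0 -> 1: connects to source
        var s2 = shared.Subscribe(b.Add);    // ref count 1 -> 2: reuses the connection
        var afterTwo = sourceSubscriptions;
        upstream.OnNext(1);

        s1.Dispose();
        s2.Dispose();                        // ref count 1 -> 0: disconnects

        var s3 = shared.Subscribe(c.Add);    // ref count 0 -> 1: connects again
        var afterThird = sourceSubscriptions;
        upstream.OnNext(2);
        s3.Dispose();

        return (a, b, c, afterTwo, afterThird);
    }
}
```

Because the source never completes in this sketch, the proxy Subject stays usable, so the third subscriber triggers a fresh subscription to the source.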
Given that introduction, there are a lot of misconceptions in your question, so I'll go over them one by one:
... Publish().RefCount() can be indeed inconsistent. Subscribing a second time to a published sequence can cause a new subscription to the source sequence, or not, depending on whether the source sequence was completed while connected. If it was completed, then it won't be resubscribed. If it was not completed, then it will be resubscribed.
.Publish().RefCount() will subscribe anew to source under one condition only: when it goes from zero subscribers to 1. If the count of subscribers goes from 0 to 1 to 0 to 1 for any reason, then you will end up re-subscribing. The source observable completing will cause RefCount to issue an OnCompleted, and all of its observers unsubscribe. So subsequent subscriptions to RefCount will trigger an attempt to resubscribe to source. Naturally, if source is observing the observable contract properly, it will issue an OnCompleted immediately and that will be that.
[see sample observable with OnCompleted...] The observable is subscribed twice. The expected behavior would be that each subscription will receive one value.
No. The expected behavior is that the proxy Subject, after issuing an OnCompleted, will re-emit an OnCompleted to any subsequent subscription attempt. Since your source observable completes synchronously at the end of your first subscription, the second subscription will be attempting to subscribe to a Subject that has already issued an OnCompleted. This should result in an OnCompleted, otherwise the Observable contract would be broken.
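The Subject behavior described above can be seen in isolation with a small sketch. It assumes the System.Reactive NuGet package; the class name CompletedSubjectDemo and the log strings are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Subjects;

public static class CompletedSubjectDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var subject = new Subject<int>();

        // The first observer is subscribed while the subject is still live.
        subject.Subscribe(
            x => log.Add($"first: {x}"),
            () => log.Add("first: completed"));

        subject.OnNext(13);
        subject.OnCompleted();

        // The second observer arrives after completion: it gets no value,
        // only the remembered OnCompleted notification.
        subject.Subscribe(
            x => log.Add($"second: {x}"),
            () => log.Add("second: completed"));

        return log;
    }
}
```

The late observer sees only the completion, which mirrors what the second consumer in the question experiences behind Publish().RefCount().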
[see sample observable without OnCompleted as second case...] In the first case the cold producer (the part before the Publish().RefCount()) was subscribed only once. The first consumer received the emitted value, but the second consumer received nothing (except from an OnCompleted notification). In the second case the producer was subscribed twice. Each time it generated a value, and each consumer got one value.
This is correct. Since the proxy Subject never completed, subsequent re-subscriptions to source will result in the cold observable re-running.
My question is: how can we fix this? [..]
- The published sequence should propagate to its subscribers all notifications coming directly from the source sequence, and nothing else.
- The published sequence should subscribe to the source sequence when its current number of subscribers increases from zero to one.
- The published sequence should stay connected to the source as long as it has at least one subscriber.
- The published sequence should unsubscribe from the source when its current number of subscribers become zero.
All of the above currently happens with .Publish and .RefCount, as long as you don't complete/error. I don't suggest implementing an operator that changes that, breaking the Observable contract.
EDIT:
I would argue the #1 source of confusion with Rx is Hot/Cold observables. Since Publish can 'warm-up' cold observables, it's no surprise that it should lead to confusing edge cases.
First, a word on the observable contract. The Observable contract, stated more succinctly, is that an OnNext can never follow an OnCompleted/OnError, and there should be only one OnCompleted or OnError notification. This does leave the edge case of attempts to subscribe to terminated observables:
Attempts to subscribe to terminated observables result in receiving the termination message immediately. Does this break the contract? Perhaps, but it's the only contract cheat, to my knowledge, in the library. The alternative is a subscription to dead air. That doesn't help anybody.
How does this tie into hot/cold observables? Unfortunately, confusingly. A subscription to an ice-cold observable triggers a re-construction of the entire observable pipeline. This means that the subscribe-to-already-terminated rule only applies to hot observables. Cold observables always start anew.
Consider this code, where o is a cold observable:
var o = Observable.Interval(TimeSpan.FromMilliseconds(100))
.Take(5);
var s1 = o.Subscribe(i => Console.WriteLine(i.ToString()));
await Task.Delay(TimeSpan.FromMilliseconds(600));
var s2 = o.Subscribe(i => Console.WriteLine(i.ToString()));
For the purposes of the contract, the observable behind s1 and the observable behind s2 are entirely different. So even though there's a delay between them, and you'll end up seeing an OnNext after an OnCompleted, that's not a problem, because they are entirely different observables.
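A timing-free way to see that each subscription to a cold observable gets its own independent run is sketched below. It assumes the System.Reactive NuGet package; ColdObservableDemo and the log strings are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;

public static class ColdObservableDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var runs = 0;

        // The delegate passed to Create executes once per subscription.
        var cold = Observable.Create<int>(o =>
        {
            var run = ++runs;
            log.Add($"run {run} started");
            o.OnNext(run * 100);
            o.OnCompleted();
            return Disposable.Empty;
        });

        cold.Subscribe(x => log.Add($"s1 got {x}"), () => log.Add("s1 completed"));
        cold.Subscribe(x => log.Add($"s2 got {x}"), () => log.Add("s2 completed"));
        return log;
    }
}
```

In the combined log, a value for s2 appears after the completion of s1, yet no contract is violated, because the two notifications belong to separate runs of the producer.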
Where it gets sticky is with a warmed-up Publish version. If you were to add .Publish().RefCount() to the end of o in the code above...
- Without changing anything else, s2 would terminate immediately, printing nothing.
- Change the delay to 400 or so, and s2 would print the last two numbers.
- Change s1 to only .Take(2), and s2 would start over again, printing 0 through 4.
Making this nastiness worse is the Schrödinger's cat effect: if you set up an observer on o to watch what would happen the whole time, that changes the ref count, affecting the functionality! Watching it changes the behavior. Debugging nightmare.
This is the hazard of attempting to 'warm-up' cold observables. It just doesn't work well, especially with Publish/RefCount.
My advice would be:
- Don't try to warm up cold observables.
- If you need to share a subscription, with either cold or hot observables, stick with @Enigmativity's general rule of strictly using the selector Publish version.
- If you must, have a dummy subscription on a Publish/RefCount observable. This at least provides a consistent ref count >= 1, reducing the quantum activity effect.
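As a rough illustration of the selector overload of Publish mentioned in the advice above, the sketch below uses the shared sequence twice inside the selector while the source is subscribed only once per outer subscription. It assumes the System.Reactive NuGet package; PublishSelectorDemo, combined and the Zip-based combination are hypothetical choices made only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;

public static class PublishSelectorDemo
{
    public static (List<int> First, List<int> Second, int SourceSubscriptions) Run()
    {
        var sourceSubscriptions = 0;
        var source = Observable.Create<int>(o =>
        {
            sourceSubscriptions++;
            o.OnNext(1);
            o.OnNext(2);
            o.OnNext(3);
            o.OnCompleted();
            return Disposable.Empty;
        });

        // "shared" is consumed twice, but both uses ride on one source subscription.
        // The sharing scope begins and ends with each subscription to "combined".
        var combined = source.Publish(shared =>
            shared.Zip(shared.Select(x => x * 10), (a, b) => a + b));

        var first = new List<int>();
        var second = new List<int>();
        combined.Subscribe(first.Add);
        combined.Subscribe(second.Add);

        return (first, second, sourceSubscriptions);
    }
}
```

Since every subscription to the combined sequence gets its own internal subject and connection, a completed source cannot leak state into a later subscription, which is why this form avoids the inconsistency discussed above.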
Answer 2:
As Shlomo pointed out, this problem is associated with the Publish operator. The RefCount works fine. So it's the Publish that needs fixing. The Publish is nothing more than calling the Multicast operator with a standard Subject<T> as argument. Here is its source code:
public IConnectableObservable<TSource> Publish<TSource>(IObservable<TSource> source)
{
    return source.Multicast(new Subject<TSource>());
}
So the Publish operator inherits the behavior of the Subject class. This class, for very good reasons, maintains the state of its completion. So if you signal its completion by calling subject.OnCompleted(), any future subscribers of the subject will instantly receive an OnCompleted notification. This feature serves a standalone subject and its subscribers well, but becomes a problematic artifact when a Subject is used as an intermediate propagator between a source sequence and the subscribers of that sequence. That's because the source sequence already maintains its own state, and duplicating this state inside the subject introduces the risk of the two states getting out of sync. This is exactly what happens when the Publish is combined with the RefCount operator. The subject remembers that the source has completed, while the source, being a cold sequence, has lost its memory of its previous life and is willing to start a new life afresh.
So the solution is to feed the Multicast operator with a stateless subject. Unfortunately I can't find a way to compose it based on the built-in Subject<T> (inheritance is not an option because the class is sealed). Fortunately implementing it from scratch is not very difficult. The implementation below uses an ImmutableArray as storage for the subject's observers, and uses interlocked operations to ensure its thread-safety (much like the built-in Subject<T> implementation).
// Requires: System, System.Collections.Immutable, System.Reactive.Disposables,
// System.Reactive.Subjects, System.Threading
public class StatelessSubject<T> : ISubject<T>
{
    private IImmutableList<IObserver<T>> _observers
        = ImmutableArray<IObserver<T>>.Empty;

    public void OnNext(T value)
    {
        // Iterate over an immutable snapshot of the current observers.
        foreach (var observer in Volatile.Read(ref _observers))
            observer.OnNext(value);
    }

    public void OnError(Exception error)
    {
        foreach (var observer in Volatile.Read(ref _observers))
            observer.OnError(error);
    }

    public void OnCompleted()
    {
        // No completion state is stored: later subscribers are unaffected.
        foreach (var observer in Volatile.Read(ref _observers))
            observer.OnCompleted();
    }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        ImmutableInterlocked.Update(ref _observers, x => x.Add(observer));
        return Disposable.Create(() =>
        {
            ImmutableInterlocked.Update(ref _observers, x => x.Remove(observer));
        });
    }
}
Now the Publish().RefCount() can be fixed by replacing it with this:
.Multicast(new StatelessSubject<SomeType>()).RefCount()
This change results in the desirable behavior. The published sequence is initially cold, becomes hot when it is subscribed for the first time, and becomes cold again when its last subscriber unsubscribes. And the cycle continues with no memory of past events.
Regarding the other normal case, where the source sequence completes, the completion is propagated to all subscribers, causing all of them to unsubscribe automatically, which causes the published sequence to become cold. The end result is that both sequences, the source and the published, are always in sync. They are either both hot, or both cold.
Here is a StatelessPublish operator, to make the consumption of the class a little easier.
/// <summary>
/// Returns a connectable observable sequence that shares a single subscription to
/// the underlying sequence, without maintaining its state.
/// </summary>
public static IConnectableObservable<TSource> StatelessPublish<TSource>(
    this IObservable<TSource> source)
{
    return source.Multicast(new StatelessSubject<TSource>());
}
Usage example:
.StatelessPublish().RefCount()
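To tie this back to the question, here is a sketch that replays the question's completing producer through a stateless subject and records what each consumer sees. It assumes the System.Reactive and System.Collections.Immutable packages. The block repeats a compact copy of the StatelessSubject<T> class above only so that it is self-contained; StatelessPublishDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading;

// Compact copy of the answer's StatelessSubject<T>, repeated for self-containment.
public sealed class StatelessSubject<T> : ISubject<T>
{
    private IImmutableList<IObserver<T>> _observers = ImmutableArray<IObserver<T>>.Empty;

    public void OnNext(T value)
    {
        foreach (var o in Volatile.Read(ref _observers)) o.OnNext(value);
    }

    public void OnError(Exception error)
    {
        foreach (var o in Volatile.Read(ref _observers)) o.OnError(error);
    }

    public void OnCompleted()
    {
        foreach (var o in Volatile.Read(ref _observers)) o.OnCompleted();
    }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        ImmutableInterlocked.Update(ref _observers, x => x.Add(observer));
        return Disposable.Create(() =>
            ImmutableInterlocked.Update(ref _observers, x => x.Remove(observer)));
    }
}

public static class StatelessPublishDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var observable = Observable
            .Create<int>(o =>
            {
                o.OnNext(13);
                o.OnCompleted();   // the producer completes, as in the question
                return Disposable.Empty;
            })
            .Do(x => log.Add($"Producer generated: {x}"))
            .Multicast(new StatelessSubject<int>())
            .RefCount()
            .Do(x => log.Add($"Consumer received #{x}"));

        observable.Subscribe().Dispose();
        observable.Subscribe().Dispose();
        return log;
    }
}
```

If the behavior described in this answer holds for the Rx version in use, both subscriptions should cause the producer to run, so the log should contain two "Consumer received #13" entries rather than one.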
Source: https://stackoverflow.com/questions/64961330/how-to-fix-the-inconsistency-of-the-publish-refcount-behavior