Do 'Intermediate IObservables' without final subscribers get kept in memory for the lifetime of the root IObservable?

Question

For example, consider this:

    public IDisposable Subscribe<T>(IObserver<T> observer)
    {
        return eventStream.Where(e => e is T).Cast<T>().Subscribe(observer);
    }

The eventStream is a long-lived source of events. A short-lived client will use this method to subscribe for some period of time, and then unsubscribe by calling Dispose on the returned IDisposable.

However, while the eventStream still exists and should be kept in memory, two new IObservables have been created by this method - the one returned by the Where() method, which is presumably held in memory by the eventStream, and the one returned by the Cast<T>() method, which is presumably held in memory by the one returned by the Where() method.

How will these 'intermediate IObservables' (is there a better name for them?) get cleaned up? Or will they now exist for the lifetime of the eventStream, even though they no longer have subscriptions, no one references them except their source IObservable, and they will therefore never have subscriptions again?

If they are cleaned up by informing their parent they no longer have subscriptions, how do they know nothing else has taken a reference to them and may at some point later subscribe to them?

Answer 1:

"However, while the eventStream still exists and should be kept in memory, there has been 2 new IObservables created by this method - the one returned by the Where() method that is presumably held in memory by the eventStream, and the one returned by the Cast<T>() method that is presumably held in memory by the one returned by the Where() method."

You have this backward. Let's walk through the chain of what is going on.

IObservable<object> eventStream; //you have this defined and assigned somewhere (element type assumed to be object here)

public IDisposable Subscribe<T>(IObserver<T> observer)
{
    //let's break this method into multiple lines

    IObservable<object> whereObs = eventStream.Where(e => e is T);
    //whereObs now has a reference to eventStream (and thus will keep it alive), 
    //but eventStream knows nothing of whereObs (thus whereObs will not be kept alive by eventStream)
    IObservable<T> castObs = whereObs.Cast<T>();
    //as with whereObs, castObs has a reference to whereObs,
    //but no one has a reference to castObs
    IDisposable ret = castObs.Subscribe(observer);
    //here is where it gets tricky.
    return ret;
}

What ret does or does not have a reference to depends on the implementation of the various observables. From what I have seen in Reflector in the Rx library and the operators I have written myself, most operators do not return disposables that have a reference to the operator observable itself.

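For example, a basic implementation of Where would be something like this (typed directly in the editor, no error handling):

//extension methods must be declared inside a static class
static IObservable<T> Where<T>(this IObservable<T> source, Func<T, bool> filter)
{
    return Observable.Create<T>(obs =>
      {
         //forward only the values that pass the filter; pass errors and completion straight through
         return source.Subscribe(v => { if (filter(v)) obs.OnNext(v); },
                                 obs.OnError, obs.OnCompleted);
      });
}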

Notice that the disposable returned will have a reference to the filter function via the observer that is created, but will not have a reference to the Where observable. Cast can be easily implemented using the same pattern. In essence, the operators become observer wrapper factories.

The implication of all this to the question at hand is that the intermediate IObservables are eligible for garbage collection by the end of the method. The filter function passed to Where stays around as long as the subscription does, but once the subscription is disposed or completed, only eventStream remains (assuming it is still alive).
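The direction of the references can be checked with a small self-contained sketch that does not use Rx at all. Everything named here (EventSource, Subscription, FilterObserver, CollectingObserver, ReferenceDirectionDemo) is hypothetical and exists only for illustration; EventSource stands in for eventStream and is assumed to be a simple single-threaded hot source. The point it tries to show is that the source only ever holds the wrapper observer, and only until the subscription is disposed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for eventStream: a hot source that tracks its observers.
sealed class EventSource<T> : IObservable<T>
{
    private readonly List<IObserver<T>> _observers = new List<IObserver<T>>();

    public int ObserverCount { get { return _observers.Count; } }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        _observers.Add(observer);
        // The source references the observer only until this token is disposed.
        return new Subscription(() => _observers.Remove(observer));
    }

    public void Publish(T value)
    {
        foreach (var observer in _observers.ToArray()) observer.OnNext(value);
    }
}

// Runs a cleanup action once when disposed.
sealed class Subscription : IDisposable
{
    private readonly Action _cleanup;
    private bool _disposed;

    public Subscription(Action cleanup) { _cleanup = cleanup; }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _cleanup();
    }
}

// Filtering wrapper: references the downstream observer and the predicate only.
sealed class FilterObserver<T> : IObserver<T>
{
    private readonly IObserver<T> _downstream;
    private readonly Func<T, bool> _predicate;

    public FilterObserver(IObserver<T> downstream, Func<T, bool> predicate)
    {
        _downstream = downstream;
        _predicate = predicate;
    }

    public void OnNext(T value) { if (_predicate(value)) _downstream.OnNext(value); }
    public void OnError(Exception error) { _downstream.OnError(error); }
    public void OnCompleted() { _downstream.OnCompleted(); }
}

// Collects values so the demo can inspect them.
sealed class CollectingObserver<T> : IObserver<T>
{
    public readonly List<T> Values = new List<T>();
    public void OnNext(T value) { Values.Add(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

static class ReferenceDirectionDemo
{
    // Returns { observers while subscribed, observers after Dispose, values received }.
    public static int[] Run()
    {
        var source = new EventSource<int>();
        var sink = new CollectingObserver<int>();

        // Only the wrapper observer is handed to the source; there is no
        // intermediate observable object for the source to hold on to.
        IDisposable subscription = source.Subscribe(new FilterObserver<int>(sink, x => x % 2 == 0));
        source.Publish(1);
        source.Publish(2);
        int whileSubscribed = source.ObserverCount;

        subscription.Dispose();
        source.Publish(4); // not delivered: the wrapper has been removed
        int afterDispose = source.ObserverCount;

        return new[] { whileSubscribed, afterDispose, sink.Values.Count };
    }
}
```

Under these assumptions, the source holds one observer while the subscription is live and none after Dispose, which matches the claim that only eventStream remains once the subscription ends.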

EDIT: In response to supercat's comment, let's look at how the compiler might rewrite this, or how you would implement it without closures.

class WhereObserver<T> : IObserver<T>
{
    public WhereObserver(IObserver<T> inner, Func<T, bool> filter)
    {
        _inner = inner;
        _filter = filter;
    }

    //the observer being wrapped ("base" is a reserved word in C#, so it is named inner here)
    readonly IObserver<T> _inner;
    readonly Func<T, bool> _filter;

    public void OnNext(T value)
    {
        if (_filter(value)) _inner.OnNext(value);
    }

    public void OnError(Exception ex) { _inner.OnError(ex); }
    public void OnCompleted() { _inner.OnCompleted(); }
}

class WhereObservable<T> : IObservable<T>
{
    public WhereObservable(IObservable<T> source, Func<T, bool> filter)
    {
        _source = source;
        _filter = filter;
    }

    readonly IObservable<T> _source;
    readonly Func<T, bool> _filter;

    public IDisposable Subscribe(IObserver<T> observer)
    {
        //the new observer gets the filter, but no reference to this observable
        return _source.Subscribe(new WhereObserver<T>(observer, _filter));
    }
}

//extension methods must be declared inside a static class
static IObservable<T> Where<T>(this IObservable<T> source, Func<T, bool> filter)
{
    return new WhereObservable<T>(source, filter);
}

You can see that the observer does not need any reference to the observable that generated it, and the observable has no need to track the observers it creates. We didn't even make a new IDisposable to return from our Subscribe method.

In reality, Rx has some actual classes for anonymous observable/observer that take delegates and forward the interface calls to those delegates. It uses closures to create those delegates. The compiler does not need to emit classes that actually implement the interfaces, but the spirit of the translation remains the same.

Answer 2:

I think I've come to a conclusion with the help of Gideon's answer and by breaking down a sample Where method:

I assumed incorrectly that each downstream IObservable was referenced by the upstream at all times (in order to push events down when needed). But this would root downstreams in memory for the lifetime of the upstream.

In fact, each upstream IObservable is referenced by the downstream IObservable (waiting, ready to hook an IObserver when required). This roots upstreams in memory as long as the downstream is referenced (which makes sense, as while a downstream is still referenced somewhere, a subscription may occur at any time).

However, when a subscription does occur, an upstream-to-downstream reference chain does get formed, but only between the objects that manage the subscription at each observable stage (the observers and their IDisposable implementations), and only for the lifetime of that subscription. This also makes sense: while a subscription exists, each upstream 'processing logic' must still be held in memory to handle the events being passed through to reach the final subscriber IObserver.

This gives a solution to both problems - while an IObservable is referenced, it will hold all source (upstream) IObservables in memory, ready for a subscription. And while a subscription exists, it will hold all downstream subscriptions in memory, allowing the final subscription to still receive events even though its source IObservable may no longer be referenced.

Applying this to the example in my question, the Where and Cast downstream observables are very short-lived - referenced up until the Subscribe(observer) call completes. They are then free to be collected. The fact that the intermediate observables may now be collected does not cause a problem for the subscription just created, as it has formed its own subscription object chain (upstream -> downstream) that is rooted by the source eventStream observable. This chain will be released as soon as each downstream stage disposes its IDisposable subscription tracker.

Answer 3:

You need to remember that IObservable<T> sequences (like IEnumerable<T> sequences) are lazy lists. They don't exist until someone tries to access the elements by subscribing or iterating.

When you write list.Where(x => x > 0) you are not creating a new list, you are merely defining what the new list will look like if someone tries to access the elements.

This is a very important distinction.

You can consider that there are two different kinds of IObservable: one is the definition, and the other is the subscribed instances.

The IObservable definitions use next to no memory. References can be freely shared. They will be cleanly garbage collected.

The subscribed instances only exist if someone is subscribed. They may use considerable memory. Unless you use the .Publish extensions you can't share references. When the subscription ends or is terminated by calling .Dispose() the memory is cleaned up.

A new set of subscribed instances is created for every new subscription. When the final child subscription is disposed, the whole chain is disposed. They can't be shared. If there is a second subscription, a complete chain of subscribed instances is created, independent of the first.
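The distinction between a definition and its subscribed instances can be made concrete with a minimal sketch that uses only the IObservable<T> and IObserver<T> interfaces. All of the type names below (CountingSource, FilteredDefinition, FilteringObserver, ListObserver, DefinitionVsInstanceDemo) are hypothetical and are not part of Rx; the source is assumed to be a cold sequence that replays 1, 2, 3 to each subscriber.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cold source: replays 1, 2, 3 to each subscriber and counts subscriptions.
sealed class CountingSource : IObservable<int>
{
    public int SubscribeCalls { get; private set; }

    public IDisposable Subscribe(IObserver<int> observer)
    {
        SubscribeCalls++;
        for (int i = 1; i <= 3; i++) observer.OnNext(i);
        observer.OnCompleted();
        return NoopDisposable.Instance;
    }
}

sealed class NoopDisposable : IDisposable
{
    public static readonly NoopDisposable Instance = new NoopDisposable();
    public void Dispose() { }
}

// The "subscribed instance": one of these is created per subscription.
sealed class FilteringObserver : IObserver<int>
{
    private readonly IObserver<int> _downstream;
    private readonly Func<int, bool> _predicate;

    public FilteringObserver(IObserver<int> downstream, Func<int, bool> predicate)
    {
        _downstream = downstream;
        _predicate = predicate;
    }

    public void OnNext(int value) { if (_predicate(value)) _downstream.OnNext(value); }
    public void OnError(Exception error) { _downstream.OnError(error); }
    public void OnCompleted() { _downstream.OnCompleted(); }
}

// The "definition": holds a source and a predicate, and does nothing until Subscribe.
sealed class FilteredDefinition : IObservable<int>
{
    private readonly IObservable<int> _source;
    private readonly Func<int, bool> _predicate;

    public FilteredDefinition(IObservable<int> source, Func<int, bool> predicate)
    {
        _source = source;
        _predicate = predicate;
    }

    public IDisposable Subscribe(IObserver<int> observer)
    {
        // A fresh wrapper observer per subscription; nothing is shared between them.
        return _source.Subscribe(new FilteringObserver(observer, _predicate));
    }
}

sealed class ListObserver : IObserver<int>
{
    public readonly List<int> Values = new List<int>();
    public void OnNext(int value) { Values.Add(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

static class DefinitionVsInstanceDemo
{
    // Returns { source subscriptions after defining, after two subscribers,
    //           values seen by first subscriber, values seen by second subscriber }.
    public static int[] Run()
    {
        var source = new CountingSource();
        var definition = new FilteredDefinition(source, x => x > 1);
        int afterDefinition = source.SubscribeCalls; // defining the query subscribes to nothing

        var first = new ListObserver();
        var second = new ListObserver();
        definition.Subscribe(first);
        definition.Subscribe(second);

        return new[] { afterDefinition, source.SubscribeCalls, first.Values.Count, second.Values.Count };
    }
}
```

In this sketch the source is not touched when the definition is built, and it is subscribed to once per downstream subscriber, each with its own independent chain.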

I hope this helps.

Answer 4:

A class implementing IObservable is just a regular object. It will get cleaned up when the GC runs and does not see any references to it. The question is no different from "when does new object() get cleaned up?". Except for memory use, whether they get cleaned up should not be visible to your program.

Answer 5:

If an object subscribes to events, whether for its own use, or for the purpose of forwarding them to other objects, the publisher of those events will generally keep it alive even if nobody else will. If I'm understanding your situation correctly, you have objects which subscribe to events for the purpose of forwarding them to zero or more other subscribers. I would suggest that you should if possible design your intermediate IObservables so that they will not subscribe to an event from their parent until someone subscribes to an event from them, and they will unsubscribe from their parent's event any time their last subscriber unsubscribes. Whether or not this is practical will depend upon the threading contexts of the parent and child IObservables. Further note that (again depending upon threading context) locking may be required to deal with the case where a new subscriber joins at about the same time as (what would have been) the last subscriber quits. Even though most objects' subscription and unsubscription scenarios could be handled using CompareExchange rather than locking, that is often unworkable in scenarios involving interconnected subscription lists.
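One possible shape for such an intermediate is sketched below, using a plain lock rather than CompareExchange. This is only an illustration under simplifying assumptions: the names RefCountedRelay, TrackedParent, ActionDisposable, IgnoringObserver and RefCountDemo are hypothetical, the parent is assumed to allow Subscribe and Dispose from any thread, and the relay calls into the parent while holding its own lock, which a production implementation would likely want to avoid.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Runs a cleanup action at most once, even if Dispose is called concurrently.
sealed class ActionDisposable : IDisposable
{
    private Action _cleanup;
    public ActionDisposable(Action cleanup) { _cleanup = cleanup; }
    public void Dispose()
    {
        Action cleanup = Interlocked.Exchange(ref _cleanup, null);
        if (cleanup != null) cleanup();
    }
}

// Hypothetical parent that exposes how many observers it currently holds.
sealed class TrackedParent<T> : IObservable<T>
{
    private readonly List<IObserver<T>> _observers = new List<IObserver<T>>();

    public int ObserverCount { get { lock (_observers) return _observers.Count; } }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        lock (_observers) _observers.Add(observer);
        return new ActionDisposable(() => { lock (_observers) _observers.Remove(observer); });
    }
}

// Intermediate that attaches to its parent only while it has subscribers.
sealed class RefCountedRelay<T> : IObservable<T>, IObserver<T>
{
    private readonly IObservable<T> _parent;
    private readonly object _gate = new object();
    private readonly List<IObserver<T>> _children = new List<IObserver<T>>();
    private IDisposable _parentSubscription; // null while nobody is subscribed

    public RefCountedRelay(IObservable<T> parent) { _parent = parent; }

    public IDisposable Subscribe(IObserver<T> child)
    {
        lock (_gate)
        {
            _children.Add(child);
            // First subscriber: only now attach to the parent.
            if (_children.Count == 1) _parentSubscription = _parent.Subscribe(this);
        }
        return new ActionDisposable(() => Remove(child));
    }

    private void Remove(IObserver<T> child)
    {
        lock (_gate)
        {
            if (!_children.Remove(child)) return;
            if (_children.Count == 0)
            {
                // Last subscriber left: detach so the parent no longer roots this relay.
                _parentSubscription.Dispose();
                _parentSubscription = null;
            }
        }
    }

    private IObserver<T>[] Snapshot() { lock (_gate) return _children.ToArray(); }

    public void OnNext(T value) { foreach (var c in Snapshot()) c.OnNext(value); }
    public void OnError(Exception error) { foreach (var c in Snapshot()) c.OnError(error); }
    public void OnCompleted() { foreach (var c in Snapshot()) c.OnCompleted(); }
}

sealed class IgnoringObserver<T> : IObserver<T>
{
    public void OnNext(T value) { }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

static class RefCountDemo
{
    // Returns the parent's observer count at four points in time.
    public static int[] Run()
    {
        var parent = new TrackedParent<int>();
        var relay = new RefCountedRelay<int>(parent);
        int beforeAnySubscriber = parent.ObserverCount; // relay exists but is not attached

        IDisposable first = relay.Subscribe(new IgnoringObserver<int>());
        IDisposable second = relay.Subscribe(new IgnoringObserver<int>());
        int withTwoSubscribers = parent.ObserverCount;  // relay attached exactly once

        first.Dispose();
        int withOneSubscriber = parent.ObserverCount;
        second.Dispose();
        int afterLastLeaves = parent.ObserverCount;     // relay detached again

        return new[] { beforeAnySubscriber, withTwoSubscribers, withOneSubscriber, afterLastLeaves };
    }
}
```

The lock around the count check is what handles the race mentioned above, where a new subscriber joins at about the same time as the last one leaves: both paths see a consistent child list before deciding whether to attach or detach.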

If your object will receive subscriptions and unsubscriptions from its children in a threading context which is not compatible with the parent's subscription and unsubscription methods (IMHO, IObservable should have required that all legitimate implementations allow subscription and unsubscription from arbitrary threading contexts, but alas it does not), you may have no choice but to have the intermediate IObservable, immediately upon creation, create a proxy object to handle subscriptions on your behalf, and have that object subscribe to the parent's event. Then have your own object (to which the proxy would have only a weak reference) include a finalizer which will notify the proxy that it will need to unsubscribe when its parent's threading context permits. It would be nice to have your proxy object unsubscribe when its last subscriber quits, but if a new subscriber might join and expect its subscription to be valid immediately, one may have to keep the proxy subscribed as long as anyone holds a reference to the intermediate IObservable, which could be used to request a new subscription.

Source: https://stackoverflow.com/questions/9737711/do-intermediate-iobservables-without-final-subscribers-get-kept-in-memory-for

Tags

.net

system.reactive

idisposable