Resolving Circular References for Objects Implementing ISerializable

I'm writing my own IFormatter implementation and I cannot think of a way to resolve circular references between two types that both implement ISerializable.

Here's the usual pattern:

[Serializable]
class Foo : ISerializable
{
    private Bar m_bar;

    public Foo(Bar bar)
    {
        m_bar = bar;
        m_bar.Foo = this;
    }

    public Bar Bar
    {
        get { return m_bar; }
    }

    protected Foo(SerializationInfo info, StreamingContext context)
    {
        m_bar = (Bar)info.GetValue("1", typeof(Bar));
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("1", m_bar);
    }
}

[Serializable]
class Bar : ISerializable
{
    private Foo m_foo;

    public Foo Foo
    {
        get { return m_foo; }
        set { m_foo = value; }
    }

    public Bar()
    { }

    protected Bar(SerializationInfo info, StreamingContext context)
    {
        m_foo = (Foo)info.GetValue("1", typeof(Foo));
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("1", m_foo);
    }
}

I then do this:

Bar b = new Bar();
Foo f = new Foo(b);
bool equal = ReferenceEquals(b, b.Foo.Bar); // true

// Serialise and deserialise b

equal = ReferenceEquals(b, b.Foo.Bar);

If I use an out-of-the-box BinaryFormatter to serialise and deserialise b, the above test for reference-equality returns true as one would expect. But I cannot conceive of a way to achieve this in my custom IFormatter.

In a non-ISerializable situation I can simply revisit "pending" object fields using reflection once the target references have been resolved. But for objects implementing ISerializable it is not possible to inject new data using SerializationInfo.

Can anyone point me in the right direction?

Wesley Hill

This situation is the reason for the FormatterServices.GetUninitializedObject method. The general idea is that if you have objects A and B which reference each other in their SerializationInfo, you can deserialize them as follows:

(For the purposes of this explanation, (SI,SC) refers to a type's deserialization constructor, i.e. the one which takes a SerializationInfo and a StreamingContext.)

1. Pick one object to deserialize first. It shouldn't matter which you pick, as long as it is not a value type. Let's say you pick A.
2. Call GetUninitializedObject to allocate (but not initialize) an instance of A's type, because you're not yet ready to call its (SI,SC) constructor.
3. Build B in the usual way, i.e. create a SerializationInfo object (which will include the reference to the now half-deserialized A) and pass it to B's (SI,SC) constructor.
4. Now you have all the dependencies you need to initialize your allocated A object. Create its SerializationInfo object and call A's (SI,SC) constructor. You can call a constructor on an existing instance via reflection.
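The four steps can be sketched roughly as follows, assuming cut-down versions of the question's Foo and Bar. The names TwoPhaseSketch, SerializationCtor and RebuildCycle are made up for illustration, and a real formatter would fill each SerializationInfo from the stream rather than by hand. Note that newer .NET versions flag FormatterServices, FormatterConverter and the public SerializationInfo constructor as obsolete, so expect compiler warnings there.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

[Serializable]
class Foo : ISerializable
{
    private Bar m_bar;
    public Bar Bar { get { return m_bar; } }

    protected Foo(SerializationInfo info, StreamingContext context)
    {
        m_bar = (Bar)info.GetValue("1", typeof(Bar));
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("1", m_bar);
    }
}

[Serializable]
class Bar : ISerializable
{
    private Foo m_foo;
    public Foo Foo { get { return m_foo; } }

    protected Bar(SerializationInfo info, StreamingContext context)
    {
        m_foo = (Foo)info.GetValue("1", typeof(Foo));
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("1", m_foo);
    }
}

// Hypothetical helper class, not part of any library.
static class TwoPhaseSketch
{
    // Finds the (SI,SC) constructor, which is usually non-public.
    static ConstructorInfo SerializationCtor(Type type)
    {
        return type.GetConstructor(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null,
            new[] { typeof(SerializationInfo), typeof(StreamingContext) },
            null);
    }

    public static Bar RebuildCycle()
    {
        var context = new StreamingContext(StreamingContextStates.All);
        var converter = new FormatterConverter();

        // Steps 1 and 2: pick Bar as "A" and allocate it without running any constructor.
        var bar = (Bar)FormatterServices.GetUninitializedObject(typeof(Bar));

        // Step 3: build Foo ("B") normally; its info refers to the half-built Bar.
        var fooInfo = new SerializationInfo(typeof(Foo), converter);
        fooInfo.AddValue("1", bar);
        var foo = (Foo)SerializationCtor(typeof(Foo)).Invoke(new object[] { fooInfo, context });

        // Step 4: run Bar's (SI,SC) constructor on the instance that already exists.
        var barInfo = new SerializationInfo(typeof(Bar), converter);
        barInfo.AddValue("1", foo);
        SerializationCtor(typeof(Bar)).Invoke(bar, new object[] { barInfo, context });

        return bar;
    }
}
```

Calling TwoPhaseSketch.RebuildCycle() should give back a Bar for which ReferenceEquals(bar, bar.Foo.Bar) holds, because Foo captured the very instance that was later initialized in place.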

The GetUninitializedObject method is pure CLR magic - it creates an instance without ever calling a constructor to initialize that instance. It basically sets all fields to zero/null.
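A small illustration of that behaviour, using a made-up Widget class (Widget and UninitializedDemo are hypothetical names). Neither the constructor body nor the field initializer runs, so every field keeps its zero value. On newer .NET versions, where FormatterServices is flagged obsolete, RuntimeHelpers.GetUninitializedObject should be usable the same way.

```csharp
using System;
using System.Runtime.Serialization;

class Widget
{
    public int Size = 7;        // field initializer, skipped along with the constructor
    public string Name;
    public bool CtorRan;

    public Widget()
    {
        Name = "default";
        CtorRan = true;
    }
}

static class UninitializedDemo
{
    // Allocates a Widget without running its constructor or field initializers.
    public static Widget Allocate()
    {
        return (Widget)FormatterServices.GetUninitializedObject(typeof(Widget));
    }
}
```

For comparison, new Widget() yields Size == 7 and Name == "default", while UninitializedDemo.Allocate() yields Size == 0, Name == null and CtorRan == false.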

This is the reason you are cautioned not to use any of the members of a child object in a (SI,SC) constructor - a child object may be allocated but not yet initialized at that point. It is also the reason for the IDeserializationCallback interface, which gives you a chance to use your child objects after all object initialization is guaranteed to be done and before the deserialized object graph is returned.
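As a rough sketch of how a custom formatter might honour IDeserializationCallback: collect every object it creates, then fire the callbacks only after the last constructor has run. Order, CallbackDemo and RaiseCallbacks are illustrative names, not framework APIs, and passing null as the sender is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

class Order : IDeserializationCallback
{
    // Child object: may be allocated but still uninitialized while Order is being constructed.
    public List<decimal> Lines;
    public decimal Total;

    // Safe place to touch child objects: the whole graph is initialized by now.
    public void OnDeserialization(object sender)
    {
        Total = 0;
        foreach (decimal line in Lines)
        {
            Total += line;
        }
    }
}

// Hypothetical formatter-side helper.
static class CallbackDemo
{
    // Call once, after every object in the graph has been fully initialized.
    public static void RaiseCallbacks(IEnumerable<object> deserialized)
    {
        foreach (object o in deserialized)
        {
            var callback = o as IDeserializationCallback;
            if (callback != null)
            {
                callback.OnDeserialization(null);
            }
        }
    }
}
```

The point of the design is ordering: RaiseCallbacks runs as the very last step before the graph is returned, so Order never reads Lines while it is still a zeroed placeholder.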

The ObjectManager class can do all of this (and other types of fix-ups) for you. However, I've always found it to be quite under-documented given the complexity of deserialization, so I never spent the time to try to figure out how to use it properly. For step 4 it uses some more magic: internal-to-the-CLR reflection optimized to call the (SI,SC) constructor more quickly (I've timed it at about twice as fast as the public way).

Finally, there are object graphs involving cycles which are impossible to deserialize. One example is a cycle of two IObjectReference instances (I've tested BinaryFormatter on this and it throws an exception). Another, I suspect, is a cycle involving nothing but boxed value types.

You need to detect that you have used the same object more than once in your object graph, tag each object in the output, and when you come to occurrence #2 or higher, output a "reference" to the existing tag instead of writing the object again.

Pseudo-code for serialization:

for each object
    if object seen before
        output tag created for object with a special note as "tag-reference"
    else
        create and store tag for object
        output tag and object

Pseudo-code for deserialization:

while more data
    if reference-tag to existing object
        get object from storage keyed by the tag
    else
        construct instance to deserialize into
        store object in storage keyed by deserialized tag
        deserialize object

It is important that you do the last steps there in the order they're specified, so that you can correctly handle this case:

SomeObject obj = new SomeObject();
obj.ReferenceToSomeObject = obj;    // reference to itself

That is, you cannot wait until an object is completely deserialized before storing it in your tag storage, since you might need to look it up there while you are still deserializing it.
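Both pseudo-code loops can be sketched in C# with a toy token format. Everything here is an assumption made for illustration: the Node type, the "obj:N" / "ref:N" / "null" tokens, and the TagDemo and ByReference helpers. The detail to notice is that Read registers the new instance under its tag before it reads the instance's fields.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

sealed class Node
{
    public string Name;
    public Node Next;
}

// Compares by identity so that equal-looking but distinct objects get distinct tags.
sealed class ByReference : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object a, object b) { return ReferenceEquals(a, b); }
    int IEqualityComparer<object>.GetHashCode(object o) { return RuntimeHelpers.GetHashCode(o); }
}

static class TagDemo
{
    public static List<string> Write(Node root)
    {
        var output = new List<string>();
        WriteNode(root, new Dictionary<object, int>(new ByReference()), output);
        return output;
    }

    static void WriteNode(Node node, Dictionary<object, int> tags, List<string> output)
    {
        if (node == null) { output.Add("null"); return; }

        int tag;
        if (tags.TryGetValue(node, out tag))
        {
            output.Add("ref:" + tag);        // seen before: emit a tag-reference only
            return;
        }

        tag = tags.Count + 1;
        tags.Add(node, tag);                 // create and store the tag
        output.Add("obj:" + tag);            // output the tag, then the object
        output.Add(node.Name);
        WriteNode(node.Next, tags, output);
    }

    public static Node Read(List<string> input)
    {
        int position = 0;
        return ReadNode(input, ref position, new Dictionary<int, Node>());
    }

    static Node ReadNode(List<string> input, ref int position, Dictionary<int, Node> byTag)
    {
        string token = input[position++];
        if (token == "null") return null;

        int tag = int.Parse(token.Substring(4));
        if (token.StartsWith("ref:", StringComparison.Ordinal)) return byTag[tag];

        var node = new Node();               // construct the instance to deserialize into
        byTag.Add(tag, node);                // store it BEFORE reading its fields
        node.Name = input[position++];
        node.Next = ReadNode(input, ref position, byTag);
        return node;
    }
}
```

With a self-referencing node (node.Next = node), Write produces the three tokens obj:1, the name, and ref:1, and Read returns a copy whose Next is the copy itself. Moving the byTag.Add call after the field reads would make the ref:1 lookup fail.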

Source: https://stackoverflow.com/questions/2712393/resolving-circular-references-for-objects-implementing-iserializable

Tags: .net, serialization, circular-reference, iserializable