I distinctly remember from the early days of .NET that calling ToString on a StringBuilder used to provide the new string object (to be returned) with the internal char buff
That was most likely just an implementation detail, rather than a documented constraint on the interface provided by StringBuilder.ToString. The fact that you feel unsure whether it was ever documented suggests this is the case.
Books will often detail implementations to show some insight into how to use something, but most carry a warning that the implementation is subject to change.
A good example of why one should never rely on implementation details.
I suspect that it wasn't a feature to have the builder become immutable, but merely a side-effect of the implementation of ToString.
I hadn't seen this before, so here's my guess: the internal storage of a StringBuilder appears to no longer be a simple string, but a set of 'chunks'. ToString can't return a reference to this internal string because it no longer exists.
(Are version 4.0 StringBuilders now ropes?)
Yes, you remember correctly. The StringBuilder.ToString method used to return the internal buffer as the string, and flag it as used so that additional changes to the StringBuilder had to allocate a new buffer.
As this is an implementation detail, it's not mentioned in the documentation. This is why they can change the underlying implementation without breaking anything in the defined behaviour of the class.
As you can see from the code posted, there is no longer a single internal buffer; instead, the characters are stored in chunks, and the ToString method copies the chunks together into a new string.
The reason for this change in implementation is likely that they have gathered information about how the StringBuilder class is actually used, and concluded that this approach offers a better balance of performance between the average case and the worst case.
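To make the chunk idea concrete, here is a minimal sketch of a chunked builder. It is not the framework's code: ChunkedBuilder, Chunk and the tiny ChunkSize of 8 are names and values made up for illustration, and only Append and ToString are modelled.

```csharp
using System;

// Hypothetical, simplified model of chunked storage; NOT the BCL implementation.
sealed class ChunkedBuilder
{
    // Deliberately tiny so the example spans several chunks.
    private const int ChunkSize = 8;

    private sealed class Chunk
    {
        public readonly char[] Chars = new char[ChunkSize];
        public int Used;                 // characters stored in this chunk
        public readonly int Offset;      // total characters held by earlier chunks
        public readonly Chunk Previous;  // link to the older chunk, or null

        public Chunk(int offset, Chunk previous)
        {
            Offset = offset;
            Previous = previous;
        }
    }

    private Chunk _last = new Chunk(0, null);

    public int Length => _last.Offset + _last.Used;

    public ChunkedBuilder Append(string value)
    {
        foreach (char c in value)
        {
            // When the newest chunk is full, link a fresh one in front of it.
            // Existing characters are never copied while the builder grows.
            if (_last.Used == ChunkSize)
                _last = new Chunk(Length, _last);

            _last.Chars[_last.Used++] = c;
        }
        return this;
    }

    public override string ToString()
    {
        // There is no single buffer to hand out, so every chunk is copied
        // into one array, walking from the newest chunk back to the oldest.
        var result = new char[Length];
        for (Chunk chunk = _last; chunk != null; chunk = chunk.Previous)
            Array.Copy(chunk.Chars, 0, result, chunk.Offset, chunk.Used);
        return new string(result);
    }
}

static class Program
{
    static void Main()
    {
        var builder = new ChunkedBuilder();
        builder.Append("Hello, ").Append("chunked ").Append("world!");
        Console.WriteLine(builder.ToString()); // Hello, chunked world!
        Console.WriteLine(builder.Length);     // 21
    }
}
```

In this sketch growth never reallocates or copies earlier text, and ToString always has to copy, which is why it cannot hand out an internal buffer the way the old single-buffer design did.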
Yup, this has been completely redesigned for .NET 4.0. It now uses a rope-like structure, a linked list of string builders, to store the growing internal buffer. This is a workaround for a problem that occurs when you can't guess the initial Capacity well and the amount of text is large. In that case the old design left behind a lot of copies of the disused internal buffer, clogging up the Large Object Heap. This comment from the source code, as available from the Reference Source, is relevant:
// We want to keep chunk arrays out of large object heap (< 85K bytes ~ 40K chars) to be sure.
// Making the maximum chunk size big means less allocation code called, but also more waste
// in unused characters and slower inserts / replaces (since you do need to slide characters over
// within a buffer).
internal const int MaxChunkSize = 8000;
Here is the .NET 1.1 implementation of StringBuilder.ToString from Reflector:
public override string ToString()
{
    string stringValue = this.m_StringValue;
    int currentThread = this.m_currentThread;
    if ((currentThread != 0) && (currentThread != InternalGetCurrentThread()))
    {
        return string.InternalCopy(stringValue);
    }
    if ((2 * stringValue.Length) < stringValue.ArrayLength)
    {
        return string.InternalCopy(stringValue);
    }
    stringValue.ClearPostNullChar();
    this.m_currentThread = 0;
    return stringValue;
}
As far as I can see it will in some cases return the string without copying it. However, I don't think the StringBuilder becomes immutable. Instead I think it will use copy-on-write if you continue to write to the StringBuilder.
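Whichever strategy is used internally, the observable contract is the same: the builder stays usable after ToString, and a string that has already been returned never changes. A small check of that behaviour, using only public API (the variable names are arbitrary):

```csharp
using System;
using System.Text;

static class Program
{
    static void Main()
    {
        var sb = new StringBuilder("abc");

        // Take a snapshot, then keep writing to the same builder.
        string first = sb.ToString();
        sb.Append("def");
        string second = sb.ToString();

        Console.WriteLine(first);  // abc
        Console.WriteLine(second); // abcdef
    }
}
```

Whether the first call copied the characters or handed over a buffer that was later replaced on write cannot be told apart from the outside, which is what allowed the implementation to change.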