I used reflection to look at the internal fields of System.String and I found three fields:
m_arrayLength
m_stringLength
m_firstChar
I don't
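The reflection step described above can be sketched roughly as follows. This is only an illustration: the class and method names (StringFieldInspector, GetPrivateInstanceFields, Describe) are made up for this example, and the field names you get back depend on the runtime and version, so they may not match the three listed above.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class StringFieldInspector
{
    // Returns the non-public instance fields of System.String.
    // Nothing is hard-coded: the names differ between runtimes and versions.
    public static FieldInfo[] GetPrivateInstanceFields()
    {
        return typeof(string).GetFields(BindingFlags.Instance | BindingFlags.NonPublic);
    }

    // Formats each field as "<field type> <field name>" for printing.
    public static string[] Describe()
    {
        return GetPrivateInstanceFields()
            .Select(f => f.FieldType.Name + " " + f.Name)
            .ToArray();
    }
}
```

Printing each entry of StringFieldInspector.Describe() with Console.WriteLine shows whatever fields the current runtime declares.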
Much of the implementation of System.String is in native code (C/C++) and not in managed code (C#). If you take a look at the decompiled code, you'll see that most of the "interesting" or "core" methods are decorated with this attribute:
[MethodImpl(MethodImplOptions.InternalCall)]
Only some of the helper/convenience APIs are implemented in C#.
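If you'd rather not read through decompiled code, a rough way to check this yourself is to ask reflection which members carry the InternalCall implementation flag. This is a minimal sketch: InternalCallFinder and FindInternalCalls are hypothetical names for this example, and which members show up (if any) depends on the runtime and version.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class InternalCallFinder
{
    // Returns the names of the methods and constructors declared on `type`
    // whose implementation flags include InternalCall.
    public static string[] FindInternalCalls(Type type)
    {
        const BindingFlags flags =
            BindingFlags.DeclaredOnly |
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.Public | BindingFlags.NonPublic;

        return type.GetMembers(flags)
            .OfType<MethodBase>() // keeps methods and constructors only
            .Where(m => (m.GetMethodImplementationFlags() & MethodImplAttributes.InternalCall) != 0)
            .Select(m => m.Name)
            .ToArray();
    }
}
```

Calling InternalCallFinder.FindInternalCalls(typeof(string)) and printing the result lists the members your runtime implements internally. A type written entirely in C# should give an empty array.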
So where are the characters for the string stored? It's top secret! Deep down inside the CLR's core native code implementation.
The first char provides access (via &m_firstChar) to the address in memory of the first character in the buffer. The length (m_stringLength) tells it how many characters are in the string, making .Length efficient (better than scanning for a nul char). Note that strings can be oversized (especially if created with StringBuilder, and in a few other scenarios), so sometimes the buffer is longer than the string, and it is important to track the buffer size as well (m_arrayLength). StringBuilder, for example, actually mutates a string within its buffer, so it needs to know how much it can add before having to create a larger buffer (see AppendInPlace, for example).
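The "buffer longer than the string" idea can be seen from the outside through StringBuilder's public Length and Capacity properties. This sketch shows only that public view, not the internal layout, which varies between runtime versions. BufferSlackDemo and LengthAndCapacity are hypothetical names for this example.

```csharp
using System;
using System.Text;

public static class BufferSlackDemo
{
    // Builds a StringBuilder with the requested initial capacity, appends `text`,
    // and returns { Length, Capacity } so the two values can be compared.
    public static int[] LengthAndCapacity(string text, int capacity)
    {
        var sb = new StringBuilder(capacity);
        sb.Append(text);
        return new[] { sb.Length, sb.Capacity };
    }
}
```

For a short text and a generous capacity, for example LengthAndCapacity("abc", 64), the reported length is 3 while the capacity is larger. That difference is the room available before a bigger buffer has to be allocated.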
The correct answer on the difference between string and System.String is here: string vs System.String
There is nothing about native implementations.
I'd be thinking immediately that m_firstChar is not the first character, but rather a pointer to the first character. That would make much more sense (although, since I'm not privy to the source, I can't be certain).
It makes little sense to store the first character of a string unless you want a blindingly fast s.Substring(0, 1) operation :-) There's a good chance the characters themselves (that the three fields allude to) will be allocated separately from the actual object.