Question
I am trying to understand string interning and why it doesn't seem to work in my example. The point of the example is to show that Example 1 uses a lot less memory, since it should only have 10 strings in memory. However, in the code below both examples use roughly the same amount of memory (virtual size and working set).
Please advise: why isn't Example 1 using a lot less memory? Thanks.
Example 1:
IList<string> list = new List<string>(10000);
for (int i = 0; i < 10000; i++)
{
    for (int k = 0; k < 10; k++)
    {
        list.Add(string.Intern(k.ToString()));
    }
}
Console.WriteLine("intern Done");
Console.ReadLine();
Example 2:
IList<string> list = new List<string>(10000);
for (int i = 0; i < 10000; i++)
{
    for (int k = 0; k < 10; k++)
    {
        list.Add(k.ToString());
    }
}
Console.WriteLine("intern Done");
Console.ReadLine();
Answer 1:
From MSDN: "Second, to intern a string, you must first create the string. The memory used by the String object must still be allocated, even though the memory will eventually be garbage collected."
Answer 2:
The problem is that ToString() will still allocate a new string, which is then interned. If the garbage collector doesn't run to collect those "temporary" strings, the memory usage will be the same.
Also, your strings are pretty short. 10,000 strings that are mostly only one character long amount to a memory difference of about 20KB, which you're probably not going to notice. Try using longer strings (or a lot more of them) and forcing a garbage collection before you check the memory usage.
Here is an example that does show a difference:
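using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        int n = 100000;
        // Pass "1" as the first argument to run the interned version.
        if (args.Length > 0 && args[0] == "1")
            WithIntern(n);
        else
            WithoutIntern(n);
    }

    static void WithIntern(int n)
    {
        // Only 10 distinct strings stay alive; the temporaries become garbage.
        var list = new List<string>(n);
        for (int i = 0; i < n; i++)
        {
            for (int k = 0; k < 10; k++)
            {
                list.Add(string.Intern(new string('x', k * 1000)));
            }
        }
        GC.Collect();
        Console.WriteLine("Done.");
        Console.ReadLine();
        GC.KeepAlive(list); // keep the list reachable while memory is inspected
    }

    static void WithoutIntern(int n)
    {
        // Every string added here is a separate object that stays alive.
        var list = new List<string>(n);
        for (int i = 0; i < n; i++)
        {
            for (int k = 0; k < 10; k++)
            {
                list.Add(new string('x', k * 1000));
            }
        }
        GC.Collect();
        Console.WriteLine("Done.");
        Console.ReadLine();
        GC.KeepAlive(list); // keep the list reachable while memory is inspected
    }
}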
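The allocate-then-intern behaviour described in this answer can be checked directly with reference comparisons. This is only a minimal sketch: the class name InternDemo and the helper SameObject are made up for illustration, and new string('x', 1000) is used in place of k.ToString() so that each call is sure to produce a separate object.

```csharp
using System;

static class InternDemo
{
    // Hypothetical helper: true when both variables point at the same object.
    public static bool SameObject(string a, string b) => ReferenceEquals(a, b);

    static void Main()
    {
        // Two separately constructed strings with equal contents.
        string a = new string('x', 1000);
        string b = new string('x', 1000);
        Console.WriteLine(SameObject(a, b));   // False: two allocations

        // Intern returns one shared reference for equal contents.
        string ia = string.Intern(a);
        string ib = string.Intern(b);
        Console.WriteLine(SameObject(ia, ib)); // True: one pooled object

        // b itself was still allocated; it only becomes garbage once
        // nothing references it any more.
    }
}
```

The temporary string is only reclaimed when a collection actually runs, which is why a memory reading taken without a GC may show no saving.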
Answer 3:
Remember, the CLR manages memory on behalf of your process, so it is really hard to figure out the managed memory footprint from looking at virtual size and working set. The CLR will generally allocate and free memory in chunks. The size of these varies according to implementation details, but due to this it is next to impossible to measure managed heap usage based on memory counters for the process.
However, if you look at the actual memory usage for the examples you'll see a difference.
Example 1
0:005>!dumpheap -stat
...
00b6911c 137 4500 System.String
0016be60 8 480188 Free
00b684c4 14 649184 System.Object[]
Total 316 objects
0:005> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01592dcc
generation 1 starts at 0x01592dc0
generation 2 starts at 0x01591000
ephemeral segment allocation context: none
segment begin allocated size
01590000 01591000 01594dd8 0x00003dd8(15832)
Large object heap starts at 0x02591000
segment begin allocated size
02590000 02591000 026a49a0 0x001139a0(1128864)
Total Size 0x117778(1144696)
------------------------------
GC Heap Size 0x117778(1144696)
Example 2
0:006> !dumpheap -stat
...
00b684c4 14 649184 System.Object[]
00b6911c 100137 2004500 System.String
Total 100350 objects
0:006> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x0179967c
generation 1 starts at 0x01791038
generation 2 starts at 0x01591000
ephemeral segment allocation context: none
segment begin allocated size
01590000 01591000 0179b688 0x0020a688(2139784)
Large object heap starts at 0x02591000
segment begin allocated size
02590000 02591000 026a49a0 0x001139a0(1128864)
Total Size 0x31e028(3268648)
------------------------------
GC Heap Size 0x31e028(3268648)
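As you can see from the output above, the second example does use more memory on the managed heap: a GC heap of 3,268,648 bytes versus 1,144,696 bytes, with 100,137 System.String instances (2,004,500 bytes) instead of 137 (4,500 bytes).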
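If attaching a debugger is not convenient, a rough in-process reading of the managed heap is possible with GC.GetTotalMemory(true), which forces a collection before reporting. The sketch below is an assumption about how one might measure it, not what was used for the output above; the names HeapMeasure, Build and Measure are hypothetical, and the reported numbers are approximate and vary by runtime.

```csharp
using System;
using System.Collections.Generic;

static class HeapMeasure
{
    // Hypothetical helper: builds count * 10 strings, optionally interned.
    public static List<string> Build(int count, bool intern)
    {
        var list = new List<string>(count * 10);
        for (int i = 0; i < count; i++)
        {
            for (int k = 0; k < 10; k++)
            {
                string s = new string('x', 100 + k);
                list.Add(intern ? string.Intern(s) : s);
            }
        }
        return list;
    }

    // Approximate growth of the managed heap, in bytes, caused by the list.
    public static long Measure(int count, bool intern)
    {
        long before = GC.GetTotalMemory(true); // collect, then read
        List<string> list = Build(count, intern);
        long after = GC.GetTotalMemory(true);  // temporaries are gone by now
        GC.KeepAlive(list);                    // list must survive the second reading
        return after - before;
    }

    static void Main()
    {
        Console.WriteLine("interned:     " + Measure(10000, true));
        Console.WriteLine("not interned: " + Measure(10000, false));
    }
}
```

Unlike virtual size or working set, this reading reflects only live managed objects, so the interned run should report a much smaller figure.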
Answer 4:
Source: https://blogs.msdn.microsoft.com/ericlippert/2009/09/28/string-interning-and-string-empty/
String interning is an optimization technique applied by the compiler. If you have two identical string literals in one compilation unit, the generated code ensures that only one string object is created for all instances of that literal (characters enclosed in double quotes) within the assembly.
Example:
object obj = "Int32";
string str1 = "Int32";
string str2 = typeof(int).Name;
Output of the following comparisons:
Console.WriteLine(obj == str1); // true
Console.WriteLine(str1 == str2); // true
Console.WriteLine(obj == str2); // false !?
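Note 1: When either operand is typed as object, == compares by reference.
Note 2: typeof(int).Name is produced by reflection at run time, so it is not a compile-time literal. Which kind of comparison is used (reference or value) is chosen at compile time from the static types of the operands.
Analysis of the results:
obj == str1: true, because both refer to the same literal, so the generated code has only one string object for "Int32". The comparison is by reference (see Note 1) and the references match.
str1 == str2: true, because both operands are typed as string, so their contents are compared, and the contents are the same.
obj == str2: false, because obj is typed as object, so the comparison is by reference, and str2 was created at run time rather than from the literal (see Note 2).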
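When a value comparison is what you want, it helps to make that explicit rather than rely on the static types. The following is a small sketch of two ways to do so; the class name CompareDemo and the helper ValueEquals are hypothetical and only serve the illustration.

```csharp
using System;

static class CompareDemo
{
    // Hypothetical helper: value comparison that works through object references,
    // because object.Equals dispatches to the virtual String.Equals override.
    public static bool ValueEquals(object a, object b) => object.Equals(a, b);

    static void Main()
    {
        object obj = "Int32";
        string str2 = typeof(int).Name;

        // Both operands are typed string, so string's == overload compares contents.
        Console.WriteLine((string)obj == str2);    // True

        // Equals is virtual, so the contents are compared here as well.
        Console.WriteLine(ValueEquals(obj, str2)); // True
    }
}
```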
Source: https://stackoverflow.com/questions/2506588/c-sharp-string-interning