Intern string literals misunderstanding?

走了就别回头了 2020-12-03 23:18

I don't understand:

MSDN says

http://msdn.microsoft.com/en-us/library/system.string.intern.aspx

Consequently, an instance of a litera

2 Answers
  • 2020-12-04 00:06

    String literals get interned automatically (so, if your code contains "lalala" 1000 times, only one instance will exist).

    Such strings will not get GC'd and any time they are referenced the reference will be the interned one.


    string.Intern is there for strings that are not literals (say, from user input, or read from a file or database) that you know will be repeated very often and as such are worth interning for the lifetime of the process.
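
    A minimal sketch of the difference between a literal and a string built at run time. The class and method names are hypothetical and exist only for illustration; string.Intern and Object.ReferenceEquals are the real framework calls.

```csharp
using System;

public static class InternDemo
{
    // Builds "lalala" at run time, so the result is a fresh instance
    // rather than the literal the compiler placed in the assembly.
    public static string BuildAtRuntime() =>
        new string(new[] { 'l', 'a', 'l', 'a', 'l', 'a' });

    // Two occurrences of the same literal refer to one instance.
    public static bool LiteralsShareInstance()
    {
        string a = "lalala";
        string b = "lalala";
        return object.ReferenceEquals(a, b);
    }

    // A run-time string has equal contents but is a different object.
    public static bool RuntimeStringIsSameInstance() =>
        object.ReferenceEquals("lalala", BuildAtRuntime());

    // string.Intern returns the pooled instance for those contents.
    public static bool InternedRuntimeStringIsSameInstance() =>
        object.ReferenceEquals("lalala", string.Intern(BuildAtRuntime()));
}
```

    The first and third methods should report the same instance and the second a different one, which is the case string.Intern is meant for.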

  • 2020-12-04 00:12

    Interning is something that happens behind the scenes, so you as a programmer never have to worry about it. You generally do not have to put anything to the pool, or get anything from the pool. Like garbage collection: you never have to invoke it, or worry that it may happen, or worry that it may not happen. (Well, in 99.999% of the cases. And the remaining 0.001 percent is when you are doing very weird stuff.)

    The compiler records every string literal in your source file in the compiled assembly, and the runtime interns each one when it is loaded, so "lalala" will be interned without you having to do anything, or having any control over the matter. Whenever you refer to "lalala" in your program, the reference you get is the one from the intern pool, again without you having to do anything, nor having any control over the matter.

    The intern pool contains a more-or-less fixed number of strings, generally of a very small total size (only a fraction of the size of your .exe), so it does not matter that they never get garbage-collected.


    EDIT

    The purpose of interning strings is to speed up certain string operations like Equals(). The Equals() method of String first checks whether the two strings are equal by reference, which is extremely fast; if the references are equal, it returns true immediately. If they are not, it compares the lengths and then the contents. It does not check whether either string is interned, and it does not compare hash codes. The gain from interning is therefore the reference fast path: two equal strings that have both been interned are the same object, so the comparison ends at the first check, and code that knows both operands are interned can call Object.ReferenceEquals() directly.
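
    To make the two kinds of equality concrete, here is a small sketch that reports reference identity and content equality separately. EqualsDemo and Compare are hypothetical names; the framework calls are Object.ReferenceEquals and String.Equals.

```csharp
using System;

public static class EqualsDemo
{
    // Returns (same object?, same characters?) for two strings.
    public static (bool SameReference, bool SameContent) Compare(string x, string y) =>
        (object.ReferenceEquals(x, y), string.Equals(x, y));

    // A copy of "cat" created at run time: equal contents, distinct object.
    public static string FreshCat() => new string("cat".ToCharArray());
}
```

    Compare("cat", EqualsDemo.FreshCat()) should give a content match without a reference match, while Compare("cat", string.Intern(EqualsDemo.FreshCat())) should match on both.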

    So, suppose that you are reading tokens from a file in string s, and you have a switch statement of the following form:

    switch (s)
    {
        case "cat": /* handle cat */ break;
        case "dog": /* handle dog */ break;
        case "tod": /* handle tod */ break;
    }
    

    The string literals "cat", "dog", "tod" have all been interned, but s has not, so no comparison against s can succeed on the reference check alone. If you intern s right before the switch statement, a matching case becomes a reference match. How much time that saves depends on how the compiler lowers the switch and on how long the strings are, so measure before relying on it.
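
    A sketch of that idea, assuming tokens arrive as ordinary run-time strings. TokenClassifier, Classify and the returned labels are made up for the example; only string.Intern is a framework call.

```csharp
using System;

public static class TokenClassifier
{
    // Interns the incoming token, then switches on the pooled reference.
    // The result is the same with or without the Intern call; interning
    // only affects how cheaply a matching case can be recognized.
    public static string Classify(string token)
    {
        string s = string.Intern(token);
        switch (s)
        {
            case "cat": return "feline";
            case "dog": return "canine";
            case "tod": return "fox";
            default:    return "unknown";
        }
    }
}
```

    Note that every distinct token passed to Classify stays in the intern pool afterwards, which is why this only suits input with a small, known vocabulary.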

    Of course, if there is any possibility that your file might contain garbage, then you do NOT want to do this: interned strings stay in the pool for the lifetime of the process, so loading lots of random strings into it steadily increases memory use, hurts performance, and can eventually exhaust memory.
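
    One way to limit that risk, offered as a sketch rather than a recommendation, is string.IsInterned, which returns the pooled reference when the contents are already interned and null otherwise, without adding anything to the pool. SafeTokenLookup and Canonicalize are hypothetical names. One assumption to be aware of: a literal may only enter the pool once the code that uses it has been loaded, so a known keyword can still come back as the original, non-pooled instance.

```csharp
using System;

public static class SafeTokenLookup
{
    // Returns the pooled instance if these contents are already interned;
    // otherwise returns the token unchanged and leaves the pool untouched.
    public static string Canonicalize(string token) =>
        string.IsInterned(token) ?? token;
}
```

    Either way the returned string has the same contents as the input, so the result can be passed to the switch as before; unknown tokens simply remain ordinary strings that the garbage collector can reclaim.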
