Does Python intern strings?

盖世英雄少女心 2020-11-27 21:35

In Java, explicitly declared Strings are interned by the JVM, so that subsequent declarations of the same String result in two pointers to the same String instance, rather

4 Answers
  • 2020-11-27 22:11
    • All strings of length 0 and length 1 are interned.
    • Strings that are compile-time constants are interned; strings built at runtime are not ('wtf' will be interned, but ''.join(['w', 't', 'f']) will not).
    • Strings that contain anything other than ASCII letters, digits, or underscores are not interned. This is why 'wtf!' is not interned: it contains !.

    https://www.codementor.io/satwikkansal/do-you-really-think-you-know-strings-in-python-fnxh8mtha

    The article above explains string interning in Python and spells out the exceptions to these rules.
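    As a rough illustration of the compile-time versus runtime distinction in the list above, the sketch below compares a literal with an equal string built by join. The variable names are made up for this example, and the result of the plain identity check is a CPython implementation detail, so no expected value is given for it.

```python
import sys

literal = "wtf"                        # compile-time constant
runtime = "".join(["w", "t", "f"])     # equal value, but built at runtime

same_value = literal == runtime        # equality compares contents
same_object = literal is runtime       # identity: implementation dependent

# sys.intern() returns the canonical copy, so equal strings become one object.
interned_same = sys.intern(runtime) is sys.intern(literal)

print(same_value, same_object, interned_same)
```

    Only the comparison that goes through sys.intern() is guaranteed to report identity; the plain is check should not be relied on in real code.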

  • 2020-11-27 22:12

    A fairly easy way to tell is by using id(). However, as @MartijnPieters mentions, this is runtime dependent.

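    class Example:

        def __init__(self):
            # The literal is a constant, so every instance refers to it.
            self._inst = 'instance'

    # Print the id of the string held by ten separate instances.
    for i in range(10):
        print(id(Example()._inst))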
    
  • 2020-11-27 22:24

    This is called interning, and yes, Python does do this to some extent, for shorter strings created as string literals. See "About the changing id of an immutable string" for some discussion.

    Interning is runtime dependent; there is no standard for it. Interning is always a trade-off between memory use and the cost of checking whether you are creating the same string again. If you want to force the issue, there is the sys.intern() function, whose documentation describes some of the interning Python does for you automatically:

    Normally, the names used in Python programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys.

    Note that in Python 2, intern() was a built-in function, so no import was necessary.
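    To make the memory trade-off concrete, here is a minimal sketch of forcing interning for keys that repeat in parsed input. The load_tags helper and the 'key=value' line format are hypothetical, invented for this illustration.

```python
import sys

def load_tags(lines):
    """Hypothetical helper: parse 'key=value' lines, interning the keys."""
    records = []
    for line in lines:
        key, _, value = line.partition("=")
        # Keys repeat often, so keep a single shared object per distinct key.
        records.append((sys.intern(key), value))
    return records

records = load_tags(["color=red", "color=blue", "size=10", "color=green"])
color_keys = [key for key, _ in records if key == "color"]

# Every "color" key is the same object, not merely an equal string.
all_same = all(key is color_keys[0] for key in color_keys)
print(all_same)
```

    Whether this is worth doing depends on how often the strings repeat; each sys.intern() call costs a lookup.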

  • 2020-11-27 22:31

    Some strings are interned in Python. When Python code is compiled, identifiers such as variable names, function names, and class names are interned.

    String literals that follow the identifier rules (they start with an underscore or a letter and contain only underscores, letters, and digits) are interned as well:

    a="hello"
    b="hello"
    

    Since strings are immutable, Python can safely let both names refer to the same object, so

    a is b ===> True
    

    But if we had

    a="hello world"
    b="hello world"
    

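    then, since "hello world" contains a space and does not meet the identifier rules, it is not interned automatically. When the two assignments are entered as separate statements in the interactive interpreter, a and b can be distinct objects (the exact result depends on the CPython version and on how the code is compiled):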

    a is b  ===> False
    

    You can intern such strings yourself with sys.intern(). Use it if the same strings are repeated many times in your program.

    import sys
    a=sys.intern("hello world")
    b=sys.intern("hello world")
    

    Now a is b ===> True
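    The identifier rule this answer relies on can be written as a small predicate. This is only a sketch of the rule as stated here, not CPython's actual check; the function name looks_like_identifier is an assumption made for the example.

```python
import re

def looks_like_identifier(s: str) -> bool:
    """Hypothetical check: starts with a letter or underscore, then only
    ASCII letters, digits, or underscores."""
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", s) is not None

for candidate in ["hello", "hello world", "wtf!", "_private1"]:
    print(repr(candidate), looks_like_identifier(candidate))
```

    A string that fails this check, such as "hello world", is the kind of string you would pass to sys.intern() explicitly.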
