Why allow concatenation of string literals?

借酒劲吻你 2020-11-30 14:47

I was recently bitten by a subtle bug.

const char *int2str[] = {
   "zero", // 0
   "one",  // 1
   "two"   // 2
   "three",// 3
   nullptr };

assert( int         


        
10 Answers
  • 2020-11-30 15:26

    For rationale, expanding and simplifying Shafik Yaghmour’s answer: string literal concatenation originated in C (hence inherited by C++), as did the term, for two reasons (references are from Rationale for the ANSI C Programming Language):

    • For formatting: to allow long string literals to span multiple lines with proper indentation – in contrast to line continuation, which destroys the indentation scheme (3.1.4 String literals); and
    • For macro magic: to allow the construction of string literals by macros (via stringizing) (3.8.3.2 The # operator).
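    Both reasons can be seen in a small sketch. The macro names STRINGIZE, STRINGIZE_IMPL and BUFFER_SIZE, and the two arrays, are hypothetical and chosen only for illustration:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper macros; the names are illustrative only.
#define STRINGIZE_IMPL(x) #x            // # turns the argument's tokens into a string literal
#define STRINGIZE(x) STRINGIZE_IMPL(x)  // extra level so a macro argument is expanded first
#define BUFFER_SIZE 64                  // hypothetical numeric macro

// Formatting: adjacent literals merge into one, so the indentation survives.
const char kUsage[] =
    "usage: tool [options]\n"
    "  -h  show help\n";

// Macro magic: the stringized number is spliced between two ordinary literals.
const char kLimit[] = "buffer holds " STRINGIZE(BUFFER_SIZE) " bytes";
```

    Here kLimit holds the text "buffer holds 64 bytes", assembled entirely by the preprocessor and the compiler.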

    It is included in the modern languages Python and D because they copied it from C, though in both of these it has been proposed for deprecation, as it is bug-prone (as you note) and unnecessary (since one can just have a concatenation operator and constant folding for compile-time evaluation; you can’t do this in C because strings are pointers, and so you can’t add them).

    It’s not simple to remove, because that breaks compatibility and because an operator would have to be given a suitable precedence (implicit concatenation happens during lexing, before any operators are applied), which is why it’s still present.

    Yes, it is used in production code. The Google Python Style Guide, under “Line length”, specifies:

    When a literal string won't fit on a single line, use parentheses for implicit line joining.

    x = ('This will build a very long long '
         'long long long long long long string')
    

    See “String literal concatenation” at Wikipedia for more details and references.

  • 2020-11-30 15:28

    I see several C and C++ answers, but none of them really explains the rationale for this feature. C++ inherited the feature from C, and the C99 rationale document, Rationale for International Standard—Programming Languages—C, explains it in section 6.4.5 String literals (emphasis mine):

    A string can be continued across multiple lines by using the backslash–newline line continuation, but this requires that the continuation of the string start in the first position of the next line. To permit more flexible layout, and to solve some preprocessing problems (see §6.10.3), the C89 Committee introduced string literal concatenation. Two string literals in a row are pasted together, with no null character in the middle, to make one combined string literal. This addition to the C language allows a programmer to extend a string literal beyond the end of a physical line without having to use the backslash–newline mechanism and thereby destroying the indentation scheme of the program. An explicit concatenation operator was not introduced because the concatenation is a lexical construct rather than a run-time operation.
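    A minimal sketch of the two mechanisms the rationale compares, with hypothetical array names. The backslash version has to resume in the first column, while the concatenated version can be indented freely:

```cpp
#include <cassert>
#include <cstring>

// Backslash-newline: the continuation must start in column 0, otherwise the
// leading spaces of the next line become part of the string.
const char kSpliced[] = "first half, \
second half";

// Concatenation: the second literal can be indented to match the first.
const char kJoined[] = "first half, "
                       "second half";

// One terminating null at the end, none between the two pieces.
static_assert(sizeof("ab" "cd") == 5, "4 characters plus one terminator");
```

    Both arrays hold the same text, as long as the continuation line of kSpliced really does start in the first column.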

    Python seems to have the same reason: this reduces the need for an ugly \ to continue long string literals. It is covered in section 2.4.2, String literal concatenation, of The Python Language Reference.

  • 2020-11-30 15:28

    It's a great feature that allows you to combine preprocessor strings with your strings.

    // Here we define the correct printf modifier for time_t
    #ifdef TIME_T_LONG
        #define TIME_T_MOD "l"
    #elif defined(TIME_T_LONG_LONG)
        #define TIME_T_MOD "ll"
    #else
        #define TIME_T_MOD ""
    #endif
    
    // And here we merge the modifier into the rest of our format string
    printf("time is %" TIME_T_MOD "u\n", time(0));
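    The standard library relies on the same trick: the PRI* macros in <cinttypes> expand to string literals that are meant to be pasted into a format string. A small sketch, where format_count is a hypothetical helper:

```cpp
#include <cassert>
#include <cinttypes>  // PRIu64 expands to a string literal holding the right length modifier
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit value portably.
std::string format_count(std::uint64_t n) {
    char buf[64];
    // "count is %" PRIu64 " items" becomes one format string at compile time.
    std::snprintf(buf, sizeof buf, "count is %" PRIu64 " items", n);
    return buf;
}
```

    Calling format_count(42) should yield "count is 42 items" whichever underlying type the platform uses for uint64_t.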
    
  • 2020-11-30 15:33

    While people have taken the words out of my mouth about the practical uses of the feature, nobody has so far tried to defend the choice of syntax.

    For all I know, the typo that can slip through as a result was probably just overlooked. After all, it seems robustness against typos wasn't at the front of Dennis's mind, as shown further by:

    if (a = b);
    {
        printf("%d", a);
    }
    

    Furthermore, there's the possible view that it wasn't worth using up an extra symbol for concatenation of string literals - after all, there isn't much else that can be done with two of them, and having a symbol there might create temptation to try to use it for runtime string concatenation, which is above the level of C's built-in features.

    Some modern, higher-level languages based on C syntax have discarded this notation presumably because it is typo-prone. But these languages have an operator for string concatenation, such as + (JS, C#), . (Perl, PHP), ~ (D, though this has also kept C's juxtaposition syntax), and constant folding (in compiled languages, anyway) means that there is no runtime performance overhead.
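    In C++, where the juxtaposition syntax remains, one possible guard against the typo is to have the compiler check the element count. This is only a sketch: the table mirrors the one in the question with the comma restored, and kCount is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>

// Table from the question, with the comma after "two" restored.
const char* const int2str[] = {
    "zero",   // 0
    "one",    // 1
    "two",    // 2
    "three",  // 3
    nullptr,  // sentinel
};

// Number of elements, computed by the compiler.
constexpr std::size_t kCount = sizeof(int2str) / sizeof(int2str[0]);

// Dropping a comma would merge two literals and shrink the array to 4,
// which turns the silent typo into a compile error.
static_assert(kCount == 5, "int2str must have 4 names plus the sentinel");
```

    The check only helps when the expected count is known up front and kept in sync with the table.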

  • 2020-11-30 15:36

    So that you can split long string literals across lines.

    And yes, I've seen it in production code.

  • 2020-11-30 15:38

    Cases where this can be useful:

    • Generating strings including components defined by the preprocessor (this is perhaps the largest use case in C, and it's one I see very, very frequently).
    • Splitting string constants over multiple lines.

    To provide a more concrete example for the former:

    // in version.h
    #define MYPROG_NAME "FOO"
    #define MYPROG_VERSION "0.1.2"
    
    // in main.c
    puts("Welcome to " MYPROG_NAME " version " MYPROG_VERSION ".");
    