Is the operand of `sizeof` evaluated with a VLA?

一生所求 2020-12-03 07:08

An argument in the comments section of this answer prompted me to ask this question.

In the following code, bar points to a variable length array, so th

3 Answers
  • 2020-12-03 07:55

    Two other answers have already quoted N1570 6.5.3.4p2:

    The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

    According to that paragraph from the standard, yes, the operand of sizeof is evaluated.

    I'm going to argue that this is a defect in the standard; something is evaluated at run time, but the operand is not.

    Let's consider a simpler example:

    int len = 100;
    double vla[len];
    printf("sizeof vla = %zu\n", sizeof vla);
    

    According to the standard, sizeof vla evaluates the expression vla. But what does that mean?

    In most contexts, evaluating an array expression yields the address of the initial element -- but the sizeof operator is an explicit exception to that. We might assume that evaluating vla means accessing the values of its elements, which has undefined behavior since those elements have not been initialized. But there is no other context in which evaluation of an array expression accesses the values of its elements, and absolutely no need to do so in this case. (Correction: If a string literal is used to initialize an array object, the values of the elements are evaluated.)

    When the declaration of vla is executed, the compiler will create some anonymous metadata to hold the length of the array (it has to, since assigning a new value to len after vla is defined and allocated doesn't change the length of vla). All that has to be done to determine sizeof vla is to multiply that stored value by sizeof (double) (or just to retrieve the stored value if it stores the size in bytes).

    sizeof can also be applied to a parenthesized type name:

    int len = 100;
    printf("sizeof (double[len]) = %zu\n", sizeof (double[len]));
    

    According to the standard, the sizeof expression evaluates the type. What does that mean? Clearly it has to evaluate the current value of len. Another example:

    size_t func(void);
    printf("sizeof (double[func()]) = %zu\n", sizeof (double[func()]));
    

    Here the type name includes a function call. Evaluating the sizeof expression must call the function.

    But in all of these cases, there's no actual need to evaluate the elements of the array object (if there is one), and no point in doing so.

    sizeof applied to anything other than a VLA can be evaluated at compile time. The difference when sizeof is applied to a VLA (either an object or a type) is that something has to be evaluated at run time. But the thing that has to be evaluated is not the operand of sizeof; it's just whatever is needed to determine the size of the operand, which is never the operand itself.

    The standard says that the operand of sizeof is evaluated if that operand is of variable length array type. That's a defect in the standard.

    Getting back to the example in the question:

    int foo = 100;
    double (*bar)[foo] = NULL;
    printf("sizeof *bar = %zu\n", sizeof *bar);
    

    I've added an initialization to NULL to make it even clearer that dereferencing bar has undefined behavior.

    *bar is of type double[foo], which is a VLA type. In principle, *bar is evaluated, which would have undefined behavior since bar is a null pointer (in the original question it was uninitialized, which is no better). But again, there is no need to dereference bar. The compiler will generate some code when it processes the type double[foo], including saving the value of foo (or foo * sizeof (double)) in an anonymous variable. All it has to do to evaluate sizeof *bar is to retrieve the value of that anonymous variable. And if the standard were updated to define the semantics of sizeof consistently, it would be clear that evaluating sizeof *bar is well defined and yields 100 * sizeof (double) without having to dereference bar.

  • 2020-12-03 08:00

    Indeed, the Standard seems to imply that the behaviour is undefined.

    Re-quoting N1570 6.5.3.4/2:

    The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

    I think the wording of the Standard is confusing: "the operand is evaluated" does not mean that *bar will be dereferenced. Evaluating *bar does not in any way help compute its size. sizeof(*bar) does need to be computed at run time, but the code generated for this has no need to dereference bar. It will more likely retrieve the size information from a hidden variable holding the result of the size computation performed when bar's type was instantiated.

  • 2020-12-03 08:04

    Yes, this causes undefined behaviour.

    In N1570 6.5.3.4/2 we have:

    The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

    Now we have the question: is the type of *bar a variable length array type?

    Since bar is declared as a pointer to a VLA, dereferencing it should yield a VLA. (But I do not see concrete text specifying whether or not it does.)

    Note: Further discussion could be had here. Perhaps it could be argued that *bar has type double[100], which is not a VLA.

    Supposing we agree that the type of *bar is actually a VLA type, then in sizeof *bar, the expression *bar is evaluated.

    bar is indeterminate at this point. Now looking at 6.3.2.1/1:

    if an lvalue does not designate an object when it is evaluated, the behavior is undefined

    Since bar does not point to an object (by virtue of being indeterminate), evaluating *bar causes undefined behaviour.
