What exactly is an “invalid conversion specification”?

烈酒焚心 submitted on 2019-12-10 16:35:24

Question


As per C11, chapter §7.21.6.1, P9

If a conversion specification is invalid, the behavior is undefined.282) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

Until now, my understanding was that, given

  char str[] = "Sourav";
  • A statement like printf("%S", str); falls under the first sentence: there is no conversion specification (CS) %S (UPPERCASE) in the standard.
  • A statement like printf("%d", str); falls under the second sentence: the CS and the argument type do not match, but %d itself is not an "invalid" CS.

until advised otherwise by a recent comment thread.

Is my understanding wrong? Can the second statement also be categorized as an "invalid" (note: not merely "wrong") conversion specification?


Update: The answer and its comment thread have since been deleted; here's a snapshot for <10K users.


Answer 1:


The "validity" of a conversion specification is determined by the standard paragraphs above the one you quoted:

7.21.6.1 - p4 to p8

Each conversion specification is introduced by the character %. After the %, the following appear in sequence: ...

The flag characters and their meanings are: ...

The conversion specifiers and their meanings are: ...

This means that any conversion specification composed from the elements in the above lists is valid, and all others are invalid in the eyes of the standard. That's why the paragraph you quoted mentions two causes of UB. One is a specification that does not follow the grammar, and the other is a mismatch between a specification and the type of its argument.

The comment you linked to seems to use "invalid" colloquially. That is, both uses of the conversion specifications are "invalid" in the sense that they lead to UB, but only the first is "invalid" from a language-lawyer standpoint.
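As a rough illustration of the two sentences of p9, here is a minimal sketch. The helper name format_name is made up for this example, and the two undefined calls from the question are kept as comments so that the code itself stays well defined.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the question): formats str using a valid
 * conversion specification whose argument has the expected type. */
int format_name(char *buf, size_t size, const char *str)
{
    assert(buf != NULL && str != NULL);

    /* %s is in the p8 list and str is a pointer to char: well defined. */
    return snprintf(buf, size, "%s", str);

    /* Undefined behaviour, deliberately left as comments:
     *   printf("%S", str);  first sentence of p9: %S is not in the C11 list
     *                       (an implementation may define it as an extension)
     *   printf("%d", str);  second sentence of p9: %d is a valid specification,
     *                       but the argument is not an int
     */
}
```

Called as format_name(buf, sizeof buf, "Sourav") with a 16-byte buffer, this stores "Sourav" and returns 6, the number of characters written.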




Answer 2:


Footnote 282 points to "Future library directions", C11 7.31.11p1:

Lowercase letters may be added to the conversion specifiers and length modifiers in fprintf and fscanf. Other characters may be used in extensions.

So it, too, hints that invalid conversion specifications are those that cannot be built from the lists in the standard. Of the unlisted characters, lowercase letters might be used by a future C version, and extensions are free to use other characters.


And while it is non-normative, C11 Annex J.2 contains the following:

  • An invalid conversion specification is found in the format for one of the formatted input/output functions, or the strftime or wcsftime function (7.21.6.1, 7.21.6.2, 7.27.3.5, 7.29.2.1, 7.29.2.2, 7.29.5.1).

That is, an invalid conversion specification passed to *printf is grouped here with an invalid conversion specification passed to strftime. Since strftime takes no variable arguments, the invalidity there cannot arise from a mismatch between the conversion specification and a corresponding argument.

This can be contrasted with

  • There are insufficient arguments for the format in a call to one of the formatted input/output functions, or an argument does not have an appropriate type (7.21.6.1, 7.21.6.2, 7.29.2.1, 7.29.2.2).

which discusses the mismatch between the arguments and the conversion specifiers, without mentioning the word invalid.
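To make the strftime point concrete, here is a small sketch. The function name format_date and the chosen date are assumptions for illustration only. Every conversion specification reads its data from the single struct tm argument, so there is no per-specification argument whose type could mismatch.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: formats a fixed date with strftime. */
size_t format_date(char *buf, size_t size)
{
    struct tm t = {0};

    assert(buf != NULL);

    t.tm_year = 2019 - 1900;  /* tm_year counts years since 1900 */
    t.tm_mon  = 11;           /* tm_mon is 0-based, so 11 is December */
    t.tm_mday = 10;

    /* %Y, %m and %d are all listed in 7.27.3.5, so each is valid.
     * The only way to get the "invalid conversion specification" UB
     * here would be to use a character that is not in that list. */
    return strftime(buf, size, "%Y-%m-%d", &t);
}
```

With a 32-byte buffer this writes "2019-12-10" and returns 10.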




Answer 3:


To support my understanding (and probably to justify it in the first place), let me add my two cents.

First, let's look at footnote 282, as referenced in the quote. It says,

See ‘‘future library directions’’ (7.31.11).

and in §7.31.11

Lowercase letters may be added to the conversion specifiers and length modifiers in fprintf and fscanf. Other characters may be used in extensions.

This says nothing about the relation between a CS and its argument (if any). So, the "validity" of a CS does not depend on the supplied argument.

That said, here are a couple more points:

  • Point 1 :: Please note the mention of the phrase "conversion specification", not conversion specifier, in the quote. As per chapter §7.21.6.1/P4,

    Each conversion specification is introduced by the character %. After the %, the following appear in sequence:

    • Zero or more flags [...]

    • An optional minimum field width [...]

    • An optional precision [...]

    • An optional length modifier [...]

    • A conversion specifier character [...]

    and we have definitive lists for all the elements mentioned in

    • P5, field width and precision
    • P6, flags
    • P7, length modifier
    • P8, conversion specifier

    Therefore, the supplied argument plays (or should play) no part in determining the "validity" of a conversion specification.

    To complement this understanding, borrowing words from the comment by Ajay Brahmakshatriya

    "I think the operating word here is "corresponding". The first sentence says that if in the string exists a specifier which is not valid. If not, then each specifier is matched with its corresponding argument. Then the second statement says about type matching.....I think the second example does not lie in the first category because "corresponding" is not used"

  • Point 2 :: On the other hand, the spec separately and clearly addresses a "mismatch" between the CS and the type of the corresponding argument. So, that is a different case altogether.

Now, if both cases are combined, it's hard to tell which condition causes the UB, but it is UB for more than one reason, for sure.

Example:

   printf("%D", str);

with str declared as in the question.
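The grammar in p4 can be approximated by a small checker that never looks at an argument, which is the point being made above. This is only a sketch under assumptions: the name has_c11_printf_shape is hypothetical, it checks the overall shape only, and it does not diagnose flag or length-modifier combinations that the standard separately makes undefined (such as a length modifier used with a specifier it does not apply to).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if spec is exactly one conversion
 * specification with the shape described in C11 7.21.6.1 p4, else 0. */
int has_c11_printf_shape(const char *spec)
{
    const char *p = spec;

    assert(spec != NULL);

    if (*p++ != '%')
        return 0;

    /* Zero or more flags (p6). */
    while (*p != '\0' && strchr("-+ #0", *p) != NULL)
        p++;

    /* Optional minimum field width: '*' or a decimal integer. */
    if (*p == '*')
        p++;
    else
        while (*p >= '0' && *p <= '9')
            p++;

    /* Optional precision: '.' followed by '*' or an optional integer. */
    if (*p == '.') {
        p++;
        if (*p == '*')
            p++;
        else
            while (*p >= '0' && *p <= '9')
                p++;
    }

    /* Optional length modifier (p7): hh, h, l, ll, j, z, t, L. */
    if (*p == 'h' || *p == 'l') {
        char c = *p++;
        if (*p == c)
            p++;
    } else if (*p != '\0' && strchr("jztL", *p) != NULL) {
        p++;
    }

    /* Conversion specifier character (p8). */
    if (*p == '\0' || strchr("diouxXfFeEgGaAcspn%", *p) == NULL)
        return 0;

    /* p8: for '%', the complete specification shall be "%%". */
    if (*p == '%' && p != spec + 1)
        return 0;

    /* Nothing may follow the conversion specifier. */
    return p[1] == '\0';
}
```

Under this sketch, "%d" and "%-08.3lld" are accepted while "%S" and "%D" are rejected. Note that "%d" is accepted no matter what argument would later be passed, so printf("%d", str) fails only the second sentence of p9.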



Source: https://stackoverflow.com/questions/45588405/what-is-exactly-an-invalid-conversion-specification
