Which tokens does the standard require an implementation to accept in an #include directive? For example, are spaces allowed in file names?
From cppreference, on Source file inclusion:
Any preprocessing tokens (macro constants or expressions) are permitted as arguments to #include and __has_include (since C++17) as long as they expand to a sequence of characters surrounded by < > or "".
Then, under Explanation:
Searches for the file in implementation-defined manner. The intent of this syntax is to search for the files that are not controlled by the implementation.
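As a rough illustration of the first quote, the sketch below passes a macro to both __has_include and #include. It is written in C and assumes a compiler that provides __has_include (it is standard in C++17 and C23, and GCC and Clang also offer it as an extension in earlier modes). The names OPTIONAL_HEADER, HAVE_OPTIONAL_HEADER and have_optional_header are made up for the example.

```c
/* OPTIONAL_HEADER is a hypothetical macro; it expands to the <...> form. */
#define OPTIONAL_HEADER <stdint.h>

#ifdef __has_include
#  if __has_include(OPTIONAL_HEADER)
#    include OPTIONAL_HEADER
#    define HAVE_OPTIONAL_HEADER 1
#  endif
#endif

#ifndef HAVE_OPTIONAL_HEADER
#  define HAVE_OPTIONAL_HEADER 0   /* header missing, or no __has_include */
#endif

/* Reports whether the optional header was found and included. */
int have_optional_header(void)
{
    return HAVE_OPTIONAL_HEADER;
}
```

Whether the function returns 1 or 0 depends on the compiler and on whether the header can be found, which is exactly the implementation-defined part.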
Furthermore, both the C++20 final working draft (5.8 Header names [lex.header]) and ISO/IEC 9899:1999 (6.4.7 Header names) allow any member of the source character set in a header name except new-line and the closing delimiter (> or "):
header-name:
    < h-char-sequence >
    " q-char-sequence "
h-char-sequence:
    h-char
    h-char-sequence h-char
h-char:
    any member of the source character set except new-line and >
q-char-sequence:
    q-char
    q-char-sequence q-char
q-char:
    any member of the source character set except new-line and "
(This answer is for C, with correct citations and quotes. It does not cover C++.)
C 2018 6.10.2 paragraphs 2 to 4 say:
2 A preprocessing directive of the form
# include < h-char-sequence > new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
3 A preprocessing directive of the form
# include " q-char-sequence " new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# include < h-char-sequence > new-line
with the identical contained sequence (including > characters, if any) from the original directive.
4 A preprocessing directive of the form
# include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. (Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens.) The directive resulting after all replacements shall match one of the two previous forms. The method by which a sequence of preprocessing tokens between a < and a > preprocessing token pair or a pair of " characters is combined into a single header name preprocessing token is implementation-defined.
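To make paragraph 4 concrete, here is a minimal sketch of the pp-tokens form, where a macro expands to each of the two previous forms. The macro and function names (STRING_HEADER, LIMITS_HEADER, demo_length, demo_char_bit) are invented for illustration. How the tokens between < and > are combined into a header name is implementation-defined, so the first include relies on the compiler handling it the way GCC and Clang do.

```c
/* Hypothetical macros: each expands to one of the two standard forms. */
#define STRING_HEADER <string.h>   /* the < h-char-sequence > form */
#define LIMITS_HEADER "limits.h"   /* the " q-char-sequence " form */

#include STRING_HEADER             /* macro-replaced, then treated as <string.h> */
#include LIMITS_HEADER             /* source-file search first, then falls back to <limits.h> */

/* Uses a declaration from <string.h> to show the first include took effect. */
size_t demo_length(void)
{
    return strlen("abc");
}

/* Uses a macro from <limits.h> to show the second include took effect. */
int demo_char_bit(void)
{
    return CHAR_BIT;
}
```

The second include assumes there is no file named limits.h of your own in the places searched for the quoted form; if there is none, paragraph 3 says the directive is reprocessed as the angle-bracket form.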
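The grammar symbols h-char-sequence and q-char-sequence are defined in 6.4.7. An h-char-sequence is one or more members of the source character set other than new-line or >; in a header name it is preceded by < and terminated by >. A q-char-sequence is one or more members of the source character set other than new-line or ", enclosed in a pair of " characters. A space is a member of the source character set, so the grammar permits it in either form, although how the resulting name is mapped to a file is implementation-defined. However, the behavior is undefined if the characters ', \, //, or /* appear in either sequence or if " appears in an h-char-sequence.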
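The rules in 6.4.7 can be sketched as a small checker. This is only an illustration, not how any compiler does it: the names header_check and check_header_name are hypothetical, the input is the whole header name including its delimiters, and the check for membership in the source character set is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Result of checking the text of a header name, delimiters included. */
enum header_check {
    HEADER_INVALID,    /* does not match the header-name grammar */
    HEADER_UNDEFINED,  /* matches, but 6.4.7 makes the behavior undefined */
    HEADER_OK          /* matches and contains none of the flagged characters */
};

enum header_check check_header_name(const char *s)
{
    size_t len = strlen(s);
    char close;
    bool undefined = false;

    if (len < 3)                        /* two delimiters plus at least one char */
        return HEADER_INVALID;
    if (s[0] == '<')
        close = '>';
    else if (s[0] == '"')
        close = '"';
    else
        return HEADER_INVALID;
    if (s[len - 1] != close)
        return HEADER_INVALID;

    /* Walk the h-char-sequence or q-char-sequence between the delimiters. */
    for (size_t i = 1; i + 1 < len; i++) {
        char c = s[i];
        if (c == '\n' || c == close)    /* excluded by h-char / q-char */
            return HEADER_INVALID;
        if (c == '\'' || c == '\\')     /* ' and \ give undefined behavior */
            undefined = true;
        if (c == '"' && close == '>')   /* " inside < > gives undefined behavior */
            undefined = true;
        if (c == '/' && i + 2 < len && (s[i + 1] == '/' || s[i + 1] == '*'))
            undefined = true;           /* comment introducers // and the slash-star pair */
    }
    return undefined ? HEADER_UNDEFINED : HEADER_OK;
}
```

For example, check_header_name("<my header.h>") yields HEADER_OK because a space is an ordinary h-char, while check_header_name("<a\\b.h>") yields HEADER_UNDEFINED because of the backslash.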