Is ‘int main;’ a valid C/C++ program?

小蘑菇 2020-12-23 15:32

I ask because my compiler seems to think so, even though I don’t.

echo 'int main;' | cc -x c - -Wall
echo 'int main;' | c++ -x c++ - -Wall

9 answers
  • 2020-12-23 16:03

    This is only a warning because it is not technically disallowed. The startup code takes the address of the symbol "main" and jumps to it with the three standard arguments (argc, argv and envp). It does not check, and at link time cannot check, that the symbol is actually a function, or that it takes those arguments. This is also why int main(int argc, char **argv) works: the compiler doesn't know about the envp argument, it just happens not to be used, and the calling convention is caller-cleanup.

    As a joke, you could do something like

    int main = 0xCBCBCBCB;
    

    on an x86 machine and, ignoring warnings and similar stuff, it will not just compile but actually work too.

    Somebody used a technique similar to this to write an executable (sort of) that runs on multiple architectures directly - http://phrack.org/issues/57/17.html#article . It was also used to win the IOCCC - http://www.ioccc.org/1984/mullender/mullender.c .

  • 2020-12-23 16:03

    Is it a valid program?

    No.

    It is not a program as it has no executable parts.

    Is it valid to compile?

    Yes.

    Can it be used with a valid program?

    Yes.

    Not all compiled code is required to be executable to be valid. Examples are static and dynamic libraries.

    You have effectively built an object file. It is not a valid executable; however, another program could link to the object main in the resulting file by loading it at runtime.

    Should this be an error?

    Traditionally, C++ allows the user to do things that may seem like they have no valid use but that fit with the syntax of the language.

    I mean that sure, this could be reclassified as an error, but why? What purpose would that serve that the warning does not?

    So long as there is a theoretical possibility of this functionality being used in actual code, it is very unlikely that having a non-function object called main would result in an error according to the language.

  • 2020-12-23 16:04

    For C, so far, it is implementation-defined behavior.

    As ISO/IEC 9899 says:

    5.1.2.2.1 Program startup

    1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

    int main(void) { /* ... */ }

    or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

    int main(int argc, char *argv[]) { /* ... */ }

    or equivalent; or in some other implementation-defined manner.

  • 2020-12-23 16:05

    No, this is not a valid program.

    For C++ this was recently explicitly made ill-formed by defect report 1886: Language linkage for main() which says:

    There does not appear to be any restriction on giving main() an explicit language linkage, but it should probably be either ill-formed or conditionally-supported.

    and part of the resolution included the following change:

    A program that declares a variable main at global scope or that declares the name main with C language linkage (in any namespace) is ill-formed.

    We can find this wording in the latest C++ draft standard N4527, which is the C++1z draft.

    The latest versions of both clang and gcc now make this an error (see it live):

    error: main cannot be declared as global variable
    int main;
    ^
    

    Before this defect report, it was undefined behavior, which does not require a diagnostic. Ill-formed code, on the other hand, requires a diagnostic; the compiler can make this either a warning or an error.

  • 2020-12-23 16:08

    My point, I suppose, is that I really think this should be an error in a hosted environment, eh?

    The error is yours. You didn't specify a function named main that returns an int and tried to use your program in a hosted environment.

    Suppose you have a compilation unit that defines a global variable named main. This might well be legal in a freestanding environment because what constitutes a program is left up to the implementation in freestanding environments.

    Suppose you have another compilation unit that defines a global function named main that returns an int and takes no arguments. This is exactly what a program in a hosted environment needs.

    Everything's fine if you only use the first compilation unit in a freestanding environment and only use the second in a hosted environment. What if you use both in one program? In C++, you've violated the one definition rule. That is undefined behavior. In C, you've violated the rule that dictates that all references to a single symbol must be consistent; if they aren't it's undefined behavior. Undefined behavior is a "get out of jail, free!" card to developers of an implementation. Anything an implementation does in response to undefined behavior is compliant with the standard. The implementation doesn't have to warn about, let alone detect, undefined behavior.

    What if you use only one of those compilation units, but you use the wrong one (which is what you did)? In C, the situation is clear-cut. Failure to define the function main in one of the two standard forms in a hosted environment is undefined behavior. Suppose you didn't define main at all. The compiler/linker doesn't have to say a thing about this error. That they do complain is a nicety on their part. That the C program compiled and linked without error is your fault, not the compiler's.

    It's a bit less clear in C++ because failure to define the function main in a hosted environment is an error rather than undefined behavior (in other words, it must be diagnosed). However, the one definition rule in C++ means linkers can be rather dumb. The linker's job is resolving external references, and thanks to the one definition rule, the linker doesn't have to know what those symbols mean. You provided a symbol named main, the linker is expecting to see a symbol named main, so all is good as far as the linker is concerned.

  • 2020-12-23 16:17

    Since the question is double-tagged as C and C++, the reasoning for C++ and C would be different:

    • C++ uses name mangling to help the linker distinguish between textually identical symbols of different types, e.g. a global variable xyz and a free-standing global function xyz(int). However, the name main is never mangled.
    • C does not use mangling, so it is possible for a program to confuse the linker by providing a symbol of one kind in place of a different symbol, and have the program link successfully.

    That is what's going on here: the linker expects to find the symbol main, and it does. It "wires" that symbol as if it were a function, because it does not know any better. The portion of the runtime library that passes control to main asks the linker for main, so the linker gives it the symbol main, letting the link phase complete. Of course this fails at runtime, because main is not a function.

    Here is another illustration of the same issue:

    file x.c:

    #include <stdio.h>
    int foo(); // <<== main() expects this
    int main(){
        printf("%p\n", (void*)&foo);
        return 0;
    }
    

    file y.c:

    int foo; // <<== external definition supplies a symbol of a wrong kind
    

    compiling:

    gcc x.c y.c
    

    This compiles, and it would probably run, but it's undefined behavior, because the type of the symbol promised to the compiler is different from the actual symbol supplied to the linker.

    As far as the warning goes, I think it is reasonable: C lets you build libraries that have no main function, so the compiler frees up the name main for other uses if you need to define a variable main for some unknown reason.
