Why should I not include cpp files and instead use a header?

后端 未结 13 2340
天涯浪人
天涯浪人 2020-11-22 07:13

So I finished my first C++ programming assignment and received my grade. But according to the grading, I lost marks for including cpp files instead of compiling and li

13条回答
  •  再見小時候
    2020-11-22 07:31

    This is probably a more detailed answer than you wanted, but I think a decent explanation is justified.

    In C and C++, one source file is defined as one translation unit. By convention, header files hold function declarations, type definitions and class definitions. The actual function implementations reside in translation units, i.e .cpp files.

    The idea behind this is that functions and class/struct member functions are compiled and assembled once, then other functions can call that code from one place without making duplicates. Your functions are declared as "extern" implicitly.

    /* Function declaration, usually found in headers. */
    /* Implicitly 'extern', i.e the symbol is visible everywhere, not just locally.*/
    int add(int, int);
    
    /* function body, or function definition. */
    int add(int a, int b) 
    {
       return a + b;
    }
    

    If you want a function to be local for a translation unit, you define it as 'static'. What does this mean? It means that if you include source files with extern functions, you will get redefinition errors, because the compiler comes across the same implementation more than once. So, you want all your translation units to see the function declaration but not the function body.

    So how does it all get mashed together at the end? That is the linker's job. A linker reads all the object files which is generated by the assembler stage and resolves symbols. As I said earlier, a symbol is just a name. For example, the name of a variable or a function. When translation units which call functions or declare types do not know the implementation for those functions or types, those symbols are said to be unresolved. The linker resolves the unresolved symbol by connecting the translation unit which holds the undefined symbol together with the one which contains the implementation. Phew. This is true for all externally visible symbols, whether they are implemented in your code, or provided by an additional library. A library is really just an archive with reusable code.

    There are two notable exceptions. First, if you have a small function, you can make it inline. This means that the generated machine code does not generate an extern function call, but is literally concatenated in-place. Since they usually are small, the size overhead does not matter. You can imagine them to be static in the way they work. So it is safe to implement inline functions in headers. Function implementations inside a class or struct definition are also often inlined automatically by the compiler.

    The other exception is templates. Since the compiler needs to see the whole template type definition when instantiating them, it is not possible to decouple the implementation from the definition as with standalone functions or normal classes. Well, perhaps this is possible now, but getting widespread compiler support for the "export" keyword took a long, long time. So without support for 'export', translation units get their own local copies of instantiated templated types and functions, similar to how inline functions work. With support for 'export', this is not the case.

    For the two exceptions, some people find it "nicer" to put the implementations of inline functions, templated functions and templated types in .cpp files, and then #include the .cpp file. Whether this is a header or a source file doesn't really matter; the preprocessor does not care and is just a convention.

    A quick summary of the whole process from C++ code (several files) and to a final executable:

    • The preprocessor is run, which parses all the directives which starts with a '#'. The #include directive concatenates the included file with inferior, for example. It also does macro-replacement and token-pasting.
    • The actual compiler runs on the intermediate text file after the preprocessor stage, and emits assembler code.
    • The assembler runs on the assembly file and emits machine code, this is usually called an object file and follows the binary executable format of the operative system in question. For example, Windows uses the PE (portable executable format), while Linux uses the Unix System V ELF format, with GNU extensions. At this stage, symbols are still marked as undefined.
    • Finally, the linker is run. All the previous stages were run on each translation unit in order. However, the linker stage works on all the generated object files which were generated by the assembler. The linker resolves symbols and does a lot of magic like creating sections and segments, which is dependent on the target platform and binary format. Programmers aren't required to know this in general, but it surely helps in some cases.

    Again, this was definetely more than you asked for, but I hope the nitty-gritty details helps you to see the bigger picture.

提交回复
热议问题