Explicit direct #include vs. Non-contractual transitive #include

傲寒 2021-02-12 12:47

Say we have this header file:

MyClass.hpp

#pragma once
#include <vector>

class MyClass
{
public:
    MyClass(double);

    /* ... */

private:
            


        
6 Answers
  • 2021-02-12 13:06

    As others have said, it is safer to directly include the files you use, in terms of being protected from future changes to the file you're relying on to forward it.

    It is also generally considered cleaner to have your dependencies immediately there. If you want to check what this "MyClass" object is, you want to just scroll to the top and ask your IDE to take you to the relevant header.

    It's worth noting that it's safe to include the same standard header multiple times; the standard library guarantees this. In practice, that means that the implementation of a header such as <vector> (in, say, clang's libc++) will start with an #include guard. Modern compilers are so familiar with the include guard idiom (especially as applied by their own standard library implementations) that they can avoid even loading the file a second time. So the only thing that you lose in exchange for that safety and clarity is having to type an extra dozen or so letters.
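
    A minimal sketch of the include guard idiom, flattened into a single file for illustration. The header name number_pack.hpp, the macro NUMBER_PACK_HPP and the function firstThree are all hypothetical; the guarded block is pasted twice to mimic what the preprocessor sees when the header is included twice.

```cpp
// Including the same standard header twice is harmless.
#include <vector>
#include <vector>

// Contents of a hypothetical number_pack.hpp, pasted twice to mimic what
// the preprocessor sees when that header is included twice.
#ifndef NUMBER_PACK_HPP
#define NUMBER_PACK_HPP
inline std::vector<int> firstThree() { return {1, 2, 3}; }
#endif  // NUMBER_PACK_HPP

#ifndef NUMBER_PACK_HPP  // already defined above, so this copy is skipped
#define NUMBER_PACK_HPP
inline std::vector<int> firstThree() { return {1, 2, 3}; }
#endif  // NUMBER_PACK_HPP
```

    Without the guard, the second copy would be rejected as a redefinition of firstThree within the same translation unit.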

    Having agreed with everyone else on all of that, I have re-read your question and I don't think it was actually "Should I do this?" so much as "Why am I even allowed to not do this?" or "Why doesn't the compiler insulate me from my includes' includes?"

    There is one important exception to the "directly include what you use" rule: headers which, as part of their specification, include additional headers. For example, <iostream> (which is of course itself part of the standard library) is guaranteed as of C++11 to include <istream> and <ostream>. One might ask "why not just move the contents of <istream> and <ostream> into <iostream> directly?", but there are clarity and compilation speed advantages to being able to include only one of them when only one is needed. (And, no doubt, for C++ there are historical reasons too.) You can of course do this for your own headers as well. (It's more of an Objective-C thing, but Objective-C has the same include mechanics and conventionally uses them for umbrella headers, whose sole job is to include other files.)

    There is another fundamental reason why the headers that your includes include get included as well: in general, your headers don't make sense without them. Suppose that your MyClass.hpp file contains the following type alias:

    using NumberPack = std::vector<unsigned int>;
    

    and the following self-descriptive function

    NumberPack getFirstTenNumbers();
    

    Now suppose that another file includes MyClass.hpp and has the following.

    NumberPack counter = getFirstTenNumbers();
    for (auto c : counter) {
        std::cout << c << "\n";
    }
    

    What's going on here is that you may not want to write into your code that you're using <vector>. That is an implementation detail that you don't want to have to worry about. NumberPack could, as far as you're concerned, be implemented as some other container or an iterator or a generator type thing or something else, so long as it follows its spec. But the compiler needs to know what it actually is: it can't make effective use of parent dependencies without knowing what the grandparent dependency headers are. A side effect of that is that you get away with using them.
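
    A self-contained sketch of that situation, with both files flattened into one for illustration. The function body and the helper sumOfPack are hypothetical, and "first ten numbers" is assumed to mean 1 through 10. The consumer part never names std::vector, yet it only compiles because the header part brought <vector> along.

```cpp
// --- MyClass.hpp (sketch) ---
#include <vector>  // needed here: NumberPack is defined in terms of it

using NumberPack = std::vector<unsigned int>;

// Assumption: "first ten numbers" means 1 through 10.
inline NumberPack getFirstTenNumbers() {
    NumberPack pack;
    for (unsigned int n = 1; n <= 10; ++n) {
        pack.push_back(n);
    }
    return pack;
}

// --- consumer code (sketch): never names std::vector ---
// Hypothetical helper; it relies only on NumberPack being iterable.
inline unsigned int sumOfPack(const NumberPack& pack) {
    unsigned int total = 0;
    for (auto c : pack) {
        total += c;
    }
    return total;
}
```

    If NumberPack were later redefined as some other iterable container, sumOfPack would not need to change, which is the point of treating <vector> as an implementation detail here.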

    Or, of course, the third reason is just "Because that's not C++." Yes, one could have a language in which second-generation dependencies were not passed down, or in which you had to request them expressly. It's just that it would be a different language, and in particular it would not fit the old text-inclusion style of C++ and its relatives.

  • 2021-02-12 13:11

    If your MyClass has a member of type std::vector<double> then the header that defines MyClass needs to #include <vector>. Otherwise, the only way users of MyClass can compile is if they #include <vector> before including the definition of MyClass.

    Although the member is private, it is still part of the class, so the compiler needs to see a complete type definition. Otherwise, it cannot do things such as compute sizeof(MyClass), or instantiate any MyClass objects.
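
    A small sketch of why the complete type is needed, assuming a hypothetical member name data_ and a hypothetical count() accessor. The static_assert shows that the private vector is stored inside every MyClass object, so sizeof(MyClass) cannot be computed without the definition of std::vector.

```cpp
#include <cstddef>
#include <vector>

class MyClass {
public:
    MyClass(double first_value) : data_{first_value} {}
    std::size_t count() const { return data_.size(); }

private:
    std::vector<double> data_;  // hypothetical member name
};

// The private member still contributes to the object's size and layout.
static_assert(sizeof(MyClass) >= sizeof(std::vector<double>),
              "the private vector is stored inside every MyClass object");
```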

    If you want to break the dependency between your header and <vector> there are techniques. For example, the pimpl ("pointer to implementation") idiom.

    class MyClass 
    {
    public:
        MyClass(double first_value);
    
        /* ... */
    
    private:
        void *pimpl;
    };
    

    and, in the source file that defines members of the class:

    #include <vector>
    #include "MyClass.hpp"
    
    MyClass::MyClass(double first_value) : pimpl(new std::vector<double>())
    {
    
    }
    

    (and also, presumably, do something with first_value, but I have omitted that).

    The tradeoff is that every member function that needs to use the vector needs to obtain it from the pimpl. For example, if you want to get a reference to the allocated vector

    void MyClass::some_member_function()
    {
        std::vector<double> &internal_data = *static_cast<std::vector<double> *>(pimpl);
    
    }
    

    The destructor of MyClass will also need to release the dynamically allocated vector.
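
    One possible variation on the void * approach, sketched with both files flattened into one. It swaps in a forward-declared Impl struct held by std::unique_ptr, which is not what the code above does; the names Impl, pimpl_, add and count are hypothetical. The destructor is declared in the header part and defined in the source part, where Impl is a complete type, so the unique_ptr can release the vector.

```cpp
// --- MyClass.hpp (sketch): no #include <vector> needed here ---
#include <cstddef>
#include <memory>

class MyClass {
public:
    MyClass(double first_value);
    ~MyClass();  // defined below, where Impl is a complete type

    void add(double value);
    std::size_t count() const;

private:
    struct Impl;                   // forward declaration only
    std::unique_ptr<Impl> pimpl_;  // owns the hidden implementation
};

// --- MyClass.cpp (sketch): the only place that sees <vector> ---
#include <vector>

struct MyClass::Impl {
    std::vector<double> values;
};

MyClass::MyClass(double first_value) : pimpl_(new Impl) {
    pimpl_->values.push_back(first_value);
}

MyClass::~MyClass() = default;  // unique_ptr deletes the Impl here

void MyClass::add(double value) { pimpl_->values.push_back(value); }

std::size_t MyClass::count() const { return pimpl_->values.size(); }
```

    Compared with void *, this avoids the static_cast in every member function, at the cost of including <memory> in the header.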

    This also limits some options for the class definition. For example, MyClass cannot have a member function that returns a std::vector<double> by value (unless you #include <vector>).

    You'll need to decide if techniques like the pimpl idiom are worth the effort to make your class work. Personally, unless there is some other compelling reason to separate the class implementation from the class using the pimpl idiom, I would simply accept the need for #include <vector> in your header file.

  • 2021-02-12 13:16

    to prevent a dependency on the internal workings of MyClass. Or should I?

    Yes, you should, and pretty much for that reason. Unless you want to specify that MyClass.hpp is guaranteed to include <vector>, you cannot rely on one including the other. And there is no good reason to be forced to provide such a guarantee. If there is no such guarantee, then you rely on an implementation detail of MyClass.hpp that may change in the future, which would break your code.

    I obviously realise that MyClass needs vector to work.

    Does it? Couldn't it use for example boost::container::small_vector instead?

    In this example MyClass needs std::vector

    But what about the needs of MyClass in the future? Programs evolve, and what a class needs today is not always the same as what it needs tomorrow.

    But would it not be good to be able to decide which headers get exposed when importing

    Preventing transitive inclusion is not possible.

    Modules, introduced in C++20, are a feature that may be used instead of preprocessor inclusion and are intended to help solve this.

    Right now, you can avoid including any implementation detail dependencies by using the PIMPL pattern ("Pointer to implementation"). But PIMPL introduces a layer of indirection and, more significantly, requires dynamic allocation, which has performance implications. Depending on context, these implications may be negligible or significant.

  • 2021-02-12 13:18

    Yes, the using file should include <vector> explicitly, as that is a dependency it needs.

    However, I wouldn't fret. If someone refactors MyClass.hpp to remove the <vector> include, the compiler will point them at every single file that lacked the explicit <vector> include and relied on the implicit one. It is usually a no-brainer to fix this type of error, and once the code compiles again, some of the missing explicit includes will have been fixed.

    In the end, the compiler is much more efficient at spotting missing includes than any human being.

  • 2021-02-12 13:22

    You should use explicit #includes to keep your workflow non-destructive. Let's say that MyClass is used in 50 different source files, none of which include <vector>. Suddenly, you have to replace std::vector in MyClass.h with some other container. Then either all 50 source files will need to include <vector>, or you will need to leave the include in MyClass.h. Leaving it there would be redundant, and it could unnecessarily increase application size, compilation time and even run time (static variable initialization).

  • 2021-02-12 13:24

    Consider that code is not just to be written once but it evolves over time.

    Let's assume you wrote the code and my task now is to refactor it. For some reason I want to replace MyClass with YourClass, and let's say they have the same interface. I would simply have to replace every occurrence of MyClass with YourClass to arrive at this:

    /* Version 1: SomeOtherHeader.hpp */
    
    #pragma once
    #include "YourClass.hpp"
    
    void func(const YourClass& a, const std::vector<double>& b);
    

    I did everything correctly, but the code would still fail to compile (because YourClass.hpp does not include <vector>). In this particular example I would get a clear error message and the fix would be obvious. However, things can get messy rather fast if such dependencies span several headers, if there are many such dependencies, and if SomeOtherHeader.hpp contains more than just a single declaration.

    There are more things that can go wrong. For example, the author of MyClass could decide that they can actually drop the include in favor of a forward declaration. Then SomeOtherHeader will break as well. It boils down to this: if you do not include <vector> in SomeOtherHeader, then there is a hidden dependency, which is bad.

    The rule of thumb to prevent such problems is: Include what you use.
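
    A sketch of SomeOtherHeader.hpp following that rule, flattened into one file. The YourClass defined here is only a stand-in for #include "YourClass.hpp", its factor() member is hypothetical, and func is given a hypothetical body and a double return value (unlike the void declaration above) purely so that the sketch has an observable result.

```cpp
// --- SomeOtherHeader.hpp (sketch) ---
#include <vector>  // std::vector is named below, so include it directly

// Stand-in for #include "YourClass.hpp"; this YourClass deliberately
// pulls in no standard headers at all.
class YourClass {
public:
    double factor() const { return 2.0; }  // hypothetical member
};

// Hypothetical body: scales the sum of b by a.factor().
inline double func(const YourClass& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (double x : b) {
        sum += x;
    }
    return a.factor() * sum;
}
```

    Because <vector> is included directly, this header keeps compiling no matter which headers YourClass.hpp happens to include.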
