问题
Quote from The C++ standard library: a tutorial and handbook:
The only portable way of using templates at the moment is to implement them in header files by using inline functions.
Why is this?
(Clarification: header files are not the only portable solution. But they are the most convenient portable solution.)
回答1:
Caveat: It is not necessary to put the implementation in the header file, see the alternative solution at the end of this answer.
Anyway, the reason your code is failing is that, when instantiating a template, the compiler creates a new class with the given template argument. For example:
template<typename T>
struct Foo
{
T bar;
void doSomething(T param) {/* do stuff using T */}
};
// somewhere in a .cpp
Foo<int> f;
When reading this line, the compiler will create a new class (let's call it FooInt
), which is equivalent to the following:
struct FooInt
{
int bar;
void doSomething(int param) {/* do stuff using int */}
}
Consequently, the compiler needs to have access to the implementation of the methods, to instantiate them with the template argument (in this case int
). If these implementations were not in the header, they wouldn't be accessible, and therefore the compiler wouldn't be able to instantiate the template.
A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.
Foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
#include "Foo.tpp"
Foo.tpp
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
This way, implementation is still separated from declaration, but is accessible to the compiler.
Alternative solution
Another solution is to keep the implementation separated, and explicitly instantiate all the template instances you'll need:
Foo.h
// no implementation
template <typename T> struct Foo { ... };
Foo.cpp
// implementation of Foo's methods
// explicit instantiations
template class Foo<int>;
template class Foo<float>;
// You will only be able to use Foo with int or float
If my explanation isn't clear enough, you can have a look at the C++ Super-FAQ on this subject.
回答2:
It's because of the requirement for separate compilation and because templates are instantiation-style polymorphism.
Lets get a little closer to concrete for an explanation. Say I've got the following files:
- foo.h
- declares the interface of
class MyClass<T>
- declares the interface of
- foo.cpp
- defines the implementation of
class MyClass<T>
- defines the implementation of
- bar.cpp
- uses
MyClass<int>
- uses
Separate compilation means I should be able to compile foo.cpp independently from bar.cpp. The compiler does all the hard work of analysis, optimization, and code generation on each compilation unit completely independently; we don't need to do whole-program analysis. It's only the linker that needs to handle the entire program at once, and the linker's job is substantially easier.
bar.cpp doesn't even need to exist when I compile foo.cpp, but I should still be able to link the foo.o I already had together with the bar.o I've only just produced, without needing to recompile foo.cpp. foo.cpp could even be compiled into a dynamic library, distributed somewhere else without foo.cpp, and linked with code they write years after I wrote foo.cpp.
"Instantiation-style polymorphism" means that the template MyClass<T>
isn't really a generic class that can be compiled to code that can work for any value of T
. That would add overhead such as boxing, needing to pass function pointers to allocators and constructors, etc. The intention of C++ templates is to avoid having to write nearly identical class MyClass_int
, class MyClass_float
, etc, but to still be able to end up with compiled code that is mostly as if we had written each version separately. So a template is literally a template; a class template is not a class, it's a recipe for creating a new class for each T
we encounter. A template cannot be compiled into code, only the result of instantiating the template can be compiled.
So when foo.cpp is compiled, the compiler can't see bar.cpp to know that MyClass<int>
is needed. It can see the template MyClass<T>
, but it can't emit code for that (it's a template, not a class). And when bar.cpp is compiled, the compiler can see that it needs to create a MyClass<int>
, but it can't see the template MyClass<T>
(only its interface in foo.h) so it can't create it.
If foo.cpp itself uses MyClass<int>
, then code for that will be generated while compiling foo.cpp, so when bar.o is linked to foo.o they can be hooked up and will work. We can use that fact to allow a finite set of template instantiations to be implemented in a .cpp file by writing a single template. But there's no way for bar.cpp to use the template as a template and instantiate it on whatever types it likes; it can only use pre-existing versions of the templated class that the author of foo.cpp thought to provide.
You might think that when compiling a template the compiler should "generate all versions", with the ones that are never used being filtered out during linking. Aside from the huge overhead and the extreme difficulties such an approach would face because "type modifier" features like pointers and arrays allow even just the built-in types to give rise to an infinite number of types, what happens when I now extend my program by adding:
- baz.cpp
- declares and implements
class BazPrivate
, and usesMyClass<BazPrivate>
- declares and implements
There is no possible way that this could work unless we either
- Have to recompile foo.cpp every time we change any other file in the program, in case it added a new novel instantiation of
MyClass<T>
- Require that baz.cpp contains (possibly via header includes) the full template of
MyClass<T>
, so that the compiler can generateMyClass<BazPrivate>
during compilation of baz.cpp.
Nobody likes (1), because whole-program-analysis compilation systems take forever to compile , and because it makes it impossible to distribute compiled libraries without the source code. So we have (2) instead.
回答3:
Plenty correct answers here, but I wanted to add this (for completeness):
If you, at the bottom of the implementation cpp file, do explicit instantiation of all the types the template will be used with, the linker will be able to find them as usual.
Edit: Adding example of explicit template instantiation. Used after the template has been defined, and all member functions has been defined.
template class vector<int>;
This will instantiate (and thus make available to the linker) the class and all its member functions (only). Similar syntax works for template functions, so if you have non-member operator overloads you may need to do the same for those.
The above example is fairly useless since vector is fully defined in headers, except when a common include file (precompiled header?) uses extern template class vector<int>
so as to keep it from instantiating it in all the other (1000?) files that use vector.
回答4:
Templates need to be instantiated by the compiler before actually compiling them into object code. This instantiation can only be achieved if the template arguments are known. Now imagine a scenario where a template function is declared in a.h
, defined in a.cpp
and used in b.cpp
. When a.cpp
is compiled, it is not necessarily known that the upcoming compilation b.cpp
will require an instance of the template, let alone which specific instance would that be. For more header and source files, the situation can quickly get more complicated.
One can argue that compilers can be made smarter to "look ahead" for all uses of the template, but I'm sure that it wouldn't be difficult to create recursive or otherwise complicated scenarios. AFAIK, compilers don't do such look aheads. As Anton pointed out, some compilers support explicit export declarations of template instantiations, but not all compilers support it (yet?).
回答5:
Actually, prior to C++11 the standard defined the export
keyword that would make it possible to declare templates in a header file and implement them elsewhere.
None of the popular compilers implemented this keyword. The only one I know about is the frontend written by the Edison Design Group, which is used by the Comeau C++ compiler. All others required you to write templates in header files, because the compiler needs the template definition for proper instantiation (as others pointed out already).
As a result, the ISO C++ standard committee decided to remove the export
feature of templates with C++11.
回答6:
Although standard C++ has no such requirement, some compilers require that all function and class templates need to be made available in every translation unit they are used. In effect, for those compilers, the bodies of template functions must be made available in a header file. To repeat: that means those compilers won't allow them to be defined in non-header files such as .cpp files
There is an export keyword which is supposed to mitigate this problem, but it's nowhere close to being portable.
回答7:
Templates must be used in headers because the compiler needs to instantiate different versions of the code, depending on the parameters given/deduced for template parameters. Remember that a template doesn't represent code directly, but a template for several versions of that code.
When you compile a non-template function in a .cpp
file, you are compiling a concrete function/class. This is not the case for templates, which can be instantiated with different types, namely, concrete code must be emitted when replacing template parameters with concrete types.
There was a feature with the export
keyword that was meant to be used for separate compilation.
The export
feature is deprecated in C++11
and, AFAIK, only one compiler implemented it. You shouldn't make use of export
. Separate compilation is not possible in C++
or C++11
but maybe in C++17
, if concepts make it in, we could have some way of separate compilation.
For separate compilation to be achieved, separate template body checking must be possible. It seems that a solution is possible with concepts. Take a look at this paper recently presented at the standards commitee meeting. I think this is not the only requirement, since you still need to instantiate code for the template code in user code.
The separate compilation problem for templates I guess it's also a problem that is arising with the migration to modules, which is currently being worked.
EDIT: As of August 2020 Modules are already a reality for C++: https://en.cppreference.com/w/cpp/language/modules
回答8:
Even though there are plenty of good explanations above, I'm missing a practical way to separate templates into header and body.
My main concern is avoiding recompilation of all template users, when I change its definition.
Having all template instantiations in the template body is not a viable solution for me, since the template author may not know all if its usage and the template user may not have the right to modify it.
I took the following approach, which works also for older compilers (gcc 4.3.4, aCC A.03.13).
For each template usage there's a typedef in its own header file (generated from the UML model). Its body contains the instantiation (which ends up in a library which is linked in at the end).
Each user of the template includes that header file and uses the typedef.
A schematic example:
MyTemplate.h:
#ifndef MyTemplate_h
#define MyTemplate_h 1
template <class T>
class MyTemplate
{
public:
MyTemplate(const T& rt);
void dump();
T t;
};
#endif
MyTemplate.cpp:
#include "MyTemplate.h"
#include <iostream>
template <class T>
MyTemplate<T>::MyTemplate(const T& rt)
: t(rt)
{
}
template <class T>
void MyTemplate<T>::dump()
{
cerr << t << endl;
}
MyInstantiatedTemplate.h:
#ifndef MyInstantiatedTemplate_h
#define MyInstantiatedTemplate_h 1
#include "MyTemplate.h"
typedef MyTemplate< int > MyInstantiatedTemplate;
#endif
MyInstantiatedTemplate.cpp:
#include "MyTemplate.cpp"
template class MyTemplate< int >;
main.cpp:
#include "MyInstantiatedTemplate.h"
int main()
{
MyInstantiatedTemplate m(100);
m.dump();
return 0;
}
This way only the template instantiations will need to be recompiled, not all template users (and dependencies).
回答9:
It means that the most portable way to define method implementations of template classes is to define them inside the template class definition.
template < typename ... >
class MyClass
{
int myMethod()
{
// Not just declaration. Add method implementation here
}
};
回答10:
Just to add something noteworthy here. One can define methods of a templated class just fine in the implementation file when they are not function templates.
myQueue.hpp:
template <class T>
class QueueA {
int size;
...
public:
template <class T> T dequeue() {
// implementation here
}
bool isEmpty();
...
}
myQueue.cpp:
// implementation of regular methods goes like this:
template <class T> bool QueueA<T>::isEmpty() {
return this->size == 0;
}
main()
{
QueueA<char> Q;
...
}
回答11:
The compiler will generate code for each template instantiation when you use a template during the compilation step. In the compilation and linking process .cpp files are converted to pure object or machine code which in them contains references or undefined symbols because the .h files that are included in your main.cpp have no implementation YET. These are ready to be linked with another object file that defines an implementation for your template and thus you have a full a.out executable.
However since templates need to be processed in the compilation step in order to generate code for each template instantiation that you define, so simply compiling a template separate from it's header file won't work because they always go hand and hand, for the very reason that each template instantiation is a whole new class literally. In a regular class you can separate .h and .cpp because .h is a blueprint of that class and the .cpp is the raw implementation so any implementation files can be compiled and linked regularly, however using templates .h is a blueprint of how the class should look not how the object should look meaning a template .cpp file isn't a raw regular implementation of a class, it's simply a blueprint for a class, so any implementation of a .h template file can't be compiled because you need something concrete to compile, templates are abstract in that sense.
Therefore templates are never separately compiled and are only compiled wherever you have a concrete instantiation in some other source file. However, the concrete instantiation needs to know the implementation of the template file, because simply modifying the typename T
using a concrete type in the .h file is not going to do the job because what .cpp is there to link, I can't find it later on because remember templates are abstract and can't be compiled, so I'm forced to give the implementation right now so I know what to compile and link, and now that I have the implementation it gets linked into the enclosing source file. Basically, the moment I instantiate a template I need to create a whole new class, and I can't do that if I don't know how that class should look like when using the type I provide unless I make notice to the compiler of the template implementation, so now the compiler can replace T
with my type and create a concrete class that's ready to be compiled and linked.
To sum up, templates are blueprints for how classes should look, classes are blueprints for how an object should look. I can't compile templates separate from their concrete instantiation because the compiler only compiles concrete types, in other words, templates at least in C++, is pure language abstraction. We have to de-abstract templates so to speak, and we do so by giving them a concrete type to deal with so that our template abstraction can transform into a regular class file and in turn, it can be compiled normally. Separating the template .h file and the template .cpp file is meaningless. It is nonsensical because the separation of .cpp and .h only is only where the .cpp can be compiled individually and linked individually, with templates since we can't compile them separately, because templates are an abstraction, therefore we are always forced to put the abstraction always together with the concrete instantiation where the concrete instantiation always has to know about the type being used.
Meaning typename T
get's replaced during the compilation step not the linking step so if I try to compile a template without T
being replaced as a concrete value type that is completely meaningless to the compiler and as a result object code can't be created because it doesn't know what T
is.
It is technically possible to create some sort of functionality that will save the template.cpp file and switch out the types when it finds them in other sources, I think that the standard does have a keyword export
that will allow you to put templates in a separate cpp file but not that many compilers actually implement this.
Just a side note, when making specializations for a template class, you can separate the header from the implementation because a specialization by definition means that I am specializing for a concrete type that can be compiled and linked individually.
回答12:
That is exactly correct because the compiler has to know what type it is for allocation. So template classes, functions, enums,etc.. must be implemented as well in the header file if it is to be made public or part of a library (static or dynamic) because header files are NOT compiled unlike the c/cpp files which are. If the compiler doesn't know the type is can't compile it. In .Net it can because all objects derive from the Object class. This is not .Net.
回答13:
If the concern is the extra compilation time and binary size bloat produced by compiling the .h as part of all the .cpp modules using it, in many cases what you can do is make the template class descend from a non-templatized base class for non type-dependent parts of the interface, and that base class can have its implementation in the .cpp file.
回答14:
A way to have separate implementation is as follows.
//inner_foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
//foo.tpp
#include "inner_foo.h"
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
//foo.h
#include <foo.tpp>
//main.cpp
#include <foo.h>
inner_foo has the forward declarations. foo.tpp has the implementation and include inner_foo.h; and foo.h will have just one line, to include foo.tpp.
On compile time, contents of foo.h are copied to foo.tpp and then the whole file is copied to foo.h after which it compiles. This way, there is no limitations, and the naming is consistent, in exchange for one extra file.
I do this because static analyzers for the code break when it does not see the forward declarations of class in *.tpp. This is annoying when writing code in any IDE or using YouCompleteMe or others.
回答15:
I suggest looking at this gcc page which discusses the tradeoffs between the "cfront" and "borland" model for template instantiations.
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Template-Instantiation.html
The "borland" model corresponds to what the author suggests, providing the full template definition, and having things compiled multiple times.
It contains explicit recommendations concerning using manual and automatic template instantiation. For example, the "-repo" option can be used to collect templates which need to be instantiated. Or another option is to disable automatic template instantiations using "-fno-implicit-templates" to force manual template instantiation.
In my experience, I rely on the C++ Standard Library and Boost templates being instantiated for each compilation unit (using a template library). For my large template classes, I do manual template instantiation, once, for the types I need.
This is my approach because I am providing a working program, not a template library for use in other programs. The author of the book, Josuttis, works a lot on template libraries.
If I was really worried about speed, I suppose I would explore using Precompiled Headers https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
which is gaining support in many compilers. However, I think precompiled headers would be difficult with template header files.
回答16:
Another reason that it's a good idea to write both declarations and definitions in header files is for readability. Suppose there's such a template function in Utility.h:
template <class T>
T min(T const& one, T const& theOther);
And in the Utility.cpp:
#include "Utility.h"
template <class T>
T min(T const& one, T const& other)
{
return one < other ? one : other;
}
This requires every T class here to implement the less than operator (<). It will throw a compiler error when you compare two class instances that haven't implemented the "<".
Therefore if you separate the template declaration and definition, you won't be able to only read the header file to see the ins and outs of this template in order to use this API on your own classes, though the compiler will tell you in this case about which operator needs to be overridden.
回答17:
You can actually define your template class inside a .template file rather than a .cpp file. Whoever is saying you can only define it inside a header file is wrong. This is something that works all the way back to c++ 98.
Don't forget to have your compiler treat your .template file as a c++ file to keep the intelli sense.
Here is an example of this for a dynamic array class.
#ifndef dynarray_h
#define dynarray_h
#include <iostream>
template <class T>
class DynArray{
int capacity_;
int size_;
T* data;
public:
explicit DynArray(int size = 0, int capacity=2);
DynArray(const DynArray& d1);
~DynArray();
T& operator[]( const int index);
void operator=(const DynArray<T>& d1);
int size();
int capacity();
void clear();
void push_back(int n);
void pop_back();
T& at(const int n);
T& back();
T& front();
};
#include "dynarray.template" // this is how you get the header file
#endif
Now inside you .template file you define your functions just how you normally would.
template <class T>
DynArray<T>::DynArray(int size, int capacity){
if (capacity >= size){
this->size_ = size;
this->capacity_ = capacity;
data = new T[capacity];
}
// for (int i = 0; i < size; ++i) {
// data[i] = 0;
// }
}
template <class T>
DynArray<T>::DynArray(const DynArray& d1){
//clear();
//delete [] data;
std::cout << "copy" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
}
template <class T>
DynArray<T>::~DynArray(){
delete [] data;
}
template <class T>
T& DynArray<T>::operator[]( const int index){
return at(index);
}
template <class T>
void DynArray<T>::operator=(const DynArray<T>& d1){
if (this->size() > 0) {
clear();
}
std::cout << "assign" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
//delete [] d1.data;
}
template <class T>
int DynArray<T>::size(){
return size_;
}
template <class T>
int DynArray<T>::capacity(){
return capacity_;
}
template <class T>
void DynArray<T>::clear(){
for( int i = 0; i < size(); ++i){
data[i] = 0;
}
size_ = 0;
capacity_ = 2;
}
template <class T>
void DynArray<T>::push_back(int n){
if (size() >= capacity()) {
std::cout << "grow" << std::endl;
//redo the array
T* copy = new T[capacity_ + 40];
for (int i = 0; i < size(); ++i) {
copy[i] = data[i];
}
delete [] data;
data = new T[ capacity_ * 2];
for (int i = 0; i < capacity() * 2; ++i) {
data[i] = copy[i];
}
delete [] copy;
capacity_ *= 2;
}
data[size()] = n;
++size_;
}
template <class T>
void DynArray<T>::pop_back(){
data[size()-1] = 0;
--size_;
}
template <class T>
T& DynArray<T>::at(const int n){
if (n >= size()) {
throw std::runtime_error("invalid index");
}
return data[n];
}
template <class T>
T& DynArray<T>::back(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[size()-1];
}
template <class T>
T& DynArray<T>::front(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[0];
}
来源:https://stackoverflow.com/questions/57486243/undefined-symbols-for-architecture-x86-64-clion