After few weeks break, I\'m trying to expand and extend my knowlege of templates with the book Templates – The Complete Guide by David Vandevoorde and Nicolai
Directly copied from https://docs.microsoft.com/en-us/cpp/cpp/explicit-instantiation:
You can use explicit instantiation to create an instantiation of a templated class or function without actually using it in your code. Because this is useful when you are creating library (.lib) files that use templates for distribution, uninstantiated template definitions are not put into object (.obj) files.
(For instance, libstdc++ contains the explicit instantiation of std::basic_string<char,char_traits<char>,allocator<char> >
(which is std::string
) so every time you use functions of std::string
, the same function code doesn't need to be copied to objects. The compiler only need to refer (link) those to libstdc++.)
If you define a template class that you only want to work for a couple of explicit types.
Put the template declaration in the header file just like a normal class.
Put the template definition in a source file just like a normal class.
Then, at the end of the source file, explicitly instantiate only the version you want to be available.
Silly example:
// StringAdapter.h
template<typename T>
class StringAdapter
{
public:
StringAdapter(T* data);
void doAdapterStuff();
private:
std::basic_string<T> m_data;
};
typedef StringAdapter<char> StrAdapter;
typedef StringAdapter<wchar_t> WStrAdapter;
Source:
// StringAdapter.cpp
#include "StringAdapter.h"
template<typename T>
StringAdapter<T>::StringAdapter(T* data)
:m_data(data)
{}
template<typename T>
void StringAdapter<T>::doAdapterStuff()
{
/* Manipulate a string */
}
// Explicitly instantiate only the classes you want to be defined.
// In this case I only want the template to work with characters but
// I want to support both char and wchar_t with the same code.
template class StringAdapter<char>;
template class StringAdapter<wchar_t>;
Main
#include "StringAdapter.h"
// Note: Main can not see the definition of the template from here (just the declaration)
// So it relies on the explicit instantiation to make sure it links.
int main()
{
StrAdapter x("hi There");
x.doAdapterStuff();
}
Explicit instantiation allows to reduce compile times and object sizes
These are the major gains it can provide. They come from the following two effects described in detail in the sections below:
Remove definitions from headers
Explicit instantiation allows you to leave definitions in the .cpp file.
When the definition is on the header and you modify it, an intelligent build system would recompile all includers, which could be dozens of files, making compilation unbearably slow.
Putting definitions in .cpp files does have the downside that external libraries can't reuse the template with their own new classes, but "Remove definitions from included headers but also expose templates an external API" below shows a workaround.
See concrete examples below.
Object redefinition gains: understanding the problem
If you just completely define a template on a header file, every single compilation unit that includes that header ends up compiling its own implicit copy of the template for every different template argument usage made.
This means a lot of useless disk usage and compilation time.
Here is a concrete example, in which both main.cpp
and notmain.cpp
implicitly define MyTemplate<int>
due to its usage in those files.
main.cpp
#include <iostream>
#include "mytemplate.hpp"
#include "notmain.hpp"
int main() {
std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}
notmain.cpp
#include "mytemplate.hpp"
#include "notmain.hpp"
int notmain() { return MyTemplate<int>().f(1); }
mytemplate.hpp
#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP
template<class T>
struct MyTemplate {
T f(T t) { return t + 1; }
};
#endif
notmain.hpp
#ifndef NOTMAIN_HPP
#define NOTMAIN_HPP
int notmain();
#endif
GitHub upstream.
Compile and view symbols with nm
:
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o notmain.o notmain.cpp
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o main.o main.cpp
g++ -Wall -Wextra -std=c++11 -pedantic-errors -o main.out notmain.o main.o
echo notmain.o
nm -C -S notmain.o | grep MyTemplate
echo main.o
nm -C -S main.o | grep MyTemplate
Output:
notmain.o
0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
main.o
0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
From man nm
, we see that W
means weak symbol, which GCC chose because this is a template function. Weak symbol means that the compiled implicitly generated code for MyTemplate<int>
was compiled on both files.
The reason it doesn't blow up at link time with multiple definitions is that the linker accepts multiple weak definitions, and just picks one of them to put in the final executable.
The numbers in the output mean:
0000000000000000
: address within section. This zero is because templates are automatically put into their own section0000000000000017
: size of the code generated for themWe can see this a bit more clearly with:
objdump -S main.o | c++filt
which ends in:
Disassembly of section .text._ZN10MyTemplateIiE1fEi:
0000000000000000 <MyTemplate<int>::f(int)>:
0: f3 0f 1e fa endbr64
4: 55 push %rbp
5: 48 89 e5 mov %rsp,%rbp
8: 48 89 7d f8 mov %rdi,-0x8(%rbp)
c: 89 75 f4 mov %esi,-0xc(%rbp)
f: 8b 45 f4 mov -0xc(%rbp),%eax
12: 83 c0 01 add $0x1,%eax
15: 5d pop %rbp
16: c3 retq
and _ZN10MyTemplateIiE1fEi
is the mangled name of MyTemplate<int>::f(int)>
which c++filt
decided not to unmangle.
So we see that a separate section is generated for every single method instantiation, and that each of of them takes of course space in the object files.
Solutions to the object redefinition problem
This problem can be avoided by using explicit instantiation and either:
keep definition on hpp and add extern template
on hpp for types which are going to be explicitly instantiated.
As explained at: using extern template (C++11) extern template
prevents a completely defined template from being instantiated by compilation units, except for our explicit instantiation. This way, only our explicit instantiation will be defined in the final objects:
mytemplate.hpp
#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP
template<class T>
struct MyTemplate {
T f(T t) { return t + 1; }
};
extern template class MyTemplate<int>;
#endif
mytemplate.cpp
#include "mytemplate.hpp"
// Explicit instantiation required just for int.
template class MyTemplate<int>;
main.cpp
#include <iostream>
#include "mytemplate.hpp"
#include "notmain.hpp"
int main() {
std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}
notmain.cpp
#include "mytemplate.hpp"
#include "notmain.hpp"
int notmain() { return MyTemplate<int>().f(1); }
Downside:
int
, it seems that you are forced to add the include for it on the header, a forward declaration is not enough: extern template & incomplete types This increases header dependencies a bit.moving the definition on the cpp file, leave only declaration on hpp, i.e. modify the original example to be:
mytemplate.hpp
#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP
template<class T>
struct MyTemplate {
T f(T t);
};
#endif
mytemplate.cpp
#include "mytemplate.hpp"
template<class T>
T MyTemplate<T>::f(T t) { return t + 1; }
// Explicit instantiation.
template class MyTemplate<int>;
Downside: external projects can't use your template with their own types. Also you are forced to explicitly instantiate all types. But maybe this is an upside since then programmers won't forget.
keep definition on hpp and add extern template
on every includer:
mytemplate.cpp
#include "mytemplate.hpp"
// Explicit instantiation.
template class MyTemplate<int>;
main.cpp
#include <iostream>
#include "mytemplate.hpp"
#include "notmain.hpp"
// extern template declaration
extern template class MyTemplate<int>;
int main() {
std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}
notmain.cpp
#include "mytemplate.hpp"
#include "notmain.hpp"
// extern template declaration
extern template class MyTemplate<int>;
int notmain() { return MyTemplate<int>().f(1); }
Downside: all includers have to add the extern
to their CPP files, which programmers will likely forget to do.
With either of those solutions, nm
now contains:
notmain.o
U MyTemplate<int>::f(int)
main.o
U MyTemplate<int>::f(int)
mytemplate.o
0000000000000000 W MyTemplate<int>::f(int)
so we see have only mytemplate.o
has a compilation of MyTemplate<int>
as desired, while notmain.o
and main.o
don't because U
means undefined.
Remove definitions from included headers but also expose templates an external API in a header-only library
If your library is not header only, the extern template
method will work, since using projects will just link to your object file, which will contain the object of the explicit template instantiation.
However, for header only libraries, if you want to both:
then you can try one of the following:
mytemplate.hpp
: template definitionmytemplate_interface.hpp
: template declaration only matching the definitions from mytemplate_interface.hpp
, no definitionsmytemplate.cpp
: include mytemplate.hpp
and make explicit instantitationsmain.cpp
and everywhere else in the code base: include mytemplate_interface.hpp
, not mytemplate.hpp
mytemplate.hpp
: template definitionmytemplate_implementation.hpp
: includes mytemplate.hpp
and adds extern
to every class that will be instantiatedmytemplate.cpp
: include mytemplate.hpp
and make explicit instantitationsmain.cpp
and everywhere else in the code base: include mytemplate_implementation.hpp
, not mytemplate.hpp
Or even better perhaps for multiple headers: create an intf
/impl
folder inside your includes/
folder and use mytemplate.hpp
as the name always.
The mytemplate_interface.hpp
approach looks like this:
mytemplate.hpp
#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP
#include "mytemplate_interface.hpp"
template<class T>
T MyTemplate<T>::f(T t) { return t + 1; }
#endif
mytemplate_interface.hpp
#ifndef MYTEMPLATE_INTERFACE_HPP
#define MYTEMPLATE_INTERFACE_HPP
template<class T>
struct MyTemplate {
T f(T t);
};
#endif
mytemplate.cpp
#include "mytemplate.hpp"
// Explicit instantiation.
template class MyTemplate<int>;
main.cpp
#include <iostream>
#include "mytemplate_interface.hpp"
int main() {
std::cout << MyTemplate<int>().f(1) << std::endl;
}
Compile and run:
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o mytemplate.o mytemplate.cpp
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o main.o main.cpp
g++ -Wall -Wextra -std=c++11 -pedantic-errors -o main.out main.o mytemplate.o
Output:
2
Tested in Ubuntu 18.04.
C++20 modules
https://en.cppreference.com/w/cpp/language/modules
I think this feature will provide the best setup going forward as it becomes available, but I haven't checked it yet because it is not yet available on my GCC 9.2.1.
You will still have to do explicit instantiation to get the speedup/disk saving, but at least we will have a sane solution for "Remove definitions from included headers but also expose templates an external API" which does not require copying things around 100 times.
Expected usage (without the explicit insantiation, not sure what the exact syntax will be like, see: How to use template explicit instantiation with C++20 modules?) be something along:
helloworld.cpp
export module helloworld; // module declaration
import <iostream>; // import declaration
template<class T>
export void hello(T t) { // export declaration
std::cout << t << std::end;
}
main.cpp
import helloworld; // import declaration
int main() {
hello(1);
hello("world");
}
and then compilation mentioned at https://quuxplusone.github.io/blog/2019/11/07/modular-hello-world/
clang++ -std=c++2a -c helloworld.cpp -Xclang -emit-module-interface -o helloworld.pcm
clang++ -std=c++2a -c -o helloworld.o helloworld.cpp
clang++ -std=c++2a -fprebuilt-module-path=. -o main.out main.cpp helloworld.o
So from this we see that clang can extract the template interface + implementation into the magic helloworld.pcm
, which must contain some LLVM intermediate representation of the source: How are templates handled in C++ module system? which still allows for template specification to happen.
How to quickly analyze your build to see if it would gain a lot from template instantiation
So, you've got a complex project and you want to decide if template instantiation will bring significant gains without actually doing the full refactor?
The analysis below might help you decide, or at least select the most promising objects to refactor first while you experiment, by borrowing some ideas from: My C++ object file is too big
# List all weak symbols with size only, no address.
find . -name '*.o' | xargs -I{} nm -C --size-sort --radix d '{}' |
grep ' W ' > nm.log
# Sort by symbol size.
sort -k1 -n nm.log -o nm.sort.log
# Get a repetition count.
uniq -c nm.sort.log > nm.uniq.log
# Find the most repeated/largest objects.
sort -k1,2 -n nm.uniq.log -o nm.uniq.sort.log
# Find the objects that would give you the most gain after refactor.
# This gain is calculated as "(n_occurences - 1) * size" which is
# the size you would gain for keeping just a single instance.
# If you are going to refactor anything, you should start with the ones
# at the bottom of this list.
awk '{gain = ($1 - 1) * $2; print gain, $0}' nm.uniq.sort.log |
sort -k1 -n > nm.gains.log
# Total gain if you refactored everything.
awk 'START{sum=0}{sum += $1}END{print sum}' nm.gains.log
# Total size. The closer total gain above is to total size, the more
# you would gain from the refactor.
awk 'START{sum=0}{sum += $1}END{print sum}' nm.log
The dream: a template compiler cache
I think the ultimate solution would be if we could build with:
g++ --template-cache myfile.o file1.cpp
g++ --template-cache myfile.o file2.cpp
and then myfile.o
would automatically reuse previously compiled templates across files.
This would mean 0 extra effort on the programmers besides passing that extra CLI option to your build system.
A secondary bonus of explicit template instantiation: help IDEs list template instantiations
I've found that some IDEs such as Eclipse cannot resolve "a list of all template instantiations used".
So e.g., if you are inside a templated code, and you want to find possible values of the template, you would have to find the constructor usages one by one and deduce the possible types one by one.
But on Eclipse 2020-03 I can easily list explicitly instantiated templates by doing a Find all usages (Ctrl + Alt + G) search on the class name, which points me e.g. from:
template <class T>
struct AnimalTemplate {
T animal;
AnimalTemplate(T animal) : animal(animal) {}
std::string noise() {
return animal.noise();
}
};
to:
template class AnimalTemplate<Dog>;
Here's a demo: https://github.com/cirosantilli/ide-test-projects/blob/e1c7c6634f2d5cdeafd2bdc79bcfbb2057cb04c4/cpp/animal_template.hpp#L15
Another guerrila technique you could use outside of the IDE however would be to run nm -C
on the final executable and grep the template name:
nm -C main.out | grep AnimalTemplate
which directly points to the fact that Dog
was one of the instantiations:
0000000000004dac W AnimalTemplate<Dog>::noise[abi:cxx11]()
0000000000004d82 W AnimalTemplate<Dog>::AnimalTemplate(Dog)
0000000000004d82 W AnimalTemplate<Dog>::AnimalTemplate(Dog)
It depends on the compiler model - apparently there is the Borland model and the CFront model. And then it depends also on your intention - if your are writing a library, you might (as alluded above) explicitly instantiate the specializations you want.
The GNU c++ page discusses the models here https://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/Template-Instantiation.html.