A colleague recently revealed to me that a single source file of ours includes over 3,400 headers during compile time. We have over 1,000 translation units that get compiled
The output of gcc -w -H <file> might be useful (if you parse it and put some counts in). The -w is there to suppress all warnings, which would otherwise be awkward to separate from the header listing.
From the gcc docs:
-H
Print the name of each header file used, in addition to other normal activities. Each name is indented to show how deep in the #include stack it is. Precompiled header files are also printed, even if they are found to be invalid; an invalid precompiled header file is printed with '...x' and a valid one with '...!'.
The output looks like this:
. /usr/include/unistd.h
.. /usr/include/features.h
... /usr/include/bits/predefs.h
... /usr/include/sys/cdefs.h
.... /usr/include/bits/wordsize.h
... /usr/include/gnu/stubs.h
.... /usr/include/bits/wordsize.h
.... /usr/include/gnu/stubs-64.h
.. /usr/include/bits/posix_opt.h
.. /usr/include/bits/environments.h
... /usr/include/bits/wordsize.h
.. /usr/include/bits/types.h
... /usr/include/bits/wordsize.h
... /usr/include/bits/typesizes.h
.. /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.. /usr/include/bits/confname.h
.. /usr/include/getopt.h
. /usr/include/stdio.h
.. /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.. /usr/include/libio.h
... /usr/include/_G_config.h
.... /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.... /usr/include/wchar.h
... /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stdarg.h
.. /usr/include/bits/stdio_lim.h
.. /usr/include/bits/sys_errlist.h
Multiple include guards may be useful for:
/usr/include/bits/confname.h
/usr/include/bits/environments.h
/usr/include/bits/predefs.h
/usr/include/bits/stdio_lim.h
/usr/include/bits/sys_errlist.h
/usr/include/bits/typesizes.h
/usr/include/gnu/stubs-64.h
/usr/include/gnu/stubs.h
/usr/include/wchar.h
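The "parse it and put some counts in" step could look roughly like the following Python sketch. The function names (count_headers, report) are my own, and the only format assumption is the one visible in the listing above: a run of dots, a space, then the header path.

```python
import re
from collections import Counter

# A header line in the -H listing: dots give the include depth, then the path.
HEADER_LINE = re.compile(r"^(\.+) (.+)$")


def count_headers(h_output):
    """Return (times each header was included, deepest nesting seen per header)."""
    counts = Counter()
    max_depth = {}
    for line in h_output.splitlines():
        match = HEADER_LINE.match(line)
        if match is None:
            # Skips the "Multiple include guards may be useful for:" trailer
            # and anything else that is not a dotted header line.
            continue
        depth, path = len(match.group(1)), match.group(2)
        counts[path] += 1
        max_depth[path] = max(depth, max_depth.get(path, 0))
    return counts, max_depth


def report(h_output, top=10):
    """Print the most frequently included headers, most frequent first."""
    counts, max_depth = count_headers(h_output)
    print(f"{sum(counts.values())} inclusions of {len(counts)} distinct headers")
    for path, n in counts.most_common(top):
        print(f"{n:5d}  depth<={max_depth[path]}  {path}")
```

To feed it real data, something like subprocess.run(["gcc", "-w", "-H", "-fsyntax-only", "foo.c"], capture_output=True, text=True).stderr should work, since GCC writes the -H listing to standard error. The -fsyntax-only flag is my addition to skip code generation, and foo.c is a placeholder.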
Personally, I don't know of a tool that will say "Remove this file". It's a complex matter that depends on a lot of things. Looking at a tree of include statements is surely going to drive you nuts; it would drive me crazy, as well as ruin my eyes. There are better ways to reduce your compile times.
GCC has a flag (-save-temps) with which you can save intermediate files. These include the .ii files (.i for C sources), which are the output of the preprocessor (so before compilation proper). You can write a script to parse these and determine the weight/cost/size of what is included, as well as the dependency tree.
I wrote a Python script to do just this (publicly available here: https://gitlab.com/p_b_omta/gcc-include-analyzer).
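For illustration only (this is not the linked script, just a minimal sketch of the idea): the preprocessed file contains linemarkers of the form # <line> "<file>" <flags>, so each line of text can be attributed to the file named by the most recent marker. The function name lines_per_file and the choice to ignore blank lines are my own assumptions.

```python
import re
from collections import Counter

# GCC linemarker: '# <line number> "<file name>"' followed by optional numeric flags.
LINEMARKER = re.compile(r'^# \d+ "(.*)"(?: \d+)*\s*$')


def lines_per_file(preprocessed_text, skip_blank=True):
    """Count how many preprocessed lines each file contributed."""
    weights = Counter()
    current = None  # file named by the most recent linemarker
    for line in preprocessed_text.splitlines():
        marker = LINEMARKER.match(line)
        if marker:
            current = marker.group(1)
            continue
        if current is None or (skip_blank and not line.strip()):
            continue
        weights[current] += 1
    return weights


def print_heaviest(preprocessed_text, top=10):
    """Print the files that contribute the most lines, heaviest first."""
    for path, n in lines_per_file(preprocessed_text).most_common(top):
        print(f"{n:8d}  {path}")
```

After something like g++ -save-temps -c foo.cpp (foo.cpp being a placeholder), the text of the resulting foo.ii can be read and passed to print_heaviest. Line counts are only a rough proxy for compile cost; a template-heavy header can be expensive out of proportion to its size.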
"Large Scale C++ Software Design" by John Lakos had tools that extracted the compile-time dependencies among source files.
Unfortunately, their repository on Addison-Wesley's site is gone (along with AW's site itself), but I found a tarball here: http://prdownloads.sourceforge.net/introspector/LSC-rpkg-0.1.tgz?download
I found it useful several jobs ago, and it has the virtue of being free.
BTW, if you haven't read Lakos's book, it sounds like your project would benefit. (The current edition is a bit dated, but I hear that Lakos has another book coming out in 2012.)
GCC has a -M flag that will output a list of dependencies for a given source file. You could use that information to figure out which of your files have the most dependencies, which files are most depended on, etc.
Check out the man page for more information. There are several variants of -M.
If you are using gcc/g++, the -M or -MM option will output a line with the information you seek. (The former will include system headers while the latter will not. There are other variants; see the manual.)
$ gcc -M -c foo.c
foo.o: foo.c /usr/include/stdint.h /usr/include/features.h \
/usr/include/sys/cdefs.h /usr/include/bits/wordsize.h \
/usr/include/gnu/stubs.h /usr/include/gnu/stubs-64.h \
/usr/include/bits/wchar.h
You would need to remove the foo.o: foo.c at the beginning (and the trailing backslashes that continue the lines), but the rest is a list of all headers that the file depends on, so it would not be too hard to write a script to gather these and summarize them.
Of course this suggestion is only useful on Unix and only if nobody else has a better idea. :-)
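A script along those lines might look like the Python sketch below. It assumes the rule format shown in the -M output above (target, colon, prerequisites, backslash-newline continuations) and that the first prerequisite is the source file itself; paths containing spaces are not handled. The function names are hypothetical.

```python
from collections import Counter


def parse_make_rule(rule_text):
    """Split one rule printed by gcc -M into (target, list of prerequisites)."""
    joined = rule_text.replace("\\\n", " ")  # undo the line continuations
    target, _, prerequisites = joined.partition(":")
    return target.strip(), prerequisites.split()


def header_usage(rules):
    """Count, for each header, how many of the given rules depend on it."""
    usage = Counter()
    for rule_text in rules:
        _, prerequisites = parse_make_rule(rule_text)
        # Assumption: the first prerequisite is the source file, the rest are headers.
        usage.update(set(prerequisites[1:]))
    return usage


def headers_per_target(rules):
    """Map each target to the number of headers it depends on."""
    result = {}
    for rule_text in rules:
        target, prerequisites = parse_make_rule(rule_text)
        result[target] = len(prerequisites[1:])
    return result
```

Running gcc -M (or -MM) once per translation unit and passing the collected outputs to header_usage gives the most depended-on headers via most_common(), while headers_per_target shows which files pull in the most.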