Question
I have a requirement to build an automated system to parse a C++ .h file with a lot of #define statements in it and do something with the value that each #define works out to. The .h file has a lot of other junk in it besides the #define statements.
The objective is to create a key-value list, where the keys are all the keywords defined by the #define statements and the values are the evaluations of the macros which correspond to the definitions. The #defines define the keywords with a series of nested macros that ultimately resolve to compile-time integer constants. There are some that do not resolve to compile-time integer constants, and these must be skipped.
The .h file will evolve over time, so the tool cannot be a long hardcoded program which instantiates a variable to be equal to each keyword. I have no control over the contents of the .h file. The only guarantees are that it can be built with a standard C++ compiler, and that more #defines will be added but never removed. The macro formulas may change at any time.
The options I see for this are:
- Implement a partial (or hook into an existing) C++ compiler and intercept the value of the macros during the preprocessor step.
- Use regexes to dynamically build a source file which will consume all the macros currently defined, then compile and execute the source file to get the evaluated form of all the macros. Somehow (?) skip the macros which do not evaluate to compile-time integer constants. (Also, not sure if regex is expressive enough to capture all possible multi-line macro definitions)
Both of these approaches would add substantial complexity and fragility to the build process for this project, which I would like to avoid. Is there a better way to evaluate all the #define macros in a C++ .h file?
Below is an example of what I am looking to parse:
#ifndef Constants_h
#define Constants_h
namespace Foo
{
#define MAKE_CONSTANT(A, B) (A | (B << 4))
#define MAGIC_NUMBER_BASE 40
#define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
#define MORE_MAGIC_1 345
#define MORE_MAGIC_2 65
// Other stuff...
#define CONSTANT_1 MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)
#define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
// etc...
#define SKIP_CONSTANT "What?"
// More CONSTANT_N mixed with more other stuff and constants which do
// not resolve to compile-time integers and must be skipped
}
#endif // Constants_h
What I need to get out of this is the names and evaluations of all the defines which resolve to compile-time integer constants. In this case, for the defines shown it would be
MAGIC_NUMBER_BASE 40
MAGIC_NUMBER 42
MORE_MAGIC_1 345
MORE_MAGIC_2 65
CONSTANT_1 1887
CONSTANT_2 -42
It doesn't really matter what format this output is in as long as I can work with it as a list of key-value pairs further down the pipe.
Answer 1:
One approach is to write a "program generator" that generates a program (the printDefines program) consisting of statements like std::cout << "MAGIC_NUMBER" << " " << (MAGIC_NUMBER_BASE + 0x2) << std::endl;. Compiling and executing such statements resolves the respective macros and prints their values.
The list of macros in a header file can be obtained by running g++ with the -dM -E options. Feeding the "program generator" with this list of #defines generates a source file (defines.cpp in the script below) with all the required cout statements. Compiling and executing the generated printDefines program then yields the final output. It resolves all the macros, including those that themselves use other macros.
See the following shell script and the following program generator code that together implement this approach:
Script printing the values of #define-statements in "someHeaderfile.h":
# printDefines.sh
g++ -std=c++11 -dM -E someHeaderfile.h > defines.txt
./generateDefinesCpp someHeaderfile.h defines.txt > defines.cpp
g++ -std=c++11 -o defines.o defines.cpp
./defines.o
Code of program generator "generateDefinesCpp":
#include <cstdio>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
using std::cout;
using std::endl;
/*
 * Argument 1: name of the header file to scan (it is #included by the generated program)
 * Argument 2: name of the file holding the output of "g++ -dM -E" (defines.txt)
 * The generated C++ source is written to stdout.
 */
int main(int argc, char* argv[])
{
    if (argc < 3) {
        std::cerr << "usage: generateDefinesCpp <header> <defines-file>" << endl;
        return 1;
    }
    cout << "#include<iostream>" << endl;
    cout << "#include<stdio.h>" << endl;
    cout << "#include \"" << argv[1] << "\"" << endl;
    cout << "int main() {" << endl;
    std::ifstream definesFile(argv[2], std::ios::in);
    std::string buffer;
    char macroName[1000];
    int macroValuePos = 0;
    while (getline(definesFile, buffer)) {
        const char *bufferCStr = buffer.c_str();
        // %999s reads the macro name (bounded), %n stores the offset where the macro body starts
        if (std::sscanf(bufferCStr, "#define %999s %n", macroName, &macroValuePos) == 1) {
            const char* macroValue = bufferCStr + macroValuePos;
            // Skip names with a leading underscore, function-like macros and empty definitions
            if (macroName[0] != '_' && std::strchr(macroName, '(') == NULL && *macroValue) {
                cout << "std::cout << \"" << macroName << "\" << \" \" << (" << macroValue << ") << std::endl;" << std::endl;
            }
        }
    }
    cout << "return 0; }" << endl;
    return 0;
}
The approach could be optimised so that the intermediate files defines.txt and defines.cpp are not necessary; for demonstration purposes, however, they are helpful. When applied to your header file, the contents of defines.txt and defines.cpp will be as follows:
defines.txt:
#define CONSTANT_1 MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)
#define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
#define Constants_h
#define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
#define MAGIC_NUMBER_BASE 40
#define MAKE_CONSTANT(A,B) (A | (B << 4))
#define MORE_MAGIC_1 345
#define MORE_MAGIC_2 65
#define OBJC_NEW_PROPERTIES 1
#define SKIP_CONSTANT "What?"
#define _LP64 1
#define __APPLE_CC__ 6000
#define __APPLE__ 1
#define __ATOMIC_ACQUIRE 2
#define __ATOMIC_ACQ_REL 4
...
defines.cpp:
#include<iostream>
#include<stdio.h>
#include "someHeaderfile.h"
int main() {
std::cout << "CONSTANT_1" << " " << (MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)) << std::endl;
std::cout << "CONSTANT_2" << " " << (MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)) << std::endl;
std::cout << "MAGIC_NUMBER" << " " << (MAGIC_NUMBER_BASE + 0x2) << std::endl;
std::cout << "MAGIC_NUMBER_BASE" << " " << (40) << std::endl;
std::cout << "MORE_MAGIC_1" << " " << (345) << std::endl;
std::cout << "MORE_MAGIC_2" << " " << (65) << std::endl;
std::cout << "OBJC_NEW_PROPERTIES" << " " << (1) << std::endl;
std::cout << "SKIP_CONSTANT" << " " << ("What?") << std::endl;
return 0; }
And the output of executing defines.o is then:
CONSTANT_1 1887
CONSTANT_2 -9
MAGIC_NUMBER 42
MAGIC_NUMBER_BASE 40
MORE_MAGIC_1 345
MORE_MAGIC_2 65
OBJC_NEW_PROPERTIES 1
SKIP_CONSTANT What?
Answer 2:
Here is a concept, based on assumptions from a clarification comment.
- only one header
- no includes
- no dependency on the including code file
- no dependency on previously included headers
- no dependency on include order
Main requirements otherwise:
- do not risk influence on binary build process (being the part which makes the actual software product)
- do not try to emulate the binary build compiler/parser
How to:
- make a copy
- include it from a dedicated code file, which only contains #include "copy.h", or directly preprocess the header (this just feels weirdly against my habits)
- delete everything except preprocessor directives and pragmas, paying attention to line continuation
- replace all "#define"s by "HaPoDefine", except one (e.g. the first)
- repeat
  - preprocess the including code file (most compilers have a switch to do this)
  - save the output
  - turn another "HaPoDefine" back into "#define"
- until no "HaPoDefine" is left
- harvest all macro expansions from the deltas of intermediate saves
- discard everything which is not of relevance
- since the final actual numerical value is most likely a result of the compiler (not the preprocessor), use a tool like bash's "expr" to calculate values for human-eye readability, being careful not to risk differences to the binary build process
- use some regex magic to achieve any desired format
Answer 3:
Can you use g++ or gcc with the -E option, and work with that output?
-E Stop after the preprocessing stage; do not run the compiler proper. The output is in the form of preprocessed source code, which is sent to the standard output. Input files which don't require preprocessing are ignored.
With this, I imagine:
- Create the list of all #define keys from the source
- Run the appropriate command below against the source file(s), and let the GNU preprocessor do its thing
- Grab the preprocessed result from stdout, filter to take only those in integer form, and output it in whatever format you want to use to represent key/value pairs
One of these two commands:
gcc -E myFile.c
g++ -E myFile.cpp
https://gcc.gnu.org/onlinedocs/gcc-2.95.2/gcc_2.html
https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html
Source: https://stackoverflow.com/questions/42844545/evaluate-all-macros-in-a-c-header-file