Question
I've written a test program (parse_ast.c) that parses a C source file (tt.c) to see how libclang works. The output is the hierarchical structure of the AST.
Here is the test file:
/* tt.c */ // line 1
#include <unistd.h>
#include <stdio.h>

typedef ssize_t (*write_fn_t)(int, const void *, size_t);

void indirect_write(write_fn_t write_fn) { // line 7
    (*write_fn)(1, "indirect call\n", 14);
}

void direct_write() { // line 11
    write(1, "direct call\n", 12); // line 12 missing in the ast?
}

int main() { // line 15
    direct_write();
    indirect_write(write); // line 17 missing in the ast?

    return 0;
}
The output shows something like this:
...
...
inclusion directive at tt.c (2, 1) to (2, 20)
inclusion directive at tt.c (3, 1) to (3, 19)
TypedefDecl at tt.c (5, 1) to (5, 57)
TypeRef at tt.c (5, 9) to (5, 16)
ParmDecl at tt.c (5, 31) to (5, 35)
ParmDecl at tt.c (5, 36) to (5, 49)
ParmDecl at tt.c (5, 50) to (5, 56)
FunctionDecl at tt.c (7, 1) to (9, 2)
ParmDecl at tt.c (7, 21) to (7, 40)
TypeRef at tt.c (7, 21) to (7, 31)
CompoundStmt at tt.c (7, 42) to (9, 2)
CallExpr at tt.c (8, 5) to (8, 42)
UnexposedExpr at tt.c (8, 5) to (8, 16)
ParenExpr at tt.c (8, 5) to (8, 16)
UnaryOperator at tt.c (8, 6) to (8, 15)
UnexposedExpr at tt.c (8, 7) to (8, 15)
DeclRefExpr at tt.c (8, 7) to (8, 15)
IntegerLiteral at tt.c (8, 17) to (8, 18)
UnexposedExpr at tt.c (8, 20) to (8, 37)
UnexposedExpr at tt.c (8, 20) to (8, 37)
StringLiteral at tt.c (8, 20) to (8, 37)
IntegerLiteral at tt.c (8, 39) to (8, 41)
FunctionDecl at tt.c (11, 1) to (13, 2)
CompoundStmt at tt.c (11, 21) to (13, 2) <- XXX no line 12?
FunctionDecl at tt.c (15, 1) to (20, 2)
CompoundStmt at tt.c (15, 12) to (20, 2)
CallExpr at tt.c (16, 5) to (16, 19)
UnexposedExpr at tt.c (16, 5) to (16, 17)
DeclRefExpr at tt.c (16, 5) to (16, 17) <- XXX no line 17?
ReturnStmt at tt.c (19, 5) to (19, 13)
IntegerLiteral at tt.c (19, 12) to (19, 13)
We can see that the three functions (indirect_write at line 7, direct_write at line 11, main at line 15) are there, and most statements can be found in the AST, but I cannot find anything that represents the statements on line 12 and line 17. Does anyone know the reason?
I'm on Debian squeeze (kernel 2.6.32), and tested with both clang 3.1 and 3.2 (compiled from source).
Here is the program parse_ast.c:
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <clang-c/Index.h>

enum CXChildVisitResult visit_fn(CXCursor cr, CXCursor parent,
                                 CXClientData client_data) {
    unsigned depth;
    unsigned line, column, offset;
    enum CXCursorKind kind;
    CXSourceRange extent;
    CXSourceLocation start, end;
    CXString kind_spelling, filename;
    CXFile file;

    (void)parent; // unused
    // the depth travels in the client_data pointer; cast through uintptr_t
    // so the conversion is valid where pointers are wider than unsigned
    depth = (unsigned)(uintptr_t)client_data;

    // print cursor kind
    kind = clang_getCursorKind(cr);
    kind_spelling = clang_getCursorKindSpelling(kind);
    fprintf(stdout, "%*s%s at", (int)depth, " ",
            clang_getCString(kind_spelling));
    clang_disposeString(kind_spelling);

    // get extent
    extent = clang_getCursorExtent(cr);
    start = clang_getRangeStart(extent);
    end = clang_getRangeEnd(extent);

    // print start position
    clang_getExpansionLocation(start, &file, &line, &column, &offset);
    filename = clang_getFileName(file);
    fprintf(stdout, " %s (%u, %u) to", clang_getCString(filename), line,
            column);
    clang_disposeString(filename);

    // print end position
    clang_getExpansionLocation(end, &file, &line, &column, &offset);
    fprintf(stdout, " (%u, %u)\n", line, column);

    // recurse into the children, one level deeper
    clang_visitChildren(cr, visit_fn, (CXClientData)(uintptr_t)(depth + 1));
    return CXChildVisit_Continue;
}

int main(int argc, const char * const *argv) {
    CXIndex Index = clang_createIndex(0, 0);
    CXTranslationUnit TU = clang_parseTranslationUnit(Index, NULL,
        argv, argc, 0, 0, CXTranslationUnit_DetailedPreprocessingRecord);
    clang_visitChildren(clang_getTranslationUnitCursor(TU),
                        visit_fn, 0);
    clang_disposeTranslationUnit(TU);
    clang_disposeIndex(Index);
    return 0;
}
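The visitor in parse_ast.c carries the nesting depth inside the opaque client_data pointer and hands depth + 1 to each recursive call. Below is a minimal sketch of that pattern in isolation, without libclang. Node, ClientData, Visitor, visit_children, print_node and max_depth_seen are hypothetical names made up for this illustration, not part of the libclang API. It assumes a platform that provides uintptr_t, and casts through it so the pointer/integer round trip does not truncate silently on 64-bit systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for CXCursor and CXClientData; not libclang types. */
typedef struct Node {
    const char *kind;
    const struct Node *children;
    size_t n_children;
} Node;

typedef void *ClientData;
typedef void (*Visitor)(const Node *node, ClientData data);

/* Calls visit once for each direct child of parent, passing data through. */
void visit_children(const Node *parent, Visitor visit, ClientData data) {
    for (size_t i = 0; i < parent->n_children; ++i)
        visit(&parent->children[i], data);
}

/* Deepest level reached so far, kept only so the walk can be inspected. */
unsigned max_depth_seen = 0;

/* Prints one node indented by its depth, then recurses one level deeper. */
void print_node(const Node *node, ClientData data) {
    assert(node != NULL);
    /* pointer -> uintptr_t -> unsigned: the depth was stored as an integer */
    unsigned depth = (unsigned)(uintptr_t)data;
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    printf("%*s%s\n", (int)(2 * depth), "", node->kind);
    visit_children(node, print_node, (ClientData)(uintptr_t)(depth + 1));
}
```

A walk would be started with visit_children(&root, print_node, (ClientData)(uintptr_t)0), so the direct children of root are printed at depth 0. An alternative design is to pass a pointer to a small struct holding the depth, which avoids integer/pointer casts entirely at the cost of a little more bookkeeping.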
Update:
The problem was due to a missing header file, stddef.h. It was answered on the libclang mailing list: http://clang-developers.42468.n3.nabble.com/libclang-missing-some-statements-in-the-AST-td4029641.html
Answer 1:
Check the diagnostics generated by clang_parseTranslationUnit(): even if errors are encountered, an AST is generated, but clearly it can't be guaranteed to be meaningful.
I found that commenting out the #include lines resulted in compilation errors, but an AST was generated which resembled yours (specifically, line 17 was missing).
Replacing the #include lines with typedefs for size_t and ssize_t (as int) resulted in a compilation warning about the implicit declaration of write(), but the AST included line 17.
Hence I assume there's a problem in your header files, which the diagnostics should reveal. Diagnostics can be retrieved by e.g.
for (unsigned I = 0, N = clang_getNumDiagnostics(TU); I != N; ++I) {
    CXDiagnostic Diag = clang_getDiagnostic(TU, I);
    CXString String = clang_formatDiagnostic(Diag, clang_defaultDiagnosticDisplayOptions());
    fprintf(stderr, "%s\n", clang_getCString(String));
    clang_disposeString(String);
    clang_disposeDiagnostic(Diag); // release the diagnostic once printed
}
Answer 2:
I'm using libclang to parse and optimize C code, but when I parse C source files with your code I can't see the CXCursor_BinaryOperator inside a CompoundStmt.
For example:
void OCTS_C_TimerMiliseconds_reset_Timers(OCTS_outC_C_TimerMiliseconds_Timers *outC)
{
    outC->init = kcg_true;
    /* 1 */ OCTS_Sign_INT_reset_Math(&outC->_1_Context_1);
    /* 2 */ OCTS_Sign_INT_reset_Math(&outC->Context_2);
    /* 1 */ OCTS_FallingEdge_reset_Edge(&outC->Context_1);
}
The result is:
FunctionDecl at s.cpp (10, 1) to (17, 2) OCTS_C_TimerMiliseconds_reset_Timers
ParmDecl at s.cpp (11, 3) to (11, 44) outC
TypeRef at s.cpp (11, 3) to (11, 38) OCTS_outC_C_TimerMiliseconds_Timers
CompoundStmt at s.cpp (12, 1) to (17, 2)
Source: https://stackoverflow.com/questions/14250754/libclang-missing-some-statements-in-the-ast