Clang: How to get the macro name used for size of a constant size array declaration

Submitted by 邮差的信 on 2020-01-14 01:38:32

Question


TL;DR

How do I get the macro name used as the size in a constant-size array declaration, starting from CallExpr -> arg 0 -> DeclRefExpr?

Detailed Problem statement:

Recently I started working on a challenge that requires a source-to-source transformation tool for modifying specific function calls by adding an argument. Researching ways to achieve this introduced me to the amazing Clang toolset. I've been learning how to use the different tools provided in LibTooling to achieve my goal, but now I'm stuck on a problem and would like your help.

Consider the program below (a dummy version of my sources). My goal is to rewrite all calls to strcpy with the safe version strcpy_s and add an additional parameter to the new call, namely the maximum size of the destination buffer. So, for the program below, my refactored call would look like strcpy_s(inStr, STR_MAX, argv[1]);

I wrote a RecursiveASTVisitor class and inspect all function calls in the VisitCallExpr method. To get the maximum size of the destination argument, I get the VarDecl of the first argument and try to read the size from its ConstantArrayType. Since the source file is already preprocessed, I see 2049 as the size, but what I need is the macro STR_MAX in this case. How can I get that? (I create replacements with this information and apply them afterwards using RefactoringTool.)

#include <stdio.h>
#include <string.h>
#include <stdlib.h> 

#define STR_MAX 2049

int main(int argc, char **argv){
  char inStr[STR_MAX];

  if(argc>1){
    // The Clang tool must transform the call below into strncpy_s(inStr, STR_MAX, argv[1], strlen(argv[1]));
    strcpy(inStr, argv[1]);
  } else {
    printf("\n not enough args");
    return -1;
  }

  printf("got [%s]", inStr);

  return 0;
}

Answer 1:


As you correctly noticed, the source code is already preprocessed and all macros are expanded. Thus, the AST simply has an integer expression as the size of the array.

A little bit of information on source locations

NOTE: you can skip it and proceed straight to the solution below

The information about expanded macros is contained in the source locations of AST nodes and can usually be retrieved using Lexer (Clang's lexer and preprocessor are very tightly connected and can even be considered one entity). This is a bare-minimum interface and not very obvious to work with, but it is what it is.

Since you are looking for a way to get the original macro name for a replacement, you only need the spelling (i.e. the way it was written in the original source code). You don't need to care much about macro definitions, function-style macros and their arguments, etc.

Clang describes source extents in two different ways: as a token range (SourceRange, a pair of SourceLocations) and as a character range (CharSourceRange). The first one can be found pretty much everywhere in the AST. It refers to positions in terms of tokens. This explains why begin and end positions can be somewhat counterintuitive:

// clang::DeclRefExpr
//
//  ┌─ begin location
foo(VeryLongButDescriptiveVariableName);
//  └─ end location

// clang::BinaryOperator
//
//           ┌─ begin location
int Result = LHS + RHS;
//                 └─ end location

As you can see, both locations of a token range point to the beginning of a token: the end location is the start of the last token, not the position just past it. A character-based CharSourceRange, on the other hand, points directly to the characters.

So, in order to get the original text of the expression, we need to convert the SourceRange into a CharSourceRange and get the corresponding text from the source.
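To make the token-range versus character-range distinction concrete, here is a minimal sketch that uses only the standard library. TokenRange, CharRange, measureToken, toCharRange and textOf are hypothetical stand-ins invented for this illustration; they are not Clang types or functions, and the toy token measurement only understands identifiers, numbers and single-character punctuators.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in (NOT a Clang type): a token range stores the offset
// of its first token and the offset where its LAST token starts.
struct TokenRange {
  std::size_t Begin;
  std::size_t LastTokenBegin;
};

// Hypothetical stand-in (NOT a Clang type): a half-open character range
// [Begin, End).
struct CharRange {
  std::size_t Begin;
  std::size_t End;
};

// Toy token measurement: a run of identifier/number characters, or a single
// punctuator. A real lexer would re-lex the buffer to measure the token.
std::size_t measureToken(const std::string &Src, std::size_t Pos) {
  assert(Pos < Src.size());
  auto IsWord = [](unsigned char C) {
    return std::isalnum(C) != 0 || C == '_';
  };
  if (!IsWord(static_cast<unsigned char>(Src[Pos])))
    return 1;
  std::size_t Len = 0;
  while (Pos + Len < Src.size() &&
         IsWord(static_cast<unsigned char>(Src[Pos + Len])))
    ++Len;
  return Len;
}

// Converts a token range into a character range by moving the end past the
// last token.
CharRange toCharRange(const std::string &Src, TokenRange R) {
  assert(R.Begin <= R.LastTokenBegin);
  return {R.Begin, R.LastTokenBegin + measureToken(Src, R.LastTokenBegin)};
}

// Returns the source text covered by a token range.
std::string textOf(const std::string &Src, TokenRange R) {
  CharRange C = toCharRange(Src, R);
  return Src.substr(C.Begin, C.End - C.Begin);
}
```

For the BinaryOperator in the diagram above, the token range starts at LHS (offset 13) and its end location is the start of RHS (offset 19); measuring that last token extends the end to offset 22, so the extracted text is the whole expression LHS + RHS.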

The solution

I've modified your example to show other cases of macro expansions as well:

#define STR_MAX 2049
#define BAR(X) X

int main() {
  char inStrDef[STR_MAX];
  char inStrFunc[BAR(2049)];
  char inStrFuncNested[BAR(BAR(STR_MAX))];
}

The following code:

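#include "clang/AST/ASTContext.h"
#include "clang/AST/Decl.h"
#include "clang/AST/TypeLoc.h"
#include "clang/Lex/Lexer.h"
#include "llvm/Support/raw_ostream.h"

// Assumed to be in scope:
// clang::VarDecl *VD;
// clang::ASTContext *Context;
auto &SM = Context->getSourceManager();
auto &LO = Context->getLangOpts();
// The TypeLoc keeps the type as it was written, including source locations
auto DeclarationType = VD->getTypeSourceInfo()->getTypeLoc();

if (auto ArrayType = DeclarationType.getAs<clang::ConstantArrayTypeLoc>()) {
  // The size expression as it appears between the brackets
  auto *Size = ArrayType.getSizeExpr();

  auto CharRange = clang::Lexer::getAsCharRange(Size->getSourceRange(), SM, LO);
  // Lexer gets text for [start, end) and we want it to grab the end as well
  CharRange.setEnd(CharRange.getEnd().getLocWithOffset(1));

  auto StringRep = clang::Lexer::getSourceText(CharRange, SM, LO);
  llvm::errs() << StringRep << "\n";
}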

produces this output for the snippet:

STR_MAX
BAR(2049)
BAR(BAR(STR_MAX))
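Once the spelling is recovered, it can be spliced into the rewritten call from the question. The sketch below is a plain-string model using only the standard library; makeSafeCopyCall and applyReplacement are hypothetical helpers invented for this illustration. The (offset, length, new text) triple is the same information that a clang::tooling::Replacement carries, but no Clang API is used here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: formats the rewritten call from source spellings.
// SizeSpelling is the text recovered from the array declaration, e.g. STR_MAX.
std::string makeSafeCopyCall(const std::string &Dest,
                             const std::string &SizeSpelling,
                             const std::string &Src) {
  return "strcpy_s(" + Dest + ", " + SizeSpelling + ", " + Src + ")";
}

// Hypothetical helper: replaces Length characters starting at Offset with
// NewText and returns the modified copy of the code.
std::string applyReplacement(std::string Code, std::size_t Offset,
                             std::size_t Length, const std::string &NewText) {
  assert(Offset + Length <= Code.size());
  Code.replace(Offset, Length, NewText);
  return Code;
}
```

For example, replacing the 22 characters of strcpy(inStr, argv[1]) with makeSafeCopyCall("inStr", "STR_MAX", "argv[1]") turns the original statement into strcpy_s(inStr, STR_MAX, argv[1]); while keeping the macro name instead of the expanded 2049.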

I hope this information is helpful. Happy hacking with Clang!



Source: https://stackoverflow.com/questions/56512050/clang-how-to-get-the-macro-name-used-for-size-of-a-constant-size-array-declarat
