How to find out whether a member function is const or volatile with libclang?

Submitted by 一个人想着一个人 on 2019-12-03 06:42:08
user1071136

I can think of two approaches:

Using the libclang lexer

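The code that appears in this SO answer works for me; it uses the libclang tokenizer to break a method declaration apart, and then records any keyword found outside the parameter-list parentheses. Note that this also picks up keywords in front of the method name, such as void in the return type.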

It does not access the AST of the code, and as far as I can tell doesn't involve the parser at all. If you are sure the code you investigate is proper C++, I believe this approach is safe.

Disadvantages: This solution does not appear to take preprocessor directives into account, so the code has to be preprocessed first (e.g., passed through cpp).
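To make the limitation concrete, here is a minimal sketch of the same scan run over plain token spellings instead of CXToken values, so it can be tried without libclang. IsKeyword and KeywordsOutsideParens are hypothetical names, and the keyword list is a deliberately tiny stand-in for libclang's CXToken_Keyword check.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for clang_getTokenKind(...) == CXToken_Keyword.
// Only a handful of keywords are listed; libclang knows the full set.
bool IsKeyword(const std::string& spelling) {
  static const std::set<std::string> keywords = {
      "const", "volatile", "void", "int", "virtual", "static", "inline"};
  return keywords.count(spelling) != 0;
}

// Same scan as GetMethodQualifiers, but over plain token spellings.
std::set<std::string> KeywordsOutsideParens(
    const std::vector<std::string>& tokens) {
  std::set<std::string> found;
  bool insideParens = false;
  for (const std::string& token : tokens) {
    if (token == "(") {
      insideParens = true;
    } else if (token == ")") {
      insideParens = false;
    } else if (token == "{" || token == ";") {
      break;  // end of the declarator, stop scanning
    } else if (!insideParens && IsKeyword(token)) {
      found.insert(token);
    }
  }
  return found;
}
```

Feeding it the tokens of "void qux() BLA volatile;" yields void and volatile but not const: BLA is an identifier to the lexer, so a qualifier hidden behind a macro is missed unless the file is preprocessed first.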

Example code (the file to parse must be the first argument to your program, e.g. ./a.out bla.cpp):

#include "clang-c/Index.h"
#include <string>
#include <set>
#include <iostream>

std::string GetClangString(CXString str)
{
  // Copy the text out first, then dispose of the CXString on every path
  // (the original version leaked it when the C string was NULL).
  const char* tmp = clang_getCString(str);
  std::string translated = (tmp == NULL) ? "" : tmp;
  clang_disposeString(str);
  return translated;
}

void GetMethodQualifiers(CXTranslationUnit translationUnit,
                         std::set<std::string>& qualifiers,
                         CXCursor cursor) {
  qualifiers.clear();

  CXSourceRange range = clang_getCursorExtent(cursor);
  CXToken* tokens;
  unsigned int numTokens;
  clang_tokenize(translationUnit, range, &tokens, &numTokens);

  bool insideBrackets = false;
  for (unsigned int i = 0; i < numTokens; i++) {
    std::string token = GetClangString(clang_getTokenSpelling(translationUnit, tokens[i]));
    if (token == "(") {
      insideBrackets = true;
    } else if (token == "{" || token == ";") {
      break;
    } else if (token == ")") {
      insideBrackets = false;
    } else if (clang_getTokenKind(tokens[i]) == CXToken_Keyword && 
             !insideBrackets) {
      qualifiers.insert(token);
    }
  }

  clang_disposeTokens(translationUnit, tokens, numTokens);
}

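int main(int argc, char *argv[]) {
  if (argc < 2) {
    std::cerr << "usage: " << argv[0] << " <file.cpp>" << std::endl;
    return 1;
  }
  CXIndex Index = clang_createIndex(0, 0);
  // argv is forwarded as the compiler command line, so argv[1] names the file to parse.
  CXTranslationUnit TU = clang_parseTranslationUnit(Index, 0,
          argv, argc, 0, 0, CXTranslationUnit_None);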

  // Set the file you're interested in, and the code location:
  CXFile file = clang_getFile(TU, argv[1]);
  int line = 5;
  int column = 6;
  CXSourceLocation location = clang_getLocation(TU, file, line, column);
  CXCursor cursor = clang_getCursor(TU, location);

  std::set<std::string> qualifiers;
  GetMethodQualifiers(TU, qualifiers, cursor);

  for (std::set<std::string>::const_iterator i = qualifiers.begin(); i != qualifiers.end(); ++i) {
    std::cout << *i << std::endl;
  }

  clang_disposeTranslationUnit(TU);
  clang_disposeIndex(Index);
  return 0;
}

Using libclang's Unified Symbol Resolution (USR)

This approach involves using the parser itself, and extracting qualifier information from the AST.

Advantages: Seems to work for code with preprocessor directives, at least for simple cases.

Disadvantages: My solution parses the USR, which is undocumented, and might change in the future. Still, it's easy to write a unit-test to guard against that.

Take a look at $(CLANG_SRC)/tools/libclang/CIndexUSRs.cpp. It contains the code that generates a USR, and therefore the information required to parse the USR string. See specifically lines 523-529 (in LLVM 3.1's source downloaded from www.llvm.org) for the qualifier part.

Add the following function somewhere:

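void parseUsrString(const std::string& usrString, bool* isVolatile, bool* isConst, bool* isRestrict) {
  // Assumes the first '#' in the USR is immediately followed by the qualifier
  // character, as in the parameterless example below; a USR for a method with
  // parameters may contain earlier '#' characters.
  size_t hashLocation = usrString.find("#");
  if (hashLocation == std::string::npos || hashLocation == usrString.length() - 1) {
    *isVolatile = *isConst = *isRestrict = false;
    return;
  }
  // The qualifier bits are read from the character right after '#'.
  int x = usrString[hashLocation + 1];

  *isConst = (x & 0x1) != 0;
  *isVolatile = (x & 0x4) != 0;
  *isRestrict = (x & 0x2) != 0;
}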

and in main() (printf additionally needs #include <cstdio>),

CXString usr = clang_getCursorUSR(cursor);
const char *usr_string = clang_getCString(usr);
std::cout << usr_string << "\n";
bool isVolatile, isConst, isRestrict;
parseUsrString(usr_string, &isVolatile, &isConst, &isRestrict);
printf("restrict, volatile, const: %d %d %d\n", isRestrict, isVolatile, isConst);
clang_disposeString(usr);

Running on Foo::qux() from

#define BLA const

class Foo {
public:
    void bar() const;
    void baz() volatile;
    void qux() BLA volatile;
};

produces the expected result of

c:@C@Foo@F@qux#5
restrict, volatile, const: 0 1 1

Caveat: you might have noticed that libclang's source suggests my code should be isVolatile = x & 0x2 and not 0x4, so you may need to replace 0x4 with 0x2. It's possible my implementation (OS X) has the two swapped.
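As a worked example of the bit test, the trailing '5' in c:@C@Foo@F@qux#5 is ASCII 0x35. Since '0' is 0x30, the low three bits of the character equal the digit itself, and 5 = 4 + 1 means the 0x4 and 0x1 bits are set. The sketch below decodes that one character with the same masks as parseUsrString. DecodeQualifierChar and the three constants are hypothetical names, and the bit layout is the assumption discussed in the caveat, kept in one place so it is easy to swap.

```cpp
#include <string>

struct MethodQualifiers {
  bool isConst;
  bool isRestrict;
  bool isVolatile;
};

// Assumed bit layout, matching the masks used in parseUsrString above.
// Swap kRestrictBit and kVolatileBit if your libclang behaves differently.
const int kConstBit = 0x1;
const int kRestrictBit = 0x2;
const int kVolatileBit = 0x4;

// Decodes the single character that follows '#' in the USR.
// Anything outside '0'..'7' is treated as "no qualifiers".
MethodQualifiers DecodeQualifierChar(char c) {
  MethodQualifiers q = {false, false, false};
  if (c < '0' || c > '7') {
    return q;
  }
  int bits = c - '0';
  q.isConst = (bits & kConstBit) != 0;
  q.isRestrict = (bits & kRestrictBit) != 0;
  q.isVolatile = (bits & kVolatileBit) != 0;
  return q;
}
```

Calling DecodeQualifierChar('5') reports const and volatile, which agrees with the output shown for Foo::qux(). A few calls like this, pinned to USRs produced by your own libclang build, would serve as the unit test mentioned earlier for guarding against format changes.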
