How to find out whether a member function is const or volatile with libclang?

南方客 2021-02-05 19:22

I have an instance of CXCursor of kind CXCursor_CXXMethod. I want to find out if the function is const or volatile, for examp

1 Answer
  •  囚心锁ツ
    2021-02-05 20:06

    I can think of two approaches:

    Using the libclang lexer

    The code which appears in this SO answer works for me; it uses the libclang tokenizer to break a method declaration apart, and then records any keywords outside of the method parentheses.

    It does not access the AST of the code, and as far as I can tell doesn't involve the parser at all. If you are sure the code you investigate is proper C++, I believe this approach is safe.

    Disadvantages: This solution does not appear to take preprocessor directives into account, so the code has to be preprocessed first (e.g., passed through cpp).

    Example code (the file to parse must be the first argument to your program, e.g. ./a.out bla.cpp):

    #include "clang-c/Index.h"
    #include <iostream>
    #include <set>
    #include <string>
    
    // Copies a CXString into a std::string and releases the CXString.
    std::string GetClangString(CXString str)
    {
      const char* tmp = clang_getCString(str);
      std::string translated = (tmp == NULL) ? "" : tmp;
      clang_disposeString(str);  // dispose even when the C string is NULL
      return translated;
    }
    
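    // Collects every keyword that appears outside the parameter list of the
    // declaration under `cursor` (this includes return-type keywords).
    void GetMethodQualifiers(CXTranslationUnit translationUnit,
                             std::set<std::string>& qualifiers,
                             CXCursor cursor) {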
      qualifiers.clear();
    
      CXSourceRange range = clang_getCursorExtent(cursor);
      CXToken* tokens;
      unsigned int numTokens;
      clang_tokenize(translationUnit, range, &tokens, &numTokens);
    
      bool insideBrackets = false;
      for (unsigned int i = 0; i < numTokens; i++) {
        std::string token = GetClangString(clang_getTokenSpelling(translationUnit, tokens[i]));
        if (token == "(") {
          insideBrackets = true;
        } else if (token == "{" || token == ";") {
          break;
        } else if (token == ")") {
          insideBrackets = false;
        } else if (clang_getTokenKind(tokens[i]) == CXToken_Keyword && 
                 !insideBrackets) {
          qualifiers.insert(token);
        }
      }
    
      clang_disposeTokens(translationUnit, tokens, numTokens);
    }
    
    int main(int argc, char *argv[]) {
      CXIndex Index = clang_createIndex(0, 0);
      CXTranslationUnit TU = clang_parseTranslationUnit(Index, 0, 
              argv, argc, 0, 0, CXTranslationUnit_None);
    
      // Set the file you're interested in, and the code location:
      CXFile file = clang_getFile(TU, argv[1]);
      int line = 5;
      int column = 6;
      CXSourceLocation location = clang_getLocation(TU, file, line, column);
      CXCursor cursor = clang_getCursor(TU, location);
    
      std::set<std::string> qualifiers;
      GetMethodQualifiers(TU, qualifiers, cursor);
    
      for (std::set<std::string>::const_iterator i = qualifiers.begin(); i != qualifiers.end(); ++i) {
        std::cout << *i << std::endl;
      }
    
      clang_disposeTranslationUnit(TU);
      clang_disposeIndex(Index);
      return 0;
    }
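
    To make the preprocessor limitation concrete, here is a minimal sketch of the same scan that runs without libclang. FakeToken, collectKeywordsOutsideParens, quxTokens and checkQuxScan are hypothetical names introduced only for this illustration, and the token stream is what I assume an unpreprocessed lexer would produce for "void qux() BLA volatile;" (C++11 assumed).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a libclang token: its spelling plus whether the
// lexer classified it as a keyword (CXToken_Keyword in the real API).
struct FakeToken {
  std::string spelling;
  bool isKeyword;
};

// Same scan as GetMethodQualifiers: collect keywords that are not between
// "(" and ")", stopping at the body or the terminating semicolon.
std::set<std::string> collectKeywordsOutsideParens(
    const std::vector<FakeToken>& tokens) {
  std::set<std::string> found;
  bool insideParens = false;
  for (const FakeToken& t : tokens) {
    if (t.spelling == "(") {
      insideParens = true;
    } else if (t.spelling == "{" || t.spelling == ";") {
      break;
    } else if (t.spelling == ")") {
      insideParens = false;
    } else if (t.isKeyword && !insideParens) {
      found.insert(t.spelling);
    }
  }
  return found;
}

// Assumed token stream for: void qux() BLA volatile;
// BLA is a macro name, so without preprocessing it is a plain identifier.
std::vector<FakeToken> quxTokens() {
  return {{"void", true}, {"qux", false}, {"(", false}, {")", false},
          {"BLA", false}, {"volatile", true}, {";", false}};
}

// Small self-check of the behaviour described in the text.
void checkQuxScan() {
  std::set<std::string> q = collectKeywordsOutsideParens(quxTokens());
  assert(q.count("volatile") == 1);  // a real keyword is found
  assert(q.count("const") == 0);     // hidden behind BLA, so it is missed
  assert(q.count("void") == 1);      // the return-type keyword is recorded too
}
```

    Under these assumptions the scan reports volatile but not const, and it also reports void, so callers probably want to filter the result down to the qualifiers they care about.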
    

    Using libclang's Unified Symbol Resolution (USR)

    This approach involves using the parser itself, and extracting qualifier information from the AST.

    Advantages: Seems to work for code with preprocessor directives, at least for simple cases.

    Disadvantages: My solution parses the USR, which is undocumented, and might change in the future. Still, it's easy to write a unit-test to guard against that.

    Take a look at $(CLANG_SRC)/tools/libclang/CIndexUSRs.cpp: it contains the code that generates a USR, and therefore the information needed to parse a USR string. See specifically lines 523-529 (in LLVM 3.1's source, downloaded from www.llvm.org) for the qualifier part.

    Add the following function somewhere:

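    // Extracts the qualifiers from a USR string. The USR format is undocumented;
    // this assumes the qualifier bits are written as one digit ('0' + bits)
    // after the last '#', as in LLVM 3.1's CIndexUSRs.cpp.
    void parseUsrString(const std::string& usrString, bool* isVolatile, bool* isConst, bool *isRestrict) {
      *isVolatile = *isConst = *isRestrict = false;
      // Parameter types are also separated by '#', so look for the last one.
      size_t hashLocation = usrString.rfind('#');
      if (hashLocation == std::string::npos || hashLocation == usrString.length() - 1) {
        return;
      }
      char c = usrString[hashLocation + 1];
      if (c < '0' || c > '7') {
        return;  // not a qualifier digit (e.g. 'S' for a static method)
      }
      int x = c - '0';
    
      *isConst = (x & 0x1) != 0;
      *isVolatile = (x & 0x4) != 0;
      *isRestrict = (x & 0x2) != 0;
    }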
    

    and in main(),

    CXString usr = clang_getCursorUSR(cursor);
    const char *usr_string = clang_getCString(usr);
    std::cout << usr_string << "\n";
    bool isVolatile, isConst, isRestrict;
    parseUsrString(usr_string, &isVolatile, &isConst, &isRestrict);
    std::cout << "restrict, volatile, const: " << isRestrict << " " << isVolatile << " " << isConst << "\n";
    clang_disposeString(usr);
    

    Running on Foo::qux() from

    #define BLA const
    
    class Foo {
    public:
        void bar() const;
        void baz() volatile;
        void qux() BLA volatile;
    };
    

    produces the expected result of

    c:@C@Foo@F@qux#5
    restrict, volatile, const: 0 1 1
    

    Caveat: you might have noticed that libclang's source suggests my code should use isVolatile = x & 0x2 rather than 0x4, so you may need to replace 0x4 with 0x2. It's possible my implementation (OS X) has them swapped.
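
    The unit test mentioned under Disadvantages could look roughly like the sketch below. MethodQualifiers, decodeUsrQualifiers and checkUsrDecoding are hypothetical names, not part of libclang, and the bit layout (const = 1, restrict = 2, volatile = 4, written as a digit after the last '#') is an assumption about an undocumented format. Only the qux USR comes from the output above; the bar, baz and plain USRs are assumed to follow the same pattern. A real guard would feed in strings obtained from clang_getCursorUSR instead of fixed literals.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type; not part of libclang.
struct MethodQualifiers {
  bool isConst;
  bool isVolatile;
  bool isRestrict;
};

// Decodes the qualifier digit that follows the last '#' in a method USR.
// Assumption: the digit is '0' + a bitmask with const = 1, restrict = 2,
// volatile = 4. This may not hold for every libclang version.
MethodQualifiers decodeUsrQualifiers(const std::string& usr) {
  MethodQualifiers q = {false, false, false};
  std::string::size_type hash = usr.rfind('#');
  if (hash == std::string::npos || hash + 1 >= usr.size()) {
    return q;  // no qualifier suffix at all
  }
  char c = usr[hash + 1];
  if (c < '0' || c > '7') {
    return q;  // something other than a qualifier digit follows the '#'
  }
  int bits = c - '0';
  q.isConst = (bits & 0x1) != 0;
  q.isRestrict = (bits & 0x2) != 0;
  q.isVolatile = (bits & 0x4) != 0;
  return q;
}

// Guard test: fails loudly if the assumed encoding stops matching.
void checkUsrDecoding() {
  // USR taken from the example output for Foo::qux().
  MethodQualifiers qux = decodeUsrQualifiers("c:@C@Foo@F@qux#5");
  assert(qux.isConst && qux.isVolatile && !qux.isRestrict);

  // Assumed USRs for Foo::bar() const and Foo::baz() volatile.
  MethodQualifiers bar = decodeUsrQualifiers("c:@C@Foo@F@bar#1");
  assert(bar.isConst && !bar.isVolatile && !bar.isRestrict);
  MethodQualifiers baz = decodeUsrQualifiers("c:@C@Foo@F@baz#4");
  assert(!baz.isConst && baz.isVolatile && !baz.isRestrict);

  // Assumed USR for an unqualified method: nothing after the '#'.
  MethodQualifiers plain = decodeUsrQualifiers("c:@C@Foo@F@plain#");
  assert(!plain.isConst && !plain.isVolatile && !plain.isRestrict);
}
```

    Calling checkUsrDecoding() from a test runner is enough. If a libclang upgrade changes the encoding, the assertions fed with real USRs should start failing and point at the masks that need adjusting.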
