clang fails replacing a statement if it contains a macro

栀梦 2021-02-01 05:31

I'm using clang (through the C++ API) to parse some C++ files and make all the case/break pairs follow a specific style.

Example:

2 answers
  • 2021-02-01 05:55

    Your problem is caused by the design of SourceLocation.

    An article follows:


    Macro expansion with clang's SourceLocation

    SourceLocation is designed to be flexible enough to handle both unexpanded locations and macro expanded locations at the same time.

    If a token is the result of a macro expansion, there are two different locations to take into account: the spelling location (where the characters that make up the token are written) and the instantiation location (where the token was used, i.e. the point where the macro is instantiated; the Clang documentation quoted further down calls this the expansion location).
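
    As a minimal sketch of these two locations, assuming a toy model rather than Clang's real classes (ToyLocation, locateExpandedBool and lineToRewrite are hypothetical names), the `bool` token produced by a macro defined on line 1 and used on line 10 could be described like this:

```cpp
// Toy model only: these are NOT Clang's SourceLocation internals.
struct ToyLocation {
    bool isMacroID;          // true when the token came out of a macro expansion
    unsigned spellingLine;   // line where the token's characters are written
    unsigned expansionLine;  // line where the macro was used
};

// Describes the `bool` token that appears when a macro defined on line 1
// (#define MACROTEST bool) is used on line 10 (MACROTEST newvar;).
ToyLocation locateExpandedBool() {
    return ToyLocation{true, 1u, 10u};
}

// The line a source-to-source tool should edit: the user-visible one.
unsigned lineToRewrite(const ToyLocation& loc) {
    return loc.isMacroID ? loc.expansionLine : loc.spellingLine;
}
```

    In this model, lineToRewrite(locateExpandedBool()) picks line 10, the place where the macro is used, not line 1 where `bool` is spelled.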

    Let's take the following simple source file as an example:

    #define MACROTEST bool
    
    int main() {
    
        int var = 2;
        switch(var)
        {
           case 1:
           {
              MACROTEST newvar;
           }break;
    
           case 2:
           {
              MACROTEST newvar;
              break;
           }
        }
    
        return 0;
    }
    

    and suppose we want to replace the two declaration statements

    MACROTEST newvar;
    

    with the declaration statement

    int var = 2;
    

    in order to get something like this

    #define MACROTEST bool
    
    int main() {
    
        int var = 2;
        switch(var)
        {
           case 1:
           {
              int var = 2;
           }break;
    
           case 2:
           {
              int var = 2;
              break;
           }
        }
    
        return 0;
    }
    

    If we dump the AST (-ast-dump) we get the following (I'm including an image since it's more intuitive than uncolored text):

    [image: clang AST]

    As you can see, the location reported for the first DeclStmt we're interested in spans from line 1 to line 10. In other words, the dump reports an interval that starts at the macro's definition and ends at the point where the macro is used:

    #define MACROTEST [from_here]bool
    
    int main() {
    
        int var = 2;
        switch(var)
        {
           case 1:
           {
              MACROTEST newvar[to_here];
           }break;
    
           case 2:
           {
              MACROTEST newvar;
              break;
           }
        }
    
        return 0;
    }
    

    (Note that the character counts might differ from what you get with plain spaces, since my text editor used tabs.)

    Ultimately, this is what makes Rewriter::getRangeSize fail (it returns -1), which in turn makes Rewriter::ReplaceStmt return true (which means failure, see the documentation).

    What is happening is the following: you're receiving a pair of SourceLocation markers where the first is a macro ID (isMacroID() returns true) while the second isn't.
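
    A minimal sketch of why such a pair cannot be measured, assuming a simplified toy model (ToyLoc and toyRangeSize are hypothetical, and the real Rewriter does more work than this, such as accounting for the last token):

```cpp
// Toy model only: a location is either a plain file position or a macro ID.
struct ToyLoc {
    bool isMacroID;  // true when the location points into a macro expansion
    int fileId;      // which buffer the location belongs to
    int offset;      // byte offset inside that buffer
};

// Returns the number of bytes between begin and end, or -1 when the range
// cannot be rewritten (the failure value mentioned above).
int toyRangeSize(const ToyLoc& begin, const ToyLoc& end) {
    if (begin.isMacroID || end.isMacroID) return -1;  // not a file location
    if (begin.fileId != end.fileId) return -1;        // different buffers
    if (end.offset < begin.offset) return -1;         // inverted range
    return end.offset - begin.offset;
}
```

    With a macro ID as the first marker the function gives up immediately, which is the situation described here; once both markers are file locations in the same buffer, the size is well defined.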

    To get the extent of the macro-expanded statement we need to take a step back and talk to the SourceManager, which is the gateway for all queries about spelling locations and instantiation locations (see above if you don't remember these terms). I can't put it more clearly than the detailed description in the documentation:

    The SourceManager can be queried for information about SourceLocation objects, turning them into either spelling or expansion locations. Spelling locations represent where the bytes corresponding to a token came from and expansion locations represent where the location is in the user's view. In the case of a macro expansion, for example, the spelling location indicates where the expanded token came from and the expansion location specifies where it was expanded.

    At this point you should see why I explained all this in the first place: if you intend to use source ranges for your substitution, you need to use the appropriate expansion range.

    Back to the sample I proposed, this is the code to achieve it:

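    // Note: getLocStart()/getLocEnd() and the std::pair return type match the
    // Clang version this answer was written for. Newer Clang releases spell
    // them getBeginLoc()/getEndLoc(), and getImmediateExpansionRange() returns
    // a CharSourceRange there (use getBegin()/getEnd() instead of first/second).
    SourceLocation startLoc = declaration_statement->getLocStart();
    SourceLocation endLoc = declaration_statement->getLocEnd();
    
    if( startLoc.isMacroID() ) {
        // Get the start/end expansion locations
        std::pair< SourceLocation, SourceLocation > expansionRange = 
                 rewriter.getSourceMgr().getImmediateExpansionRange( startLoc );
    
        // We're just interested in the start location
        startLoc = expansionRange.first;
    }
    
    if( endLoc.isMacroID() ) {
        // Not executed for this sample, where the end location is not a macro ID.
        // If it were, the end of the expansion range would be the one to use.
        endLoc = rewriter.getSourceMgr().getImmediateExpansionRange( endLoc ).second;
    }
    
    SourceRange expandedLoc( startLoc, endLoc );
    // ReplaceText returns true on failure
    bool failure = rewriter.ReplaceText( expandedLoc, 
                                         replacer_statement->getSourceRange() );
    
    if( !failure )
        std::cout << "This will get printed if you did it correctly!" << std::endl;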
    

    The declaration_statement is either one of the two

    MACROTEST newvar;
    

    while replacer_statement is the statement used for the replacement

    int var = 2;
    

    The above code will get you this:

    #define MACROTEST bool
    
    int main() {
    
        int var = 2;
        switch(var)
        {
           case 1:
           {
              int var = 2;
           }break;
    
           case 2:
           {
              int var = 2;
              break;
           }
        }
    
        return 0;
    }
    

    i.e. a complete and successful substitution of the macro-expanded statement.
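
    Once both ends are plain file locations, the substitution itself boils down to a splice on the file's buffer. A minimal sketch with std::string, assuming a hypothetical helper toyReplaceText that follows the same true-means-failure convention (this is not the Rewriter's implementation):

```cpp
#include <cstddef>
#include <string>

// Replace the half-open byte range [begin, end) of `buffer` with `text`.
// Returns true on failure and leaves `buffer` untouched in that case.
bool toyReplaceText(std::string& buffer, std::size_t begin, std::size_t end,
                    const std::string& text) {
    if (begin > end || end > buffer.size()) return true;  // invalid range
    buffer.replace(begin, end - begin, text);
    return false;
}
```

    For example, locating "MACROTEST newvar;" in a buffer with std::string::find and passing its start and one-past-the-end offsets replaces the whole statement with "int var = 2;", while an out-of-bounds range is reported as a failure.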


    References:

    • clang documentation
    • clang doxygen API
    • clang source code
  • 2021-02-01 05:59

    To get the file location that corresponds to a macro expansion, you can use an existing API function:

    SourceLocation startLoc = rewriter.getSourceMgr().getFileLoc(
        declaration_statement->getLocStart());
    SourceLocation endLoc = rewriter.getSourceMgr().getFileLoc(
        declaration_statement->getLocEnd());
    

    This API function does the same as what Marco wrote in his code, but automatically.

    The documentation describes getFileLoc() as follows: given Loc, if it is a macro location return the expansion location or the spelling location, depending on whether it comes from a macro argument or not.

    Its implementation is:

    SourceLocation getFileLoc(SourceLocation Loc) const {
        if (Loc.isFileID()) return Loc;
        return getFileLocSlowCase(Loc);
    }
    
    SourceLocation SourceManager::getFileLocSlowCase(SourceLocation Loc) const {
        do {
            if (isMacroArgExpansion(Loc))
                Loc = getImmediateSpellingLoc(Loc);
            else
                Loc = getImmediateExpansionRange(Loc).first;
        } while (!Loc.isFileID());
        return Loc;
    }
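
    To see the loop in action without Clang, here is a minimal sketch over a toy table of locations (ToyLoc, exampleTable and toyFileLoc are hypothetical names, and indices into the table stand in for SourceLocation values):

```cpp
#include <vector>

// Toy model only: each entry is either a file location or a macro location
// that points at its immediate spelling and expansion locations by index.
struct ToyLoc {
    bool isFile;      // plain file location
    bool isMacroArg;  // macro location that comes from a macro argument
    int spelling;     // index of the immediate spelling location (-1 if none)
    int expansion;    // index of the immediate expansion start (-1 if none)
    int line;         // line number, meaningful for file locations only
};

std::vector<ToyLoc> exampleTable() {
    return {
        {true,  false, -1, -1, 10},  // 0: where MACROTEST is used
        {true,  false, -1, -1, 1},   // 1: where `bool` is written
        {false, false,  1,  0, 0},   // 2: the expanded `bool` token
        {false, false,  1,  2, 0},   // 3: a nested expansion that goes through 2
        {false, true,   1,  0, 0},   // 4: a token that came from a macro argument
    };
}

// Same shape as the loop above: follow the spelling location for macro
// arguments, the expansion start otherwise, until a file location is reached.
int toyFileLoc(const std::vector<ToyLoc>& table, int loc) {
    while (!table[loc].isFile) {
        loc = table[loc].isMacroArg ? table[loc].spelling
                                    : table[loc].expansion;
    }
    return loc;
}
```

    In this table, entries 2 and 3 both resolve to entry 0 (line 10, where the macro is used), while entry 4 resolves to entry 1 because macro arguments follow the spelling location instead.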
    