Find all references of specific function declaration in libclang (Python)

我的未来我决定 提交于 2019-12-03 13:23:26

The thing that really makes this problem challenging is the complexity of C++.

Consider what is callable in C++: functions, lambdas, the function call operator, member functions, template functions and member template functions. So in the case of just matching call expressions, you'd need to be able to disambiguate these cases.

Furthermore, libclang doesn't offer a perfect view of the clang AST (some nodes don't get exposed completely, particularly some nodes related to templates). Consequently, it's possible (even likely) that an arbitrary code fragment would contain some construct where libclangs view of the AST was insufficient to associate the call expression with a declaration.

However, if you're prepared to restrict yourself to a subset of the language it may be possible to make some headway - for example, the following sample tries to associate call sites with function declarations. It does this by doing a single pass over all the nodes in the AST matching function declarations with call expressions.

from clang.cindex import *

def is_function_call(funcdecl, c):
    """ Determine where a call-expression cursor refers to a particular function declaration
    """
    defn = c.get_definition()
    return (defn is not None) and (defn == funcdecl)

def fully_qualified(c):
    """ Retrieve a fully qualified function name (with namespaces)
    """
    res = c.spelling
    c = c.semantic_parent
    while c.kind != CursorKind.TRANSLATION_UNIT:
        res = c.spelling + '::' + res
        c = c.semantic_parent
    return res

def find_funcs_and_calls(tu):
    """ Retrieve lists of function declarations and call expressions in a translation unit
    """
    filename = tu.cursor.spelling
    calls = []
    funcs = []
    for c in tu.cursor.walk_preorder():
        if c.location.file is None:
            pass
        elif c.location.file.name != filename:
            pass
        elif c.kind == CursorKind.CALL_EXPR:
            calls.append(c)
        elif c.kind == CursorKind.FUNCTION_DECL:
            funcs.append(c)
    return funcs, calls

idx = Index.create()
args =  '-x c++ --std=c++11'.split()
tu = idx.parse('tmp.cpp', args=args)
funcs, calls = find_funcs_and_calls(tu)
for f in funcs:
    print(fully_qualified(f), f.location)
    for c in calls:
        if is_function_call(f, c):
            print('-', c)
    print()

To show how well this works, you need a slightly more challenging example to parse:

// tmp.cpp
#include <iostream>
using namespace std;

namespace impl {
    int addition(int x, int y) {
        return x + y;
    }

    void f() {
        addition(2, 3);
    }
}

int addition (int a, int b) {
  int r;
  r=a+b;
  return r;
}

int main () {
  int z, q;
  z = addition (5,3);
  q = addition (5,5);
  cout << "The first result is " << z;
  cout << "The second result is " << q;
}

And I get the output:

impl::addition
- <SourceLocation file 'tmp.cpp', line 10, column 9>

impl::f

addition
- <SourceLocation file 'tmp.cpp', line 22, column 7>
- <SourceLocation file 'tmp.cpp', line 23, column 7>

main

Scaling this up to consider more types of declarations would (IMO) be non-trivial and an interesting project in it's own right.

Addressing comments

Given that there are some questions about whether the code in this answer produces the results I've provided, I've added a gist of the code (that reproduces the content of this question) and a very minimal vagrant machine image that you can use to experiment with. Once the machine is booted you can clone the gist, and reproduce the answer with the commands:

git clone https://gist.github.com/AndrewWalker/daa2af23f34fe9a6acc2de579ec45535 find-func-decl-refs
cd find-func-decl-refs
export LD_LIBRARY_PATH=/usr/lib/llvm-3.8/lib/ && python3 main.py
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!