OpenCL LLVM IR generation from Clang

Posted by 邮差的信 on 2019-12-04 19:23:48

The command line

clang ... -emit-llvm ...

runs the Clang driver, which first runs the Clang front-end to generate LLVM IR, then runs LLVM to process the IR, then emits the processed IR.

The code

ParseAST(mCompilerInst.getPreprocessor(),llvmCodeGen,mCompilerInst.getASTContext());

on the other hand, only runs the Clang front-end: it parses the source and hands the AST to the code generator, which emits IR, and then nothing further is done with that IR. You haven't invoked LLVM's optimizer at all, and that's why you're seeing different code. It actually has the same semantics* - it's just unoptimized.

To solve this, you need to actually run some optimization passes on your module, preferably through a pass manager rather than one by one. Chapter 4 of the LLVM Kaleidoscope tutorial shows how to set this up.
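As a rough sketch of what "running passes through a pass manager" can look like, the function below runs LLVM's default O2 pipeline over a module using the new pass manager. It assumes LLVM 14 or newer (older releases spell the level as PassBuilder::OptimizationLevel, and code from 2019 would more likely have used the legacy pass manager). The name optimizeModule is hypothetical, and how you obtain the llvm::Module from your code generator is left to your existing setup.

```cpp
#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"

// Hypothetical helper: optimize the module produced by the Clang front-end.
// Assumes LLVM 14+ and the new pass manager.
void optimizeModule(llvm::Module &M) {
  // Analysis managers for each IR unit; they must outlive the pass manager run.
  llvm::LoopAnalysisManager LAM;
  llvm::FunctionAnalysisManager FAM;
  llvm::CGSCCAnalysisManager CGAM;
  llvm::ModuleAnalysisManager MAM;

  // Register the standard analyses and let the managers refer to each other.
  llvm::PassBuilder PB;
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  // Build the default O2 pipeline and run it over the whole module.
  llvm::ModulePassManager MPM =
      PB.buildPerModuleDefaultPipeline(llvm::OptimizationLevel::O2);
  MPM.run(M, MAM);
}
```

You would call this after ParseAST has finished and before printing or writing out the module. The result should then be much closer to what the command line produces when an optimization level is given.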


* Actually the two modules aren't equivalent. The 1st code is equivalent to:

kernel void vector_add(global float* vec1, global float* vec2, global float* vec3) {
    int i = get_global_id(0);
    vec3[i] = (vec1[1] + vec2[2]) * 5.0f;
}

While the 2nd is equivalent to:

kernel void vector_add(global float* vec1, global float* vec2, global float* vec3) {
    int i = get_global_id(0);
    vec3[i] = (vec1[i] + vec2[i]) * 5.0f;
}

Notice the different indices used (i vs 1 and 2) - I'm guessing the 2nd version is what you want. If it's not a simple copy-paste error, I suggest you re-check your flow to verify you are working on the correct file.
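To make the difference concrete, here is a small host-side C++ emulation of the two kernels. It is only an illustration, not OpenCL code: the loop variable i stands in for get_global_id(0), and the function names vector_add_fixed and vector_add_indexed are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Emulates the 1st kernel: every work-item reads the fixed elements
// vec1[1] and vec2[2], so all outputs are identical.
// Requires vec1.size() >= 2 and vec2.size() >= 3.
std::vector<float> vector_add_fixed(const std::vector<float> &vec1,
                                    const std::vector<float> &vec2) {
  std::vector<float> vec3(vec1.size());
  for (std::size_t i = 0; i < vec3.size(); ++i)  // i ~ get_global_id(0)
    vec3[i] = (vec1[1] + vec2[2]) * 5.0f;
  return vec3;
}

// Emulates the 2nd kernel: work-item i reads vec1[i] and vec2[i].
// Requires vec2.size() >= vec1.size().
std::vector<float> vector_add_indexed(const std::vector<float> &vec1,
                                      const std::vector<float> &vec2) {
  std::vector<float> vec3(vec1.size());
  for (std::size_t i = 0; i < vec3.size(); ++i)  // i ~ get_global_id(0)
    vec3[i] = (vec1[i] + vec2[i]) * 5.0f;
  return vec3;
}
```

With vec1 = {1, 2, 3, 4} and vec2 = {10, 20, 30, 40}, the first function returns {160, 160, 160, 160}, while the second returns {55, 110, 165, 220}. No optimization pass could turn one into the other, so the two IR files must come from different sources.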
