Question
I have a fairly large OpenCL file that compiles fine on both Windows and Ubuntu Linux but fails on Mac OS X. The cvmcompiler process uses 100% of the CPU and never completes. The full code of the project is here:
https://github.com/favreau/Sol-R
and the file in question is:
https://github.com/favreau/Sol-R/blob/master/solr/engines/opencl/RayTracer.cl
The problem should be fairly easy to reproduce by cloning the project and running the cmake/make process. Note that since OpenCL is compiled at runtime, the application needs to be started to reproduce the issue.
Answer 1:
Was this with an Iris Pro GPU on OS X 10.11 (while it works on 10.12)? We ran into a similar issue last week under those conditions. The compile call would hang for a few minutes, use tons of memory, and then return an unhelpful error code. It worked fine with other GPUs and on 10.12, so it felt like an Intel/Apple compiler bug. The kernel was relatively simple: a chain of if/else conditions, each checking a few conditions joined with logical AND operators (&&). Based on a tip from Intel years ago, we recalled that those operators in C are "short-circuit", which means you are semantically creating many possible branches (the compiler really should realize there are no side effects, but apparently it does not always).
Our solution was to pull those out into a series of boolean assignments, so each if had a single boolean and no branching around the if and else blocks. So, changing code along these lines:
if (cond1 && cond2 && cond3)
...
else if (cond4 && cond5 && cond6)
...
to:
bool b1 = cond1 && cond2 && cond3;
bool b2 = cond4 && cond5 && cond6;
if (b1)
...
else if (b2)
...
And this allowed the kernel to compile.
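As a minimal sketch of that rewrite, here is the same transformation in plain C, which shares these operator semantics with OpenCL C. The function names and the conditions on x, y and z are hypothetical placeholders, not code from the kernel in question. Note that hoisting evaluates every condition up front, so it is only a safe rewrite when the conditions have no side effects.

```c
#include <stdbool.h>

/* Original style: each && chain sits directly in the if/else-if,
 * so the compiler sees a nested set of possible branches. */
static int classify_short_circuit(int x, int y, int z)
{
    if (x > 0 && y > 0 && z > 0)
        return 1;
    else if (x < 0 && y < 0 && z < 0)
        return 2;
    return 0;
}

/* Rewritten style: every chain is first collapsed into one boolean,
 * then each if tests a single value. Both booleans are always computed,
 * which is fine here because the conditions are side-effect free. */
static int classify_hoisted(int x, int y, int z)
{
    bool b1 = x > 0 && y > 0 && z > 0;
    bool b2 = x < 0 && y < 0 && z < 0;

    if (b1)
        return 1;
    else if (b2)
        return 2;
    return 0;
}
```

Both functions return the same result for any input; only the shape of the control flow handed to the compiler differs.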
I see your kernel has some if statements with upwards of three && operators, so maybe you are having the same problem.
I have also seen this solved using & instead of &&, but I never felt comfortable using a bitwise AND instead of a logical AND, just in case some of the conditions were not the same bit pattern.
The same logic applies to ||, which also short-circuits.
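The bit-pattern concern can be illustrated with a small plain-C sketch (the helper names are hypothetical). Two values can both be "true" in the C sense yet share no set bits, in which case & and && disagree. Normalizing each operand to 0 or 1 first, for example with != 0, removes the mismatch.

```c
#include <stdbool.h>

/* Logical AND: true whenever both operands are non-zero. */
static bool and_logical(int a, int b)
{
    return a && b;
}

/* Bitwise AND on the raw values: true only if they share a set bit. */
static bool and_bitwise_raw(int a, int b)
{
    return (a & b) != 0;
}

/* Bitwise AND after normalizing each operand to 0 or 1:
 * agrees with the logical AND, without short-circuit evaluation. */
static bool and_bitwise_normalized(int a, int b)
{
    return ((a != 0) & (b != 0)) != 0;
}
```

With a = 1 and b = 2, the logical and normalized versions report true while the raw bitwise version reports false. Operands that are already the result of a scalar comparison such as x > 0 are always 0 or 1, so the mismatch only arises with arbitrary integer flags.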
Good luck! I hope this helps you, or at least someone.
Edit: To give additional credit where due, while Intel mentioned short-circuiting to us as an issue (and suggested using &), AMD also mentions it in their OpenCL Optimization Guide and suggests the use of boolean variables to fix it (section 2.8.7.2 Bypass Short-Circuiting), which is what we used to fix it.
Answer 2:
I applied the suggested solution and that indeed fixed the problem, at least for my Intel i5 CPU device. I am now getting an out-of-memory error on the Intel GPU device, but I believe this is not related to the current issue.
The changes I made can be seen in the following commit:
https://github.com/favreau/Sol-R/commit/556b1c7dd255a8f5fe34e75de3b8c2a127f25c36#diff-b91517d8e9eca9cf57ecd8cf4143a935
In any case, thank you so much for the solution, this one was tricky!
Source: https://stackoverflow.com/questions/45733155/opencl-files-fail-to-compile-on-os-x