Is there any way or tool to apply GPU acceleration to compiling programs with the GCC compiler? Right now I have created a program that compiles a given list of programs iteratively.
If what you are asking is, "Can you automatically write GPU-accelerated code for use with GCC and LLVM?" the answer is yes. NVIDIA and Google both maintain open-source LLVM-based compiler projects:
NVIDIA CUDA LLVM
Google gpucc
If your question is, "Can I use the GPU to speed up the compilation of generic, non-CUDA code?" the answer is currently no. GPUs are good at certain things like parallel tasks, and bad at others like branching, which is what compilers are all about. The good news is that you can use a network of PCs with CPUs to get 2-10x compile speedups, depending on how optimized your build already is, and before you resort to network builds you can get gains with less hassle from the fastest multi-core CPU and a high-speed SSD in your desktop.
There are tools such as distcc that distribute C/C++/Objective-C compile jobs to a network of computers (typically by setting CC=distcc and running make with a large -j value). It was included in older versions of Xcode but has since been removed, and there is no support for using it with Swift.
There is a commercial tool similar to distcc called Incredibuild that supports Visual Studio C/C++ and Linux development environments.
There are some good articles about real-world use of Incredibuild vs. distcc, and about the tradeoffs compared to the native toolchain's incremental build support, which can rebuild a small change (say, one line in one file) without recompiling everything else. Points to consider:
A. In an imperative programming language, statements are executed in sequence, and each statement may change the program's state, so a later statement cannot be analyzed without knowing the effects of the earlier ones. Analyzing a translation unit is therefore inherently sequential.
An example: consider how constant propagation works on the following statements:

    a = 5;          /* a is a compile-time constant: 5          */
    b = a + 7;      /* folds to 12, but only once a is known    */
    c = a + b + 9;  /* folds to 26, only once a and b are known */
You need to go through those statements sequentially before you can determine that the values assigned to b and c are compile-time constants.
(However, separate basic blocks may possibly be compiled and optimized in parallel with each other; a sketch follows.)
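To make that parenthetical concrete, here is a minimal hypothetical C function in which the two arms of the if/else form separate basic blocks. Constant propagation inside one arm never needs results from the other, so in principle the two blocks could be optimized in parallel:

    int f(int x)
    {
        int r;
        if (x > 0) {       /* basic block 1 */
            int a = 5;
            r = a + 7;     /* folds to 12, independently of the else-arm */
        } else {           /* basic block 2 */
            int b = 3;
            r = b * 5;     /* folds to 15, independently of the then-arm */
        }
        return r;
    }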
B. On top of this, different passes need to execute sequentially as well, and affect each other.
An example: based on a schedule of instructions, you allocate registers; then you find that you need to spill a register to memory, so you must generate new instructions, which changes the schedule again (a concrete sketch follows below).
So you can't execute passes like register allocation and scheduling in parallel either (actually, I think there are articles where computer scientists/mathematicians have tried to solve these two problems together, but let's not go into that).
(Again, one can achieve some parallelism by pipelining passes.)
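To make the spill example above a bit more concrete, here is a hypothetical C function; this is a sketch only, since the real behavior depends on the target's register count and the compiler. All ten temporaries are live at the same time, so on a machine with fewer free registers the allocator must spill some of them, and the spill loads and stores are new instructions that the earlier schedule knew nothing about:

    long many_live(long x)
    {
        /* t0..t9 are all live until the final product, so they
           compete for registers simultaneously; if the target has
           too few, some get spilled to the stack, inserting loads
           and stores that force the instruction schedule to change. */
        long t0 = x + 1, t1 = x + 2, t2 = x + 3, t3 = x + 4, t4 = x + 5;
        long t5 = x + 6, t6 = x + 7, t7 = x + 8, t8 = x + 9, t9 = x + 10;
        return t0 * t1 * t2 * t3 * t4 * t5 * t6 * t7 * t8 * t9;
    }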
Moreover, GPUs in particular are a poor fit because:
GPUs are good at floating-point math, which is something compilers don't need or use much (except when optimizing floating-point arithmetic in the program being compiled).
GPUs are good at SIMD, i.e. performing the same operation on multiple inputs. This, again, is not something a compiler needs to do. There might be a benefit if the compiler had to, say, optimize away several hundred floating-point operations (a wild example: the programmer defined several large FP arrays, assigned constants to them, and then wrote code to operate on them; a very badly written program indeed).
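For what it's worth, that wild example might look like the following C sketch (the size and constants are made up). The loop hands the compiler hundreds of identical floating-point operations that it could in principle fold at compile time, which is about the only compiler workload with a SIMD shape:

    #include <stddef.h>

    #define N 512   /* hypothetical array size */

    static double a[N], b[N], c[N];

    void silly_init(void)
    {
        for (size_t i = 0; i < N; i++) {
            a[i] = 1.5;            /* large FP arrays assigned constants... */
            b[i] = 2.5;
            c[i] = a[i] + b[i];    /* ...and code operating on them: N
                                      identical compile-time-foldable FP
                                      additions, i.e. SIMD-shaped work */
        }
    }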
So apart from parallelizing the compilation of basic blocks and pipelining passes, there is not much parallelism to be had within the compilation of a single C file. But parallelism is possible, easy to implement, and constantly used at a higher level. GNU Make, for example, has the -jN argument (e.g. make -j8), which basically means: as long as it finds N independent jobs (and compiling a bunch of files is what GNU Make is usually used for anyway), it spawns N processes, i.e. N instances of gcc compiling different files in parallel.
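Since the question mentions a hand-written program that compiles a list of programs iteratively, here is a minimal POSIX C sketch of the parallel version, which is essentially what make -jN does (the file names and job limit are made up for illustration):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define MAXJOBS 4   /* the N in -jN */

    int main(void)
    {
        const char *files[] = { "a.c", "b.c", "c.c", "d.c", "e.c" };
        int nfiles = sizeof files / sizeof files[0];
        int running = 0;

        for (int i = 0; i < nfiles; i++) {
            if (running == MAXJOBS) {   /* all job slots busy: wait for one */
                wait(NULL);
                running--;
            }
            pid_t pid = fork();
            if (pid == 0) {             /* child process: become gcc */
                execlp("gcc", "gcc", "-c", files[i], (char *)NULL);
                perror("execlp");       /* only reached if exec failed */
                _exit(127);
            } else if (pid > 0) {
                running++;
            } else {
                perror("fork");
                return 1;
            }
        }
        while (running-- > 0)           /* reap the remaining children */
            wait(NULL);
        return 0;
    }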