In the last couple of years, I've been doing a lot of SIMD programming and most of the time I've been relying on compiler intrinsic functions (such as the ones for SSE pro
Currently the best solution is to do it myself by creating a back-end for the open-source Cg frontend that Nvidia released, but I'd like to save myself the effort so I'm curious if it's been done before. Preferably I'd start using it right away.
It's a library for C++, rather than built into the language, but Eigen is pretty invisible once your variables are declared.
Your best bet is probably OpenCL. I know it has mostly been hyped as a way to run code on GPUs, but OpenCL kernels can also be compiled and run on CPUs. OpenCL is basically C with a few restrictions:
and a bunch of additions. In particular vector types:
float4 x = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
float4 y = (float4)(10.0f, 10.0f, 10.0f, 10.0f);
float4 z = y + x.s3210; // add the vector y to a swizzle of x that reverses the element order
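If it helps to see what that last line computes, here is a rough plain C++ sketch of the same arithmetic. It is only an emulation: `Float4`, its `s3210()` member and `example()` are hypothetical names made up for illustration, not part of OpenCL or any real library, and nothing here is guaranteed to compile to SIMD instructions.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for OpenCL's built-in float4 (illustration only).
struct Float4 {
    std::array<float, 4> s;

    // Mimics the OpenCL swizzle x.s3210: the same elements in reverse order.
    Float4 s3210() const { return Float4{{s[3], s[2], s[1], s[0]}}; }
};

// Element-wise addition, which is what + does on OpenCL vector types.
Float4 operator+(const Float4& a, const Float4& b) {
    Float4 r{};
    for (int i = 0; i < 4; ++i) {
        r.s[i] = a.s[i] + b.s[i];
    }
    return r;
}

// Same values as the OpenCL snippet above.
Float4 example() {
    Float4 x{{1.0f, 2.0f, 3.0f, 4.0f}};
    Float4 y{{10.0f, 10.0f, 10.0f, 10.0f}};
    return y + x.s3210();  // element i is y[i] + x[3 - i]
}
```

Calling `example()` yields the elements 14, 13, 12, 11: each lane of y is added to the lane of x at the mirrored position. In OpenCL the swizzle and the add are built into the language, so none of this scaffolding is needed.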
One big caveat is that the code has to be cleanly separable: OpenCL can't call out to arbitrary libraries, etc. But if your compute kernels are reasonably independent, then you basically get a vector-enhanced C where you don't need to use intrinsics.
Here is a quick reference/cheatsheet with all of the extensions.
It's not really the language itself, but there is a library for Mono (Mono.Simd) that will expose the vectors to you and optimise the operations on them into SSE whenever possible:
So recently Intel released ISPC, which is exactly what I was looking for when asking this question. It's a language that can link with normal C code, has an implicit execution model, and supports all the features mentioned in the start post (swizzle operators, branching, data structs, vector ops, shader-like). It compiles to SSE2, SSE4, AVX, AVX2, and Xeon Phi vector instructions.
That would be Fortran that you are looking for. If memory serves even the open-source compilers (g95, gfortran) will take advantage of SSE if it's implemented on your hardware.