Using SIMD in a Game Engine Math Library by using function pointers ~ A good idea?

荒凉一梦 submitted on 2019-12-23 19:15:18

Question


I have been reading game engine books since I was 14 (at that time I didn't understand a thing :P). Now, quite some years later, I want to start programming the mathematical basis for my game engine. I've been thinking for a long time about how to design this 'library' (by which I mean an "organized set of files"). Every few years new SIMD instruction sets come out, and I wouldn't want them to go to waste. (Tell me if I am wrong about this.)

I wanted to at least have the following properties:

  • Being able to check for SIMD support at runtime, using the SIMD version if it is available and the normal C++ version if it isn't. (This might have some call overhead; is it worth it?)
  • Being able to compile for SIMD or normal C++ if we already know the target at compile time. The calls can then be inlined and made suitable for cross-optimisation, because the compiler knows whether SIMD or C++ is used.

EDIT - I want to make the source code portable so it can run on devices other than x86(-64) too.

So I thought a good solution would be to use function pointers, which I would make static and initialize at the start of the program, and which the relevant functions (for example, Matrix/Vector multiplication) would call.
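A minimal sketch of what this design might look like, assuming GCC or Clang on x86 for the runtime check. The names `Vec4`, `add4_scalar`, `add4_sse`, `cpu_has_sse`, `g_add4` and `add` are hypothetical and only there for illustration; on other compilers or architectures the sketch simply falls back to the plain C++ version.

```cpp
#include <cassert>

// SIMD path is only compiled where SSE intrinsics and the GCC/Clang
// CPU-detection builtins are available (assumption of this sketch).
#if defined(__SSE__) && (defined(__GNUC__) || defined(__clang__))
#include <xmmintrin.h>
#define MATHLIB_HAVE_SSE 1  // hypothetical library macro
#endif

struct Vec4 { float v[4]; };  // x, y, z, w stored contiguously

// Plain C++ fallback.
static Vec4 add4_scalar(const Vec4& a, const Vec4& b) {
    Vec4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

#ifdef MATHLIB_HAVE_SSE
// SSE version: one 4-wide add, using unaligned loads/stores.
static Vec4 add4_sse(const Vec4& a, const Vec4& b) {
    Vec4 r;
    _mm_storeu_ps(r.v, _mm_add_ps(_mm_loadu_ps(a.v), _mm_loadu_ps(b.v)));
    return r;
}

// Hypothetical runtime check built on the GCC/Clang builtins.
static bool cpu_has_sse() {
    __builtin_cpu_init();  // needed when called during static initialization
    return __builtin_cpu_supports("sse") != 0;
}
#endif

using Add4Fn = Vec4 (*)(const Vec4&, const Vec4&);

// Chooses the implementation once.
static Add4Fn select_add4() {
#ifdef MATHLIB_HAVE_SSE
    if (cpu_has_sse()) return add4_sse;
#endif
    return add4_scalar;
}

// Static function pointer, initialized at program start.
static const Add4Fn g_add4 = select_add4();

// Every call goes through the pointer, so it cannot be inlined.
inline Vec4 add(const Vec4& a, const Vec4& b) {
    assert(g_add4 != nullptr);
    return g_add4(a, b);
}
```

Calling `add(a, b)` then behaves the same whichever version was picked; the cost is one indirect call per operation.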

What do you think are the advantages and disadvantages of this design (and which outweigh the others?), and is it even possible to create it with both of the properties described above?

Christian


Answer 1:


It's important to get the right granularity at which you make the decision on which routine to call. If you do this at too low a level then function dispatch overhead becomes a problem, e.g. a small routine which has just a few instructions could become very inefficient if called via some kind of function pointer dispatch mechanism rather than, say, just being inlined. Ideally the architecture-specific routines should process a reasonable amount of data so that the function dispatch cost is negligible, without being so large that you get significant code bloat from compiling additional non-architecture-specific code for each supported architecture.
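To illustrate the granularity point, here is a rough sketch in which the function pointer selects a kernel that works on a whole array, so one indirect call is paid per batch rather than per element. The names `AddBatchFn`, `add_batch_scalar`, `add_batch_sse`, `pick_add_batch` and `add_all` are made up for this example, and the `use_simd` flag stands in for whatever CPU check you actually use.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// A kernel processes n floats per call.
using AddBatchFn = void (*)(const float*, const float*, float*, std::size_t);

// Plain C++ kernel.
static void add_batch_scalar(const float* a, const float* b,
                             float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if defined(__SSE__)
// SSE kernel: four floats per iteration, scalar loop for the tail.
static void add_batch_sse(const float* a, const float* b,
                          float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        _mm_storeu_ps(out + i,
                      _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
#endif

// Hypothetical selector; use_simd would come from a runtime CPU check.
static AddBatchFn pick_add_batch(bool use_simd) {
#if defined(__SSE__)
    if (use_simd) return add_batch_sse;
#endif
    (void)use_simd;  // unused when no SIMD kernel was compiled
    return add_batch_scalar;
}

// One indirect call covers all n elements.
void add_all(AddBatchFn kernel, const float* a, const float* b,
             float* out, std::size_t n) {
    assert(kernel != nullptr);
    kernel(a, b, out, n);
}
```

Inside each kernel the compiler is free to inline and optimise as usual; only the entry into the kernel is indirect.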




Answer 2:


The simplest way to do this is to compile your game twice, once with SIMD enabled, once without. Create a tiny little launcher app that performs the _may_i_use_cpu_feature checks, and then runs the correct build.

The double indirection caused by calling a matrix multiply (for example) via a function pointer is not going to be nice. Instead of inlining trivial maths functions, it'll introduce function calls all over the shop, and those calls will be forced to save/restore a lot of registers to boot (because the code behind the pointer is not going to be known until runtime).

At that point, a non-optimised version without the double indirection will massively outperform the SSE version with function pointers.

As for supporting multiple platforms, this can be easy, and it can also be a real bother. ARM NEON is similar enough to SSE4 to make it worth wrapping the instructions behind some macros; however, NEON is also different enough to be really annoying!

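#if CPU_IS_INTEL

// x86: SSE intrinsics
#include <immintrin.h>
typedef __m128 f128;

#define add4f _mm_add_ps

#else

// ARM: NEON intrinsics
#include <arm_neon.h>
typedef float32x4_t f128;

// vaddq_f32 is the 4-wide float add (the "q" forms work on 128-bit registers)
#define add4f vaddq_f32

#endif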

The MAJOR problem with starting on, say, Intel and porting to ARM later is that a lot of the nice things don't exist. Shuffling is possible on ARM, but it's also a bother. Division, dot product, and sqrt don't exist on 32-bit ARM NEON (you only get reciprocal estimates, which you'll need to refine with your own Newton iteration); AArch64 does add vector division and sqrt.
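As a rough sketch of that Newton iteration: each step computes x' = x * (2 - d * x), which roughly doubles the number of correct bits of an estimate x of 1/d. The NEON part uses the real intrinsics `vrecpeq_f32` (estimate) and `vrecpsq_f32` (which returns 2 - d * x); the function names `reciprocal4`, `refine_reciprocal` and `reciprocal_from_estimate` are hypothetical, and the choice of two refinement steps is an assumption, not a tuned value.

```cpp
#include <cassert>
#include <cmath>

#if defined(__ARM_NEON)
#include <arm_neon.h>
// 4-wide reciprocal: hardware estimate plus two Newton-Raphson steps.
static inline float32x4_t reciprocal4(float32x4_t d) {
    float32x4_t x = vrecpeq_f32(d);       // low-precision estimate of 1/d
    x = vmulq_f32(vrecpsq_f32(d, x), x);  // x * (2 - d * x)
    x = vmulq_f32(vrecpsq_f32(d, x), x);  // second refinement
    return x;
}
#endif

// Scalar model of one refinement step, usable on any platform.
static inline float refine_reciprocal(float d, float x) {
    return x * (2.0f - d * x);
}

// Applies `steps` refinements to an initial estimate of 1/d.
static inline float reciprocal_from_estimate(float d, float estimate,
                                             int steps) {
    assert(steps >= 0);
    float x = estimate;
    for (int i = 0; i < steps; ++i) x = refine_reciprocal(d, x);
    return x;
}
```

A division a / b then becomes a multiplication by the refined reciprocal of b; a sqrt can be built the same way from the reciprocal square root estimate.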

If you are thinking about SIMD like this:

struct Vec4 
{
  float x;
  float y;
  float z;
  float w;
};

Then you may just be able to wrap SSE and NEON behind a semi-ok wrapper. When it comes to AVX2 and AVX512 though, you'll probably be screwed: their registers hold 8 and 16 floats respectively, so a single 4-float vector no longer fills one.

If however you are thinking about SIMD using structure-of-array formats:

struct Vec4SOA
{
  float x[BIG_NUM];
  float y[BIG_NUM];
  float z[BIG_NUM];
  float w[BIG_NUM];
};

Then there is a chance you'll be able to produce an AVX2/AVX512 version. However, working with code organised like that is not the easiest thing in the world.
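A small sketch of why this layout scales, assuming a hypothetical `BIG_NUM` of 1024 and a made-up function `dot4_soa`. The dot product, which needs a horizontal add in the `Vec4` layout, becomes purely element-wise here, so the same loop can be vectorised at whatever register width the target offers.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t BIG_NUM = 1024;  // hypothetical capacity

struct Vec4SOA {
    float x[BIG_NUM];
    float y[BIG_NUM];
    float z[BIG_NUM];
    float w[BIG_NUM];
};

// Computes out[i] = dot(a[i], b[i]) for the first `count` vectors.
// Each iteration only touches lane i of every array, so there is no
// shuffling or horizontal add for the compiler (or hand-written
// intrinsics) to deal with.
void dot4_soa(const Vec4SOA& a, const Vec4SOA& b,
              float* out, std::size_t count) {
    assert(count <= BIG_NUM);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = a.x[i] * b.x[i] + a.y[i] * b.y[i]
               + a.z[i] * b.z[i] + a.w[i] * b.w[i];
    }
}
```

The price is that a single vector is now spread across four arrays, which is what makes the surrounding code awkward to write.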



Source: https://stackoverflow.com/questions/16478514/using-simd-in-a-game-engine-math-library-by-using-function-pointers-a-good-ide
