I am trying to compute the sum of a large array in parallel with Metal and Swift.
Is there a good way to do it?
My plan was to divide my array into sub-arrays, compute
I've been running the app on a GT 740 (384 cores) vs. an i7-4790 with a multithreaded vector sum implementation, and here are my figures:
Metal lap time: 19.959092
CPU MT lap time: 4.353881
That's roughly a 5:1 ratio in favor of the CPU, so unless you have a powerful GPU, using shaders is not worth it.
I've also been testing the same code on an i7-3610QM with an Intel HD 4000 iGPU, and surprisingly the results are much better for Metal: 2:1.
Edited: after tweaking the thread parameter, I've finally improved GPU performance; it's now up to 16x the CPU.
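For anyone who wants a CPU baseline to compare against, here is a minimal sketch of a multithreaded chunked sum in Swift. It is not the code that produced the timings above, just an illustration of the "split into sub-arrays, sum each, then add the partial sums" idea. The function name `parallelSum` and the `chunkCount` parameter are made up for this example; the only library call it relies on is `DispatchQueue.concurrentPerform`.

```swift
import Foundation

// Hypothetical helper (not from the original post): sums `values` by splitting
// the array into `chunkCount` contiguous slices, summing each slice in parallel,
// and then adding the partial sums on the calling thread.
func parallelSum(_ values: [Float],
                 chunkCount: Int = ProcessInfo.processInfo.activeProcessorCount) -> Float {
    guard !values.isEmpty else { return 0 }

    // Never use more chunks than elements, and always use at least one.
    let chunks = max(1, min(chunkCount, values.count))
    // Round up so that chunks * chunkSize covers the whole array.
    let chunkSize = (values.count + chunks - 1) / chunks
    var partials = [Float](repeating: 0, count: chunks)

    values.withUnsafeBufferPointer { input in
        partials.withUnsafeMutableBufferPointer { output in
            let out = output  // local copy so the closure does not capture the inout parameter
            DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
                let start = chunk * chunkSize
                let end = min(start + chunkSize, input.count)
                var sum: Float = 0
                var i = start
                // Trailing chunks may be empty (start >= end); the loop then does nothing.
                while i < end {
                    sum += input[i]
                    i += 1
                }
                // Each iteration writes only its own slot, so no locking is needed.
                out[chunk] = sum
            }
        }
    }

    // Final reduction over the per-chunk partial sums.
    return partials.reduce(0, +)
}
```

Calling it is just `let total = parallelSum(data)`. Note that with `Float` the result can differ slightly from a sequential sum on large inputs, because the additions happen in a different order.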