C versus vDSP versus NEON - How could NEON be as slow as C?
How could NEON be as slow as C? I have been trying to build a fast histogram function that buckets incoming values into ranges by assigning each one the range threshold it is closest to. This would be applied to images, so it has to be fast (assume an image array of 640x480, so roughly 300,000 elements). The histogram range values are multiples of 25 (0, 25, 50, 75, 100). Inputs would be floats and the final outputs would be integers. I tested the following versions in Xcode by opening a new empty project (no app delegate) and just using the main.m file. I