I'm trying to build a library for a Cortex-A9 ARM processor (an OMAP4 to be more specific) and I'm in a little bit of confusion regarding which/when to use NEON vs VFP in the
I think this question should be split up into several questions, each with code examples and details of the target platform and the toolchain versions used.
But to cover one part of the confusion: the recommendation to "use NEON as the FPU" sounds like a misunderstanding. NEON is a SIMD engine; the VFP is an FPU. You can use NEON to operate on up to 4 single-precision floating-point values in parallel, which (when possible) is good for performance.
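To make the "4 values in parallel" point concrete, here is a minimal sketch in C using the NEON intrinsics from arm_neon.h. The function name add_f32 is a hypothetical helper, not something from the question, and the code assumes a compiler that defines __ARM_NEON (or the older __ARM_NEON__) when NEON is enabled. Without NEON it falls back to a plain scalar loop, so it still builds on other targets.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: element-wise dst[i] = a[i] + b[i] for n floats. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* NEON path: one 128-bit Q register holds 4 single-precision floats. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* add 4 lanes, store   */
    }
#endif

    /* Scalar tail (and the whole loop when NEON is not available). */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

With GCC, the NEON path is only compiled in when NEON is enabled for the target, for example with -mfpu=neon together with a -mfloat-abi setting of softfp or hard; which float ABI is right depends on the rest of your system.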
-mfpu=neon can be seen as shorthand for -mfpu=neon-vfpv3.
See http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html for more information.