SSE/AVX equivalent for NEON vuzp
Question

Intel's vector extensions (SSE, AVX, etc.) provide two unpack operations for each element size; the SSE intrinsics, for example, are _mm_unpacklo_* and _mm_unpackhi_*. For 4 elements in a vector, they do this:

    inputs:      (A0 A1 A2 A3) (B0 B1 B2 B3)
    unpacklo/hi: (A0 B0 A1 B1) (A2 B2 A3 B3)

The equivalent of unpack in ARM's NEON instruction set is vzip. However, NEON also provides the operation vuzp, which is the inverse of vzip. For 4 elements in a vector, it does this:

    inputs: (A0 A1 A2 A3