How to OR all lanes of a NEON vector

Posted by 自古美人都是妖i on 2020-01-25 07:49:06

Question

I want to use NEON intrinsics to optimize the following code.

uint32x4_t c1; // 4 elements, each element is 0 or 1
uint32x4_t c2; // 4 elements, each element is 0 or 1
uint8_t pack = 0; // unsigned char, for result

/* some code */

// need optimizing
pack |= vgetq_lane_u32(c1, 0);
pack |= vgetq_lane_u32(c1, 1) << 1;
pack |= vgetq_lane_u32(c1, 2) << 2;
pack |= vgetq_lane_u32(c1, 3) << 3;

pack |= vgetq_lane_u32(c2, 0) << 4;
pack |= vgetq_lane_u32(c2, 1) << 5;
pack |= vgetq_lane_u32(c2, 2) << 6;
pack |= vgetq_lane_u32(c2, 3) << 7;

I think I need an intrinsic that ORs all lanes of a vector together. Could anybody give me some hints?


Answer 1:

You can shift each element of a vector by its own number of bits: vshlq_u32 takes a second vector holding one shift count per lane.

const int32x4_t shifter1 = {0, 1, 2, 3};
const int32x4_t shifter2 = {4, 5, 6, 7};
/* ... */
c1 = vshlq_u32(c1, shifter1);
c2 = vshlq_u32(c2, shifter2);

c1 = vorrq_u32(c1, c2);
pack |= vgetq_lane_u32(c1, 0) | vgetq_lane_u32(c1, 1) | vgetq_lane_u32(c1, 2) | vgetq_lane_u32(c1, 3);

That should do the trick. How efficiently the last line (the lane-by-lane OR) is compiled depends on the quality of your compiler.
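As a possible way to avoid the four lane extractions, here is a minimal sketch that assumes an AArch64 target. After the per-lane shifts, every lane holds a different set of bits, so adding the lanes gives the same result as ORing them, and the horizontal add vaddvq_u32 (AArch64 only) can stand in for a horizontal OR. Note that this swaps in an add for the OR used in the answer. The helper name pack_flags and its array-based interface are hypothetical, chosen only for illustration, and the scalar fallback is there so the sketch also builds on other targets.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON64 1
#endif

/* Hypothetical helper: packs eight 0/1 flags into one byte.
 * c1[i] goes to bit i, c2[i] goes to bit i + 4.
 * Assumes every input element is exactly 0 or 1. */
static uint8_t pack_flags(const uint32_t c1[4], const uint32_t c2[4])
{
#ifdef HAVE_NEON64
    static const int32_t s1[4] = {0, 1, 2, 3};
    static const int32_t s2[4] = {4, 5, 6, 7};

    /* Shift each lane left by its own bit position. */
    uint32x4_t v1 = vshlq_u32(vld1q_u32(c1), vld1q_s32(s1));
    uint32x4_t v2 = vshlq_u32(vld1q_u32(c2), vld1q_s32(s2));
    uint32x4_t v  = vorrq_u32(v1, v2);

    /* The lanes now occupy disjoint bits, so no carries can occur
     * and the horizontal add equals a horizontal OR. */
    return (uint8_t)vaddvq_u32(v);
#else
    /* Scalar fallback with the same result, for non-AArch64 builds. */
    uint8_t pack = 0;
    for (int i = 0; i < 4; ++i) {
        pack |= (uint8_t)(c1[i] << i);
        pack |= (uint8_t)(c2[i] << (i + 4));
    }
    return pack;
#endif
}
```

On 32-bit ARM (ARMv7), vaddvq_u32 is not available, so the lane-by-lane OR from the answer, or a pairwise reduction, would still be needed there.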



Source: https://stackoverflow.com/questions/49506114/how-to-or-all-lane-of-a-neon-vector
