ARM NEON compare operations generate negative one

灰色年华 2021-01-22 07:00

I am trying the following assembly code:

vclt.f32 q9,q0,#0
vst1.i32 q9,[r2:128]

But if the condition is true, the corresponding element in q9 i

2 answers
  •  不思量自难忘°
    2021-01-22 07:32

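    There's not a lot of conditional stuff in NEON, and what there is really only works with bitwise masks rather than Boolean values: a compare sets every bit of an element where the condition holds (which reads as -1 when viewed as a signed integer) and clears every bit where it doesn't - see e.g. vbsl.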
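To make the mask idea concrete, here is a minimal scalar sketch in C of what happens in a single 32-bit lane. It is only a model, not NEON code: the function names `lane_clt_zero` and `lane_bitwise_select` are made up for illustration, and the operand order is simplified compared with the real vbsl instruction.

```c
#include <stdint.h>

/* Hypothetical model of one lane of "vclt.f32 qd, qn, #0":
   all ones when the element is less than zero, all zeros otherwise. */
uint32_t lane_clt_zero(float x) {
    return (x < 0.0f) ? UINT32_MAX : 0u;
}

/* Hypothetical model of a bitwise select in the style of vbsl:
   take each bit from a where the mask bit is set, otherwise from b. */
uint32_t lane_bitwise_select(uint32_t mask, uint32_t a, uint32_t b) {
    return (mask & a) | (~mask & b);
}
```

An all-ones mask picks every bit of `a` and an all-zeros mask picks every bit of `b`, which is why the compare result is useful as-is for vector code even though it is not a 0/1 Boolean.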

    If you have horrible memories of BASIC and really hate bitwise truth values, then the trivial way to convert the mask to a Boolean is to just take the top bit of each element:

    vshr.u32 q9, q9, #31
    

    Although negation, whilst arguably less clear to read at a glance, could be microscopically better performance-wise in some cases:

    vneg.s32 q9, q9
    

    (From a browse through the microarchitectural timings, the two operations are pretty much identical, but some theoretical advantages of vneg over vshr are that it consumes its inputs later on Cortex-A8, and that it can issue down both ASIMD pipes on Cortex-A57/A72.)

    Either way, as said at the top, this only really makes sense for storing the result back to memory to be looked at by non-vectorised code.
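As a rough illustration of the two conversions, the following scalar C sketch models what the shift and the negation do to one 32-bit lane, and how non-vectorised code might then consume the stored 0/1 values. The function names are hypothetical, and this is a model of the lane arithmetic rather than actual NEON code.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of one lane of "vshr.u32 qd, qm, #31":
   a logical right shift leaves only the top bit, so 0xFFFFFFFF -> 1 and 0 -> 0. */
uint32_t bool_from_mask_shift(uint32_t mask) {
    return mask >> 31;
}

/* Model of one lane of "vneg.s32 qd, qm":
   two's-complement negation turns -1 (0xFFFFFFFF) into 1 and leaves 0 as 0.
   Done in unsigned arithmetic here so the wrap-around is well defined in C. */
uint32_t bool_from_mask_negate(uint32_t mask) {
    return (uint32_t)(0u - mask);
}

/* Scalar consumer: once the lanes hold 0 or 1, they can simply be summed. */
size_t count_true(const uint32_t *flags, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += flags[i];
    }
    return total;
}
```

Both conversions agree only because a compare mask is always all ones or all zeros; for arbitrary input values the shift and the negation would give different results.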
