What is the most efficient way to clear a single or a few ZMM registers on Knights Landing?

旧时难觅i 2021-01-12 07:07

Say I want to clear 4 zmm registers.

Is the following code the fastest way to do it?

vpxorq  zmm0, zmm0, zmm0
vpxorq  zmm1, zmm1, zmm1
vpxorq  zm

3 Answers
  •  无人共我
    2021-01-12 07:44

    I put together a simple C test program using intrinsics and compiled it with ICC 17. The generated code I get for zeroing 4 zmm registers (at -O3) is:

        vpxord    %zmm3, %zmm3, %zmm3                           #7.21
        vmovaps   %zmm3, %zmm2                                  #8.21
        vmovaps   %zmm3, %zmm1                                  #9.21
        vmovaps   %zmm3, %zmm0                                  #10.21
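
The test program itself is not shown in the answer, so the following is only a rough sketch of what such an intrinsics-based test might look like. The function name `zero_four_blocks` and the store-to-buffer layout are assumptions made for illustration. The AVX-512 path is compiled only when the compiler defines `__AVX512F__` (for example with `-mavx512f` or `-xMIC-AVX512`); otherwise a plain `memset` fallback is used so the sketch still builds on other machines. The instructions a compiler emits for it may well differ from the listing above, depending on the compiler, its version, and how the zeroed values are used afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the original answer): zeroes four
   16-float blocks, i.e. four ZMM registers' worth of data. */
void zero_four_blocks(float out[4][16])
{
    assert(out != NULL);
#if defined(__AVX512F__)
    /* Request four zeroed 512-bit values. The compiler decides how to
       materialize them (one xor per register, or one xor plus copies). */
    __m512 z0 = _mm512_setzero_ps();
    __m512 z1 = _mm512_setzero_ps();
    __m512 z2 = _mm512_setzero_ps();
    __m512 z3 = _mm512_setzero_ps();

    /* Unaligned stores, so the caller's buffer needs no special alignment. */
    _mm512_storeu_ps(out[0], z0);
    _mm512_storeu_ps(out[1], z1);
    _mm512_storeu_ps(out[2], z2);
    _mm512_storeu_ps(out[3], z3);
#else
    /* Portable fallback when AVX-512F is not enabled at compile time. */
    memset(out, 0, 4 * 16 * sizeof(float));
#endif
}
```

To compare against the listing above, one could compile this with `-O3 -S` and inspect the assembly. Because the zeros here are only stored to memory, a compiler is free to reuse a single zeroed register for all four stores.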
    
