Question
I currently run BOINC across a number of servers that have GPUs.
The servers run both GPU and CPU BOINC apps.
Because AVX and SSE lower the CPU frequency when a CPU app uses them, I have to be selective about which CPU and GPU apps I run together: some GPU apps get bottlenecked (they take longer to complete), whereas others do not.
At present, some CPU apps are named so that it is clear whether they use AVX, but most are not.
Is there a command I can run, or some way of viewing, whether any of the currently running CPU apps are using AVX or SSE (any version)?
As a side note, should I treat FMA usage the same way (e.g. does it lower the CPU frequency because of increased CPU temperatures)?
Thanks
Answer 1:
You can use perf top to see, in real time, the number of AVX and SSE instructions executed, along with the executable and shared library names:
perf top -e fp_arith_inst_retired.128b_packed_single -e fp_arith_inst_retired.128b_packed_double -e fp_arith_inst_retired.256b_packed_single -e fp_arith_inst_retired.256b_packed_double
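To check one specific BOINC process rather than the whole system, the same counters can probably be read with perf stat attached to a PID. The following is only a sketch under assumptions: the function names simd_events and count_simd are made up for illustration, the event names are the ones from the command above and may not exist on other CPU models, and perf may need root or a relaxed perf_event_paranoid setting. Non-zero 256-bit counts suggest AVX, since SSE is limited to 128-bit registers; 128-bit counts can come from either SSE or AVX.

```shell
#!/bin/sh
# Hypothetical helpers; the event names are copied from the perf top command above.

# Print the packed (SIMD) floating-point events as a comma-separated list.
simd_events() {
  printf '%s' 'fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.256b_packed_single,fp_arith_inst_retired.256b_packed_double'
}

# Count those events for one running process over a fixed interval.
# $1 = PID of the CPU app, $2 = seconds to sample (defaults to 10).
count_simd() {
  pid=$1
  secs=${2:-10}
  # "sleep" only sets the length of the measurement window;
  # the counters are attached to the process given with -p.
  perf stat -e "$(simd_events)" -p "$pid" sleep "$secs"
}
```

For example, count_simd 12345 30 would sample the process with PID 12345 (a placeholder) for 30 seconds and print one total per event.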
Counter descriptions (from perf list output on an Intel Coffee Lake CPU):
floating point:
fp_arith_inst_retired.128b_packed_double
[Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired. Each count represents 2 computations. Applies to SSE* and AVX*
packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
multiple calculations per element]
fp_arith_inst_retired.128b_packed_single
[Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
perform multiple calculations per element]
fp_arith_inst_retired.256b_packed_double
[Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
multiple calculations per element]
fp_arith_inst_retired.256b_packed_single
[Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired. Each count represents 8 computations. Applies to SSE* and AVX*
packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
perform multiple calculations per element]
fp_arith_inst_retired.scalar_double
[Number of SSE/AVX computational scalar double precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar double
precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element]
fp_arith_inst_retired.scalar_single
[Number of SSE/AVX computational scalar single precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar single
precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations
per element]
fp_assist.any
[Cycles with any input/output SSE or FP assist]
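The "each count represents N computations" factors in the descriptions above can be used to turn raw counts into an approximate number of floating-point computations. The sketch below assumes the CSV layout that perf stat -x, produces (count in the first field, event name in the third); the function name computations_from_perf_csv is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "perf stat -x," style CSV on stdin and prints the
# weighted total, using the per-count factors from the descriptions above.
computations_from_perf_csv() {
  awk -F, '
    $3 ~ /128b_packed_double/      { total += $1 * 2 }  # 2 computations per count
    $3 ~ /128b_packed_single/      { total += $1 * 4 }  # 4 computations per count
    $3 ~ /256b_packed_double/      { total += $1 * 4 }  # 4 computations per count
    $3 ~ /256b_packed_single/      { total += $1 * 8 }  # 8 computations per count
    $3 ~ /scalar_(double|single)/  { total += $1 }      # 1 computation per count
    END { printf "%.0f\n", total }
  '
}
```

Since perf stat writes its results to stderr, one way to feed it in would be something like perf stat -x, -e EVENTS -p PID sleep 10 2>&1 | computations_from_perf_csv, with EVENTS and PID filled in for the process of interest.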
Source: https://stackoverflow.com/questions/60329437/ubuntu-how-to-tell-if-avx-or-sse-is-current-being-used-by-cpu-app