Question
This is really taking up my time. I could not find a simple way to estimate the FLOPs for the following code (the loop). How many FLOPs are there in a single iteration of the loop?
float func(float * atominfo, float energygridItem, int xindex, int yindex)
{
    ...
    for (atomid=0; atomid<numatoms*4; atomid+=4)
    {
        float dy = coory - atominfo[atomid+2];
        float dysqpdzsq = (dy * dy) + atominfo[atomid+3];
        float dx1 = coorx1 - atominfo[atomid+1];
        float s, y, t;
        s = atominfo[atomid] * (1.0f / sqrtf(dx1*dx1 + dysqpdzsq));
        y = s - energycomp1;
        t = energyvalx1 + y;
        energycomp1 = (t - energyvalx1) - y;
        energyvalx1 = t;
    }
    ...
}
It looks simple, but I got confused by some other numbers given earlier, so it would be great if someone could give an exact number.
Thanks.
Answer 1:
I see (in order of increasing complexity):
- 8 additions (inc. subtractions)
- 3 multiplications
- 1 reciprocal-square-root
How the costs of these operations compare to each other depends strongly on the CPU family.
Answer 2:
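Try either looking at the intermediate assembly code or disassembling the exe.
Then count all floating-point instructions (in x87 assembly code they start with an F prefix, like FSIN; if the compiler generated SSE or AVX code, look for instructions such as ADDSS and MULSS instead).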
Answer 3:
I count 12 plus a sqrt (which is likely using Newton's method, which is a loop), but that depends on the data types of some variables that you did not specify, and the result of compilation (which may add more, or optimize out some operations).
I am counting each +, /, -, or * where the expression contains at least one floating-point variable, so the array index arithmetic and the loop increment do not count, since those are integer operations.
Answer 4:
Try using a performance measurement library like PAPI. It provides abstractions over the hardware counters, which would be your best bet for measuring the FLOPS. See PAPI_FLOPS.
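If PAPI is not available, a much rougher cross-check is to combine a hand-counted number of operations per iteration with a wall-clock or CPU-time measurement. This is not PAPI and does not read hardware counters; it is a minimal sketch using only the standard C clock() function, and estimate_flops_rate and time_seconds are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: turns a hand-counted number of floating-point
   operations per iteration into an achieved rate in FLOP/s.
   Returns 0.0 when the elapsed time is not positive. */
double estimate_flops_rate(double flops_per_iter, long iterations,
                           double seconds)
{
    if (seconds <= 0.0) {
        return 0.0;
    }
    return flops_per_iter * (double)iterations / seconds;
}

/* Hypothetical helper: measures the CPU time spent in a caller-supplied
   function using clock(). The resolution is coarse, so the work should
   run long enough (many iterations) for the result to be meaningful. */
double time_seconds(void (*work)(void *), void *arg)
{
    assert(work != NULL);
    clock_t start = clock();
    work(arg);
    clock_t end = clock();
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

For example, if the loop is counted as 13 operations per iteration and one million iterations take 0.5 seconds, estimate_flops_rate(13.0, 1000000, 0.5) gives 2.6e7 FLOP/s. The result is only as good as the hand count, since the compiler may add or remove operations.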
Source: https://stackoverflow.com/questions/5330717/counting-flops-for-a-code