Speed of cos() and sin() functions in GLSL shaders?

别那么骄傲 2021-01-17 10:51

I'm interested in information about the speed of sin() and cos() in the OpenGL Shading Language (GLSL).

The GLSL Specification Document indi

6 Answers
  • 2021-01-17 11:31

    This is a good question; I have wondered about it too.

    The links I found by searching say cos and sin have been single-cycle instructions on mainstream cards since 2005 or so.

  • 2021-01-17 11:33

    You'd have to test this out yourself, but I'm pretty sure that branching in a shader is far more expensive than a sin or cos calculation. GLSL compilers are pretty good at optimizing shaders, so worrying about this is premature optimization. If you later find that, across your entire program, your shaders are the bottleneck, then you can worry about optimizing this.

    If you want to take a look at the assembly code of your shader for a specific platform, I would recommend AMD GPU ShaderAnalyzer.

  • 2021-01-17 11:33

    Not sure if this answers your question, but it's very difficult to tell you how many clocks/slots an instruction takes, as it depends very much on the GPU. Usually it's a single cycle. But even if not, the compiler may rearrange the order of instruction execution to hide the true cost. It's certainly slower to use texture lookups for sin/cos than it is to execute the instructions.

  • 2021-01-17 11:33

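    Conditions can be quite expensive, because the GPU may end up evaluating both branches. If you use both sin and cos of the same angle in your shader, you can calculate only sin(a) and derive cos(a) = sqrt(1.0 - sin(a)*sin(a)), since sin(x)*sin(x) + cos(x)*cos(x) is always 1.0. Note that the square root only gives the magnitude, so this holds as written only where cos(a) is non-negative.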
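    The identity can be checked numerically. The following is a minimal CPU-side sketch in Python, not GLSL; the helper name cos_from_sin is made up for illustration. It shows that sqrt(1 - sin²) matches cos where cos is non-negative and loses the sign elsewhere.

```python
import math

def cos_from_sin(s):
    """Return |cos(a)| given s = sin(a), using sin^2 + cos^2 = 1."""
    # Clamp at 0.0 in case s * s rounds to slightly more than 1.0.
    return math.sqrt(max(0.0, 1.0 - s * s))

for a in (0.5, 2.5):
    s = math.sin(a)
    # For a = 0.5 the two values agree; for a = 2.5 cos(a) is negative,
    # so the recovered value has the right magnitude but the wrong sign.
    print(a, math.cos(a), cos_from_sin(s))
```

    Whether this is actually cheaper than calling cos() depends on the GPU, since it trades one instruction for a multiply, a subtract and a square root.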

  • 2021-01-17 11:45

    See how many sin() calls you can chain in a row in one shader, compared with abs(), fract(), etc. I think a GTX 470 can handle 200 sin calls per fragment with no problems; the frame will be about 10 percent slower than with an empty shader. That is fairly fast. You can send in your results; they will be a good indicator of computational efficiency.
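    One way to set up such a test is to generate the shader source rather than type it by hand. The sketch below is only an assumption about how the test could be built: the function name make_sin_shader, the "#version 330 core" target and the output variable fragColor are illustrative choices, and compiling and timing the shader is left to your own application.

```python
def make_sin_shader(n):
    """Build fragment shader source with n dependent sin() calls."""
    lines = [
        "#version 330 core",
        "out vec4 fragColor;",
        "void main() {",
        # Start from a per-fragment value so the result is not a constant.
        "    float v = gl_FragCoord.x * 0.01;",
    ]
    # Each call feeds the next one, so the chain cannot simply be dropped.
    lines += ["    v = sin(v);"] * n
    lines += ["    fragColor = vec4(v);", "}"]
    return "\n".join(lines) + "\n"

source = make_sin_shader(200)
print(source.count("sin(v)"), "sin calls generated")
```

    Swapping sin for abs or fract in the generated line gives the comparison shaders, and make_sin_shader(0) gives the near-empty baseline.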

  • 2021-01-17 11:46

    For example, in my application it'll be very common for the argument to be 0. So does something like this make sense:

    No.

    Your compiler will do one of two things.

    1. It will issue an actual conditional branch. In the best possible case, if 0 is a value that is coherent locally (such that groups of shaders will often hit 0 or non-zero together), then you might get improved performance.
    2. It will evaluate both sides of the condition, and only store the result for the correct one of them. In which case, you've gained nothing.
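    The second case can be modelled on the CPU. This is a rough Python analogy, not what any particular GLSL compiler emits; select and counted_sin are hypothetical helpers. Both operands are computed before the selection happens, so sin still runs even when the argument is 0.

```python
import math

calls = 0

def counted_sin(x):
    """math.sin wrapper that counts how often it is invoked."""
    global calls
    calls += 1
    return math.sin(x)

def select(cond, if_true, if_false):
    # Both arguments were already evaluated by the caller;
    # this only picks which result to keep.
    return if_true if cond else if_false

def shade(x):
    return select(x == 0.0, 0.0, counted_sin(x))

results = [shade(x) for x in (0.0, 0.0, 1.0)]
print(calls)  # sin ran for every input, including the zeros
```
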

    In general, it's not a good idea to use conditional logic to dance around small performance costs like this. The saving needs to be really big to be worthwhile, like a discard or something.

    Also, do note that testing a floating-point value for exact equality is not likely to work, unless you actually pass a uniform or vertex attribute containing exactly 0.0 to the shader. Even interpolating between 0 and non-zero will likely never produce exactly 0 for any fragment.
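    The interpolation point can be illustrated with a small Python sketch. It assumes a hypothetical 8-pixel span sampled at pixel centers and an arbitrary tolerance of 0.1; neither value comes from the question.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t

width = 8  # hypothetical number of fragments across the span
# Sample at pixel centers, roughly as a rasterizer would.
samples = [lerp(0.0, 1.0, (i + 0.5) / width) for i in range(width)]

exact_zero = [x for x in samples if x == 0.0]      # exact comparison
near_zero = [x for x in samples if abs(x) < 0.1]   # tolerance comparison

print(exact_zero, near_zero)
```

    No sample equals 0.0 exactly, even though the interpolated attribute starts at 0, while the tolerance test still catches the fragment closest to that end.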
