Question
I have a pixel shader:
varying vec2 f_texcoord;
uniform vec4 mycolor_mult;
uniform sampler2D mytexture;
void main(void) {
gl_FragColor = (texture2D(mytexture, f_texcoord) * mycolor_mult);
}
and corresponding C++ code:
GLint m_attr = glGetUniformLocation(m_program, "mycolor_mult");
// ...
unsigned int myColor = ...; // 0xAARRGGBB format
float a = (myColor >> 24) / 255.f;
float r = ((myColor >> 16) & 0xFF) / 255.f;
float g = ((myColor >> 8) & 0xFF) / 255.f;
float b = (myColor & 0xFF) / 255.f;
glUniform4f(m_attr, r, g, b, a);
I store the sprite's color as an unsigned int and have to convert it to four floats to pass it to the shader.
Can this be optimized? That is, can I pass the components to the shader as unsigned chars rather than floats and avoid the "divide by 255" operations? What would I need to change in the shader and in the C++ code to do that?
Answer 1:
With modern OpenGL (GLSL >= 4.1), there is the unpackUnorm4x8 GLSL function, which does exactly what you want: it takes a single 32-bit uint and creates a normalized floating-point vector out of it. You just have to swizzle the result to match your byte order, because the function interprets the least significant byte as the first channel.
uniform uint mycolor_packed;
//...
vec4 mycolor_mult=unpackUnorm4x8(mycolor_packed).bgra;
This is potentially the most efficient way to do the conversion in the shader itself. However, it is doubtful whether doing this once per fragment on the GPU is more efficient than doing it only once per draw call on the CPU.
Answer 2:
There are a few aspects to this question.
Is it worth optimizing?
I agree with @Nick's comment. There's a high likelihood that you're trying to optimize something that is not performance critical at all. For example, if this code is only executed once per frame, the execution time of this code is absolutely insignificant. If this is executed many times per frame, things could look a bit different. Using a profiler can tell you how much time is spent in this code.
Are you optimizing the right thing?
Make sure that the glGetUniformLocation() call is only done once after linking the shader, not each time you set the uniform. Otherwise, that call will most likely be much more expensive than the rest of the code. It's not entirely clear from the code if you're already doing that.
Can you use more efficient OpenGL calls?
Not really, if you need the values as floats in the shader. There are no automatic format conversions for uniforms, so you cannot simply use a different call from the glUniform*() family. From the spec:
For all other uniform types the Uniform* command used must match the size and type of the uniform, as declared in the shader. No type conversions are done.
Can the code be optimized?
If you really want to do micro-optimizations, you can replace the divisions by multiplications. Divisions are much more expensive than multiplications on most CPUs. The code then looks like this:
const float COLOR_SCALE = 1.0f / 255.f;
float a = (myColor >> 24) * COLOR_SCALE;
float r = ((myColor >> 16) & 0xFF) * COLOR_SCALE;
float g = ((myColor >> 8) & 0xFF) * COLOR_SCALE;
float b = (myColor & 0xFF) * COLOR_SCALE;
You can't count on the compiler to perform this transformation for you, since changing the operation can affect the precision/rounding of the result. Some compilers have flags to enable these kinds of optimizations. See for example Optimizing a floating point division and conversion operation.
Source: https://stackoverflow.com/questions/34515950/how-to-avoid-int-float-conversion-when-passing-data-to-pixel-shader