Question
I send a VertexBuffer+IndexBuffer of GL_TRIANGLES to the GPU via glDrawElements().
In the vertex shader I wanted to snap some vertices to the same coordinates to simplify a large mesh on the fly. As a result I expected a major performance boost, because a lot of triangles would collapse to the same point and become degenerate. But I don't get any fps gain.
For testing, I set my vertex shader to just gl_Position = vec4(0); to degenerate ALL triangles, but there is still no difference...
Is there a flag to "activate" the degeneration, or what am I missing?
A query of GL_PRIMITIVES_GENERATED also always reports the total number of mesh faces.
Answer 1:
What you're missing is how the optimization you're trying to use actually works.
The particular optimization you're talking about is post-T&L caching (the post-transform cache). That is, if the same vertex is going to get processed twice, you only process it once and use the results twice.
What you don't understand is how "the same vertex" is actually determined. It isn't determined by anything your vertex shader could compute. Why? Well, the whole point of caching is to avoid running the vertex shader. If the vertex shader was used to determine if the value was already cached... you've saved nothing since you had to recompute it to determine that.
"The same vertex" is actually determined by matching the vertex index and vertex instance. Each vertex in the vertex array has a unique index associated with it. If you use the same index twice (only possible with indexed rendering of course), then the vertex shader would receive the same input data. And therefore, it would produce the same output data. So you can use the cached output data.
Instance ids also play into this, since when doing instanced rendering, the same vertex index does not necessarily mean the same inputs to the VS. But even then, if you get the same vertex index and the same instance id, then you would get the same VS inputs, and therefore the same VS outputs. So within an instance, the same vertex index represents the same value.
Both the instance id and the vertex indices come from the rendering command itself. They don't come from anything the vertex shader can compute. The vertex shader could generate the same positions, normals, or anything else, but the actual post-transform cache is based on the vertex index and instance.
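To make the point about the cache key concrete, here is a minimal CPU-side sketch. It is not how any particular GPU implements its cache: the FIFO policy, the cache size parameter, and the function name countVertexShaderRuns are all assumptions made up for illustration. The only thing it is meant to show is that a hit or miss is decided by the (instance id, vertex index) pair, never by what the vertex shader would output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical model of a post-transform cache: a small FIFO keyed by
// (instance id, vertex index). Real hardware differs in size and policy.
// Returns how many times the vertex shader would have to run.
std::size_t countVertexShaderRuns(const std::vector<std::uint32_t>& indices,
                                  std::uint32_t instanceCount,
                                  std::size_t cacheSize) {
    assert(cacheSize > 0);
    using Key = std::pair<std::uint32_t, std::uint32_t>;
    std::deque<Key> cache;
    std::size_t runs = 0;

    for (std::uint32_t instance = 0; instance < instanceCount; ++instance) {
        for (std::uint32_t index : indices) {
            const Key key{instance, index};
            if (std::find(cache.begin(), cache.end(), key) != cache.end()) {
                continue;  // hit: reuse the stored outputs, no shader run
            }
            ++runs;  // miss: the shader runs, regardless of what it outputs
            cache.push_back(key);
            if (cache.size() > cacheSize) {
                cache.pop_front();  // evict the oldest entry
            }
        }
    }
    return runs;
}
```

In this model, a quad drawn with indices 0,1,2, 2,1,3 costs four shader runs, while six unshared indices 0..5 cost six runs even if the shader writes vec4(0) for every one of them. Nothing in the loop ever looks at a position.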
So if you want to "snap some vertices to the same coordinates to simplify a large mesh", you have to do that before your rendering command. If you want to do it "on the fly" in a shader, then you're going to need some kind of compute shader or geometry shader/transform feedback process that will compute the new mesh. Then you need to render this new mesh.
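As a rough illustration of doing the simplification before the draw call, the sketch below snaps vertices to a uniform grid on the CPU, welds vertices that land in the same cell, and drops triangles whose indices collapse. The Vec3 and Mesh types, the snapAndWeld name, and the grid-snapping rule are assumptions for this example, not something from the question; a real mesh would also carry normals, UVs, and so on.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vec3 { float x, y, z; };

struct Mesh {
    std::vector<Vec3> vertices;
    std::vector<std::uint32_t> indices;  // three per triangle (GL_TRIANGLES)
};

// Hypothetical helper: snap every vertex to a grid of the given cell size,
// merge vertices sharing a cell, and remove triangles that became degenerate.
Mesh snapAndWeld(const Mesh& in, float cellSize) {
    assert(cellSize > 0.0f);
    assert(in.indices.size() % 3 == 0);

    using Cell = std::tuple<long, long, long>;
    std::map<Cell, std::uint32_t> cellToVertex;  // grid cell -> new index
    std::vector<std::uint32_t> remap(in.vertices.size());
    Mesh out;

    for (std::size_t i = 0; i < in.vertices.size(); ++i) {
        const Vec3& v = in.vertices[i];
        const Cell cell{std::lround(v.x / cellSize),
                        std::lround(v.y / cellSize),
                        std::lround(v.z / cellSize)};
        auto it = cellToVertex.find(cell);
        if (it == cellToVertex.end()) {
            const auto newIndex = static_cast<std::uint32_t>(out.vertices.size());
            it = cellToVertex.emplace(cell, newIndex).first;
            const Vec3 snapped{std::get<0>(cell) * cellSize,
                               std::get<1>(cell) * cellSize,
                               std::get<2>(cell) * cellSize};
            out.vertices.push_back(snapped);
        }
        remap[i] = it->second;  // old index -> welded index
    }

    for (std::size_t t = 0; t < in.indices.size(); t += 3) {
        const std::uint32_t a = remap[in.indices[t]];
        const std::uint32_t b = remap[in.indices[t + 1]];
        const std::uint32_t c = remap[in.indices[t + 2]];
        if (a == b || b == c || a == c) {
            continue;  // two corners share a vertex: triangle has zero area
        }
        out.indices.push_back(a);
        out.indices.push_back(b);
        out.indices.push_back(c);
    }
    return out;
}
```

The resulting vertices and indices would then be uploaded to the buffers used by glDrawElements(), so the GPU sees fewer indices and fewer unique vertices. Note that this only removes triangles that are degenerate by index; triangles with three distinct but collinear vertices are left in place.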
You can discard a primitive in a geometry shader. But you still had to do T&L on it. Plus, using a GS at all slows things down, so I highly doubt you'll gain much performance by doing this.
Source: https://stackoverflow.com/questions/34422774/opengl-degenerate-gl-triangles-sharing-same-vertices