What is the most efficient way of moving multiple objects (stored in a VBO) in space? Should I use glTranslatef or a shader?

Posted by 夏天 on 2019-11-30 14:05:29

Here are my thoughts on the revised question:

1) should I use glMapBuffer to bind the buffer on the GPU and fill the data directly (instead of using glBufferSubData)? Or will this make no difference performance wise?

I'm not aware of any significant performance difference between the two, though I would probably prefer glBufferSubData.

What I might suggest in your case is to create a VBO with N floats, and then use it like a circular buffer. Keep a local index of where the 'end' of the buffer is; on every update, replace the value at 'end' with the new value and increment the index, wrapping back to 0 after the last element. This way you only have to upload a single float each cycle.
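A minimal sketch of that bookkeeping in C, assuming a CPU-side mirror of the samples and one float per sample as suggested above. `SampleRing`, `ring_push` and `RING_CAPACITY` are hypothetical names chosen for illustration, and the OpenGL upload appears only as a comment because it needs a live context and a bound VBO.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 10  /* N: number of samples kept in the VBO (assumed) */

typedef struct {
    float  samples[RING_CAPACITY]; /* CPU-side mirror of the VBO contents */
    size_t end;                    /* slot that the next sample overwrites */
} SampleRing;

/* Stores one new sample and returns the byte offset of the slot that changed,
 * i.e. the offset to pass to glBufferSubData for a one-float upload. */
size_t ring_push(SampleRing *ring, float value)
{
    assert(ring != NULL);
    size_t slot = ring->end;
    ring->samples[slot] = value;
    ring->end = (slot + 1) % RING_CAPACITY;  /* wrap around after the last slot */
    /* With the VBO bound to GL_ARRAY_BUFFER, the upload would be:
     * glBufferSubData(GL_ARRAY_BUFFER, slot * sizeof(float),
     *                 sizeof(float), &ring->samples[slot]); */
    return slot * sizeof(float);
}
```

Each update touches exactly one slot, so the amount of data sent to the GPU per cycle stays constant no matter how long the strip is.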

Having done that, you can draw this buffer using 2x translates and 2x glDrawArrays/Elements:

Imagine that you've got an array of 10 elements, and the buffer end pointer is at element 4. Your array will contain the following 10 values, where x is a constant value, and f(n-d) is the random sample from d cycles ago:

0: (0, f(n-4) )
1: (1, f(n-3) )
2: (2, f(n-2) )
3: (3, f(n-1) )  
4: (4, f(n)   )  <-- end of buffer 
5: (5, f(n-9) )  <-- start of buffer
6: (6, f(n-8) )
7: (7, f(n-7) )
8: (8, f(n-6) )
9: (9, f(n-5) )

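To draw this (pseudo-guess code, might not be exactly correct):

const int N = 10;           // number of samples in the VBO
const int start = end + 1;  // index of the oldest sample (5 in the example)
glPushMatrix();
glTranslatef(-(float)start, 0.0f, 0.0f);
glDrawArrays(GL_LINE_STRIP, start, N - start); // draw elems 5-9 shifted left by 5
glPopMatrix();
glPushMatrix();
glTranslatef((float)(N - start), 0.0f, 0.0f);
glDrawArrays(GL_LINE_STRIP, 0, start);         // draw elems 0-4 shifted right by 5
glPopMatrix();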

Then in the next cycle, replace the oldest value with the new random value, and shift the circular buffer pointer forward.
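The two draw calls above can be derived from the index of the newest sample. The following C sketch only computes the vertex ranges and x offsets. `DrawRange` and `ring_draw_ranges` are hypothetical names, and it assumes each vertex stores its own slot index as its x coordinate, as in the table above.

```c
#include <assert.h>

typedef struct {
    int   first;     /* first vertex index, as passed to glDrawArrays */
    int   count;     /* number of vertices to draw */
    float x_offset;  /* translation along x that puts the range in place */
} DrawRange;

/* n: total number of samples; newest: index of the most recent sample.
 * out[0] covers the oldest samples (drawn on the left),
 * out[1] covers the newest samples (drawn on the right). */
void ring_draw_ranges(int n, int newest, DrawRange out[2])
{
    assert(n > 0 && newest >= 0 && newest < n);
    int start = newest + 1;            /* index of the oldest sample */

    out[0].first    = start;
    out[0].count    = n - start;       /* zero when newest is the last slot */
    out[0].x_offset = -(float)start;   /* oldest sample lands at x = 0 */

    out[1].first    = 0;
    out[1].count    = start;
    out[1].x_offset = (float)(n - start);
}
```

Each range would then be drawn with a glTranslatef by `x_offset` followed by glDrawArrays(GL_LINE_STRIP, first, count). Because the two ranges are separate strips, the segment joining the last slot to slot 0 is not drawn; closing that gap would need an extra duplicated vertex or a third small draw.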

2) should I use a shader for moving objects (here line strip) instead of calling glTranslatef? If so, what would such a shader look like? (I suspect that a shader is the wrong way to go, since my line strip is NOT a periodic function but rather contains random data).

Probably optional, if you use the method that I've described in #1. There's not a particular advantage to using one here.

3) what happens if the window gets resized? how do I keep aspect ratio and scale vertices accordingly? glViewport() only helps scaling in y direction, not in x direction. If the window is rescaled in x-direction, then in my current implementation I would have to recalculate the position of the entire line strip (calling my_func to get the new x coordinates) and upload it to the GPU. I guess this could be done more elegantly? How would I do that?

You shouldn't have to recalculate any data. Just define all your data in some fixed coordinate system that makes sense to you, and then use a projection matrix to map this range to the window. Without more specifics it's hard to answer.
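As a rough illustration of that mapping, the C sketch below builds the same matrix that an orthographic projection with near = -1 and far = 1 would produce, so that a fixed data range always fills the viewport. `ortho2d` and `ortho2d_apply` are hypothetical helper names, not OpenGL functions.

```c
#include <assert.h>

/* Fills a column-major 4x4 matrix (the layout glLoadMatrixf expects) that maps
 * x in [left, right] and y in [bottom, top] to the [-1, 1] clip range. */
void ortho2d(float left, float right, float bottom, float top, float m[16])
{
    assert(right != left && top != bottom);
    for (int i = 0; i < 16; ++i) m[i] = 0.0f;
    m[0]  = 2.0f / (right - left);
    m[5]  = 2.0f / (top - bottom);
    m[10] = -1.0f;                               /* z mapping for near=-1, far=1 */
    m[12] = -(right + left) / (right - left);
    m[13] = -(top + bottom) / (top - bottom);
    m[15] = 1.0f;
}

/* Applies the matrix to a 2D point (z = 0, w = 1); handy for checking the mapping. */
void ortho2d_apply(const float m[16], float x, float y, float *out_x, float *out_y)
{
    *out_x = m[0] * x + m[4] * y + m[12];
    *out_y = m[1] * x + m[5] * y + m[13];
}
```

With the fixed-function pipeline the result could be loaded via glMatrixMode(GL_PROJECTION) and glLoadMatrixf(m). A resize then only changes the glViewport call, while the vertex data in the VBO stays untouched.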

4) I noticed that when I use glTranslatef with a non integral value, the screen starts to flicker if the line strip consists of thousands of points. This is most probably because the fine resolution that I use to calculate the line strip does not match the pixel resolution of the screen and therefore sometimes some points appear in front and sometimes behind other points (this is particularly annoying when you don't render a sine wave but some 'random' data). How can I prevent this from happening (besides the obvious solution of translating by an integer multiple of 1 pixel)? If a window gets resized from let's say originally 800x800 pixels to 100x100 pixels and I still want to visualize a line strip of 20 seconds, then shifting in x direction must work flicker free somehow with sub pixel precision, right?

Your assumption seems correct. I think the thing to do here would be either to enable some kind of antialiasing (you can read other posts for how to do that), or to make the lines wider.

There are a number of things that could be at work here.

  • glBindBuffer is one of the slower OpenGL operations (along with the similar binding calls for shaders, textures, etc.).
  • glTranslate adjusts the modelview matrix, which the vertex unit multiplies every point by, so it only changes which matrix is used. A vertex shader that adds an offset does a small amount of extra work per vertex on top of that multiplication, so glTranslate is at best marginally cheaper. In practice, this shouldn't matter much.
  • If you're recalculating the sine function on a lot of points every time you draw, you're going to have performance issues (especially since, by looking at your source, it looks like you might be using Python).
  • You're updating your VBO every time you draw it, so it's not any faster than a vertex array. Vertex arrays are faster than immediate mode (glVertex, etc.) but nowhere near as fast as display lists or static VBOs.
  • There could be coding errors or redundant calls somewhere.

My verdict:

You're calculating a sine wave and an offset on the CPU. I strongly suspect that most of your overhead comes from calculating and uploading different data every time you draw it. This is coupled with unnecessary OpenGL calls and possibly unnecessary local calls.

My recommendation:

This is an opportunity for the GPU to shine. Calculating function values on parallel data is (literally) what the GPU does best.

I suggest you make a display list representing your function, but set all the y-coordinates to 0 (so it's a series of points along the line y=0). Then draw this same display list once for every sine wave you want to draw. On its own this would just produce a flat graph, so you write a vertex shader that displaces the points vertically into your sine wave. The shader takes a uniform for the sine wave's offset ("sin(x-offset)") and just changes each vertex's y.

I estimate this will make your code at least ten times faster. Furthermore, because the vertices' x coordinates are all at integral points (the shader does the "translation" in the function's space by computing "sin(x-offset)"), you won't experience jittering when offsetting with floating point values.
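One way such a shader might look, kept as a C string so it can be handed to glShaderSource. This is only a sketch: it assumes GLSL 1.20 with the legacy built-ins gl_Vertex and gl_ModelViewProjectionMatrix (which fit the display-list approach described above), and the uniform name `offset` and the helper `advance_offset` are hypothetical.

```c
#include <assert.h>

/* GLSL 1.20 vertex shader source (legacy/compatibility pipeline assumed).
 * Input vertices lie on y = 0; the shader lifts each one onto sin(x - offset). */
static const char *SINE_VERTEX_SHADER =
    "#version 120\n"
    "uniform float offset;  // phase shift, set once per draw\n"
    "void main() {\n"
    "    vec4 p = gl_Vertex;\n"
    "    p.y = sin(p.x - offset);\n"
    "    gl_Position = gl_ModelViewProjectionMatrix * p;\n"
    "}\n";

/* Advances the phase on the CPU; the result would be uploaded with
 * glUniform1f before each draw. speed is in x units per second. */
float advance_offset(float offset, float speed, float dt_seconds)
{
    assert(dt_seconds >= 0.0f);
    return offset + speed * dt_seconds;
}
```

Per frame, the only data sent to the GPU is the single float returned by `advance_offset`; the vertex data itself never changes.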

You've got a lot here, so I'll cover what I can. Hopefully this will give you some areas to research.

1) should I use glMapBuffer to bind the buffer on the GPU and fill the data directly (instead of using glBufferSubData)? Or will this make no difference performance wise?

I would expect glBufferSubData to have better performance. If the data is stored on the GPU then mapping it will either

  • Copy the data back into host memory so you can modify it, and then copy it back when you unmap it.
  • or, give you a pointer to the GPU's memory directly which the CPU will access over PCI-Express. This isn't anywhere near as slow as it used to be to access GPU memory when we were on AGP or PCI, but it's still slower and not as well cached, etc, as host memory.

glBufferSubData sends the updated data to the GPU, which modifies the buffer in place. There is no copying back and forth, and all the data is transferred in one burst. The driver should also be able to perform it as an asynchronous update of the buffer.

Once you get into "is this faster than that?" type comparisons you need to start measuring how long things take. A simple frame timer is normally sufficient (but report time per frame, not frames per second - it makes numbers easier to compare). If you go finer-grained than that, just be aware that because of the asynchronous nature of OpenGL, you often see time being consumed away from the call that caused the work. This is because after you give the GPU a load of work, it's only when you have to wait for it to finish something that you notice how long it's taking. That normally only happens when you're waiting for front/back buffers to swap.
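A simple frame timer along those lines could be sketched in C as follows. The names `ms_per_frame`, `now_ms`, `FrameStats`, `frame_stats_add` and `frame_stats_mean_ms` are hypothetical, and `now_ms` assumes a C11 library that provides timespec_get.

```c
#include <assert.h>
#include <time.h>

/* Converts a frame rate to a frame time in milliseconds. */
double ms_per_frame(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Wall-clock time in milliseconds, using C11 timespec_get. */
double now_ms(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

typedef struct {
    double total_ms;  /* sum of all recorded frame times */
    long   frames;    /* number of frames recorded */
} FrameStats;

/* Records the duration of one frame. */
void frame_stats_add(FrameStats *s, double frame_ms)
{
    s->total_ms += frame_ms;
    s->frames   += 1;
}

/* Average frame time so far, or 0 if nothing has been recorded. */
double frame_stats_mean_ms(const FrameStats *s)
{
    return s->frames > 0 ? s->total_ms / (double)s->frames : 0.0;
}
```

In a render loop you would call `now_ms()` right after each buffer swap and feed the difference between consecutive readings to `frame_stats_add`. Frame times compare linearly where frame rates do not: dropping from 100 to 50 fps costs 10 ms per frame, while dropping from 50 to 25 fps costs 20 ms.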

2) should I use a shader for moving objects (here line strip) instead of calling glTranslatef? If so, what would such a shader look like?

No difference. glTranslate modifies a matrix (normally the Model-View) which is then applied to all vertices. If you have a shader you'd apply a translation matrix to all your vertices. In fact the driver is probably building a small shader for you already.

Be aware that the older APIs like glTranslate() are deprecated from OpenGL 3.0 onwards, and in modern OpenGL everything is done with shaders.

3) what happens if the window gets resized? how do I keep aspect ratio and scale vertices accordingly? glViewport() only helps scaling in y direction, not in x direction.

glViewport() sets the size and shape of the screen area that is rendered to. Quite often it's called on window resizing to set the viewport to the size and shape of the window. Doing just this will cause any image rendered by OpenGL to change aspect ratio with the window. To keep things looking the same you also have to control the projection matrix to counteract the effect of changing the viewport.

Something along the lines of:

glViewport(0, 0, width, height);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glScalef(1.0f, (float)width / (float)height, 1.0f); // Keeps X scale the same, but scales Y to compensate for aspect ratio

That's written from memory, and I might not have the maths right, but hopefully you get the idea.

4) I noticed that when I use glTranslatef with a non integral value, the screen starts to flicker if the line strip consists of thousands of points.

I think you're seeing a form of aliasing which is due to the lines moving under the sampling grid of the pixels. There are various anti-aliasing techniques you can use to reduce the problem. OpenGL has anti-aliased lines (glEnable(GL_LINE_SMOOTH), which also needs blending enabled to have a visible effect), but a lot of consumer cards didn't support it, or only did it in software. You can try it, but it may have no effect or run very slowly.

Alternatively you can look into Multi-sample anti-aliasing (MSAA), or other types that your card may support through extensions.

Another option is rendering to a high resolution texture (via Frame Buffer Objects - FBOs) and then filtering it down when you render it to the screen as a textured quad. This would also allow you to do a trick where you move the rendered texture slightly to the left each frame and render only the new strip on the right.

1    1
 1  1 1  Frame 1
  11

    1 
1  1 1   Frame 1 is copied left, and a new line segment is added to make frame 2
 11   2

   1
  1 1 3  Frame 2 is copied left, and a new line segment is added to make frame 3
11   2

It's not a simple change, but it might help you out with your problem (5).
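To make the copy-left-and-append idea concrete, here is a CPU-side C illustration on a small single-channel pixel buffer. It is not the FBO implementation itself, which would do the same shift with textures on the GPU; `scroll_left` and `plot_newest` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Shifts every row of a width x height single-channel image one pixel to the
 * left and clears the rightmost column, ready for the newest segment. */
void scroll_left(unsigned char *pixels, int width, int height)
{
    assert(pixels != NULL && width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        unsigned char *row = pixels + (size_t)y * (size_t)width;
        memmove(row, row + 1, (size_t)width - 1);  /* overlapping copy */
        row[width - 1] = 0;
    }
}

/* Marks the newest sample in the rightmost column at the given row. */
void plot_newest(unsigned char *pixels, int width, int height, int row,
                 unsigned char value)
{
    assert(pixels != NULL && row >= 0 && row < height);
    pixels[(size_t)row * (size_t)width + (size_t)(width - 1)] = value;
}
```

Only the rightmost column is new work each frame, which is what makes the trick attractive for long strips.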
