How to measure time performance of a compute shader?

Question

I need to measure the execution time of a compute shader, but of course this is not trivial. From OpenGL Wiki - Performance I gathered that it is useful to call glFinish() before and after the shader call, but the same page also says that using it is not a good idea. Is there a good way to measure the time of my shader? Is it possible to measure the time of a compute shader at all?

My code looks something like this:

renderloop()
{
  //(1)
  //(2)
  if(updateFunction) //this is done just one time at the beginning
  {
    //update Texture with a compute shader
    //...
    glDispatchCompute();
    glMemoryBarrier(GL_ALL_BARRIER_BITS);
  }
  //(3)
  //(1)

  //use the texture to do some marching cubes rendering
}

I guess I have to insert glFinish() at the positions marked (1), start the timer at (2), and stop it at (3). But I am not sure whether that really works and produces correct timing results, because the reference talks about rendering, and a compute shader is not rendering, is it?

OpenGL also has timer queries (Timer_Query), but I am not sure how they work or whether they would be useful here. This is all new to me and I am not sure I fully understand it yet.

The answer from here says that it is nearly impossible to precisely measure one part of the code, and that the best approach is to measure the frame rendering time. For my purposes, however, I only need the compute shader's share of the frame rendering time.

What do you think is the best way to do this? Should I just measure the whole frame rendering time and use that? Or have you had better experience with other measurement methods?

Answer 1:

Timer queries are definitely the way to go.

The general principle is that you create 'query objects' that you insert in between your GL function calls.

As GPUs run asynchronously, these queries will be inserted in the GPU command queue, and 'filled' when the commands are really processed.

So, in your case, you need to create a query using, say, glGenQueries(1, &myQuery);

Then, instead of starting a timer, you start the query in (2), using glBeginQuery(GL_TIME_ELAPSED, myQuery), and 'stop' it in (3), using glEndQuery(GL_TIME_ELAPSED).

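To get the result, you can simply call one of the glGetQueryObject functions, for example glGetQueryObjectui64v(myQuery, GL_QUERY_RESULT, &elapsed). The elapsed time is reported in nanoseconds.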
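The begin / end / read-back sequence above can be sketched roughly as follows. This is only an illustration, not tested against a real driver: to keep it compilable without GL headers or a context, each GL call is passed in as a callable. The names TimerQueryCalls and timeBlockingMs are hypothetical, and the comments show the GL call each callable is assumed to wrap.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical wrapper: each callable stands in for one GL call, so the
// sketch compiles without GL headers. In a real program you would call
// the GL functions named in the comments directly.
struct TimerQueryCalls {
    std::function<void()> begin;            // glBeginQuery(GL_TIME_ELAPSED, myQuery)
    std::function<void()> end;              // glEndQuery(GL_TIME_ELAPSED)
    std::function<std::uint64_t()> result;  // glGetQueryObjectui64v(myQuery, GL_QUERY_RESULT, &ns)
};

// Wraps `gpuWork` in a timer query and returns the elapsed GPU time in
// milliseconds. Reading GL_QUERY_RESULT right away makes the CPU wait
// until the GPU has finished the enclosed commands.
double timeBlockingMs(const TimerQueryCalls& gl,
                      const std::function<void()>& gpuWork) {
    assert(gl.begin && gl.end && gl.result && gpuWork);
    gl.begin();                              // position (2) in the question
    gpuWork();                               // glDispatchCompute + glMemoryBarrier
    gl.end();                                // position (3) in the question
    const std::uint64_t ns = gl.result();    // blocks until the result is ready
    return static_cast<double>(ns) / 1.0e6;  // nanoseconds -> milliseconds
}
```

Because the compute pass in the question runs only once at startup, a blocking read like this is probably acceptable there; the stall only matters when timing every frame.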

You can learn more here, for example: http://www.lighthouse3d.com/tutorials/opengl-short-tutorials/opengl-timer-query/

Of course, there are some 'traps'. The main one is that you have to wait for the timing result to be ready. You can either force the GPU and CPU to sync, which will slow down your application (but still give you good GL timings), or keep multiple queries in flight.
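One way to keep multiple queries in flight is a small ring of query objects that is polled without blocking. The sketch below is an assumption-laden illustration, not part of the original answer: QueryCalls and QueryRing are hypothetical names, the GL calls are again passed in as callables (with the assumed GL call in each comment), and the query ids are assumed to come from glGenQueries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the GL calls, so the sketch builds without GL.
struct QueryCalls {
    std::function<void(unsigned)> begin;            // glBeginQuery(GL_TIME_ELAPSED, id)
    std::function<void()> end;                      // glEndQuery(GL_TIME_ELAPSED)
    std::function<bool(unsigned)> available;        // glGetQueryObjectiv(id, GL_QUERY_RESULT_AVAILABLE, &flag)
    std::function<std::uint64_t(unsigned)> result;  // glGetQueryObjectui64v(id, GL_QUERY_RESULT, &ns)
};

class QueryRing {
public:
    // `ids` are query names, assumed to have been created with glGenQueries.
    QueryRing(QueryCalls gl, std::vector<unsigned> ids)
        : gl_(std::move(gl)), ids_(std::move(ids)) {
        assert(!ids_.empty());
    }

    // Wraps the work in the next free query. If every query is still
    // pending, the work runs untimed and false is returned.
    bool timeFrame(const std::function<void()>& gpuWork) {
        if (pending_ == ids_.size()) { gpuWork(); return false; }
        const unsigned id = ids_[(oldest_ + pending_) % ids_.size()];
        gl_.begin(id);
        gpuWork();
        gl_.end();
        ++pending_;
        return true;
    }

    // Fetches the oldest pending timing (nanoseconds) if it is ready.
    // Never blocks: returns false when nothing is available yet.
    bool poll(std::uint64_t& ns) {
        if (pending_ == 0 || !gl_.available(ids_[oldest_])) return false;
        ns = gl_.result(ids_[oldest_]);
        oldest_ = (oldest_ + 1) % ids_.size();
        --pending_;
        return true;
    }

private:
    QueryCalls gl_;
    std::vector<unsigned> ids_;
    std::size_t oldest_ = 0;   // index of the oldest pending query
    std::size_t pending_ = 0;  // number of queries issued but not yet read
};
```

In a render loop you would call timeFrame once per frame and then poll; the timings arrive a frame or more late, but neither call stalls the CPU.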

Source: https://stackoverflow.com/questions/28175631/how-to-measure-time-performance-of-a-compute-shader

Tags

c++

performance

opengl

compute-shader