Best way to test code speed in C++ without profiler, or does it not make sense to try?

忘了有多久 2021-02-06 02:01

On SO, there are quite a few questions about performance profiling, but I don't seem to find the whole picture. There are quite a few issues involved and most Q & A ignore

8 Answers
  • 2021-02-06 02:41

    Use QueryPerformanceCounter on Windows if you need high-resolution timing. The counter's resolution depends on the CPU, but it can be as fine as a single clock pulse. However, profiling real-world operations is always a better idea.
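
    A minimal sketch of what such a timing helper might look like. The name time_seconds is hypothetical, and the std::chrono::steady_clock fallback for non-Windows builds is an assumption added here so the snippet compiles elsewhere; it is not part of the answer above.

```cpp
#include <cassert>
#include <chrono>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns the wall-clock seconds spent inside work().
template <typename Work>
double time_seconds(Work work) {
    double elapsed = 0.0;
#ifdef _WIN32
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);  // counter ticks per second
    QueryPerformanceCounter(&start);
    work();
    QueryPerformanceCounter(&stop);
    elapsed = static_cast<double>(stop.QuadPart - start.QuadPart) /
              static_cast<double>(freq.QuadPart);
#else
    // Assumed fallback for non-Windows builds: a monotonic standard clock.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    elapsed = std::chrono::duration<double>(stop - start).count();
#endif
    assert(elapsed >= 0.0);  // a monotonic counter should not run backwards
    return elapsed;
}
```

    Calling it with a lambda, e.g. time_seconds([&] { my_function(); }), gives the elapsed time of one call. This is still wall-clock time, so other processes on the machine can affect it.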

  • 2021-02-06 02:43

    Obviously we would like to measure the cpu time of our code and not the real time, but as far as I understand, these functions don't give that functionality, so other processes on the system would interfere with measurements.

    I do two things, to ensure that wall-clock time and CPU time are approximately the same thing:

    • Test for a significant length of time, i.e. several seconds (e.g. by testing a loop of however many thousands of iterations)

    • Test when the machine is more or less idle except for whatever I'm testing.

    Alternatively if you want to measure only/more exactly the CPU time per thread, that's available as a performance counter (see e.g. perfmon.exe).
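
    One way to check whether the two are in fact close is to record both around the same loop. The sketch below is only an illustration: Timing and measure are hypothetical names, std::chrono::steady_clock stands in for the wall clock, and std::clock() stands in for CPU time. Standard C++ defines std::clock() as processor time for the whole process (not per thread), but Microsoft's C runtime is documented to return elapsed wall-clock time instead, so on Windows a per-thread counter is still the better source.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

// Hypothetical result type: both measurements for the same run.
struct Timing {
    double wall_seconds;  // elapsed real time
    double cpu_seconds;   // processor time as reported by std::clock()
};

// Runs work() `iterations` times and reports wall-clock and CPU time.
template <typename Work>
Timing measure(Work work, long iterations) {
    assert(iterations > 0);
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        work();
    }
    const auto wall_stop = std::chrono::steady_clock::now();
    const std::clock_t cpu_stop = std::clock();

    Timing t;
    t.wall_seconds =
        std::chrono::duration<double>(wall_stop - wall_start).count();
    t.cpu_seconds = static_cast<double>(cpu_stop - cpu_start) /
                    static_cast<double>(CLOCKS_PER_SEC);
    return t;
}
```

    If wall_seconds is much larger than cpu_seconds over a run of several seconds, something else was probably competing for the machine and the run is worth repeating.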

    What can we know for certain without debugging, disassembling and profiling tools?

    Nearly nothing (except that I/O tends to be relatively slow).

  • 2021-02-06 02:45

    Is there something you have against profilers? They help a ton. Since you are on WinXP, you should really try a trial version of VTune. Run a call-graph sampling test and look at the self time and total time of the functions being called. There's no better way to tune your program so that it's as fast as possible without being an assembly genius (and a truly exceptional one).

    Some people just seem to be allergic to profilers. I used to be one of those and thought I knew best about where my hotspots were. I was often correct about obvious algorithmic inefficiencies, but practically always incorrect about micro-optimization cases. Just rewriting a function without changing any of the logic (e.g. reordering things, putting exceptional-case code in a separate, non-inlined function, etc.) can make functions a dozen times faster, and even the best disassembly experts usually can't predict that without the profiler.

    As for relying on simplistic timing tests alone, they are extremely problematic. That current test is not so bad but it's a very common mistake to write timing tests in ways in which the optimizer will optimize out dead code and end up testing the time it takes to do essentially a nop or even nothing at all. You should have some knowledge to interpret the disassembly to make sure the compiler isn't doing this.
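
    A common way to reduce that risk is to make the result observable, for example by storing it in a volatile object. The sketch below assumes a made-up function under test (sum) and made-up names (g_sink, time_sum). It only stops the compiler from discarding the result; the compiler may still hoist or simplify the work, so checking the disassembly remains the real test.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical function under test: sums the elements of a vector.
long long sum(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}

// A store to a volatile object is an observable side effect, so the value
// being stored has to be produced and cannot be dropped as dead code.
volatile long long g_sink = 0;

// Returns the average wall-clock seconds per call of sum(v).
double time_sum(const std::vector<int>& v, int repetitions) {
    assert(repetitions > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repetitions; ++r) {
        g_sink = sum(v);  // keep the result alive
    }
    const auto stop = std::chrono::steady_clock::now();
    const double total = std::chrono::duration<double>(stop - start).count();
    return total / repetitions;
}
```

    Without the store to g_sink, an optimizing build would be free to delete the calls to sum entirely, and the loop would time nothing.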

    Timing tests like this also tend to bias the results significantly, since many of them just run your code over and over in the same loop. That mostly measures your code when all of its memory is already in the cache and branch prediction is working perfectly for it. It often shows you only the best-case scenario rather than the average, real-world case.

    Depending on real world timing tests is a little bit better; something closer to what your application will be doing at a high level. It won't give you specifics about what is taking what amount of time, but that's precisely what the profiler is meant to do.

  • 2021-02-06 02:47

    Wha? How to measure speed without a profiler? The very act of measuring speed is profiling! The question amounts to, "how can I write my own profiler?" And the answer is clearly, "don't".

    Besides, you should be using std::swap in the first place, which completely invalidates this whole pointless pursuit.

    -1 for pointlessness.

  • 2021-02-06 02:59

    To answer your main question, the "reverse" algorithm just swaps elements of the array; it does not operate on the elements themselves.

  • 2021-02-06 03:00

    I would suppose that anyone competent enough to answer all your questions is going to be far too busy to answer all your questions. In practice it is probably more effective to ask a single, well-defined question. That way you may hope to get well-defined answers which you can collect and be on your way to wisdom.

    So, anyway, perhaps I can answer your question about which clock to use on Windows.

    clock() is not considered a high-precision clock. If you look at the value of CLOCKS_PER_SEC you will see it has a resolution of 1 millisecond. This is only adequate if you are timing very long routines, or a loop with tens of thousands of iterations. As you point out, if you try to repeat a simple method tens of thousands of times in order to get a time that can be measured with clock(), the compiler is liable to step in and optimize the whole thing away.
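
    To see what clock() offers on a given system, one can print the nominal tick and also measure the smallest step the clock actually takes. This is a rough sketch with hypothetical function names. CLOCKS_PER_SEC is 1000 with Microsoft's runtime, as described above, but differs on other platforms, and the observed step can be coarser than the nominal tick.

```cpp
#include <cassert>
#include <ctime>

// Nominal length of one clock() tick in seconds.
double nominal_tick_seconds() {
    return 1.0 / static_cast<double>(CLOCKS_PER_SEC);
}

// Spins until clock() returns a new value and reports the size of that
// step in seconds. Returns -1.0 if clock() is unavailable.
double observed_step_seconds() {
    const std::clock_t first = std::clock();
    if (first == static_cast<std::clock_t>(-1)) {
        return -1.0;
    }
    std::clock_t next = first;
    while (next == first) {
        next = std::clock();
    }
    assert(next > first);
    return static_cast<double>(next - first) /
           static_cast<double>(CLOCKS_PER_SEC);
}
```

    Anything that runs for less than a few multiples of observed_step_seconds() cannot be timed meaningfully with clock() in a single call.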

    So, really, the only clock to use is QueryPerformanceCounter().
