I have a friendly competition with a couple of guys in the field of programming, and recently we have become very interested in writing efficient code. Our challenge was to try t
The best tool for your purposes is valgrind/callgrind.
There is the std::clock() function from <ctime>, which returns how much CPU time was spent on the current process (that means it doesn't count the time the program was idling because the CPU was executing other tasks). This function can be used to measure execution times of algorithms, within the resolution of the implementation's clock. Use the constant CLOCKS_PER_SEC (a macro, also from <ctime>, so it takes no std:: prefix) to convert the return value into seconds.
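As a rough illustration of the std::clock() approach, here is a minimal sketch. The names cpu_seconds and busy_sum are made up for this example and are not part of any library; the workload is just a placeholder for whatever you want to measure.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical helper: CPU seconds consumed by calling f() once.
template <typename F>
double cpu_seconds(F f)
{
    const std::clock_t start = std::clock();
    f();
    const std::clock_t stop = std::clock();
    // std::clock() returns clock_t(-1) when CPU time is unavailable.
    assert(start != std::clock_t(-1) && stop != std::clock_t(-1));
    // CLOCKS_PER_SEC converts clock ticks into seconds.
    return double(stop - start) / CLOCKS_PER_SEC;
}

// Hypothetical workload: sums 1..n. The volatile accumulator keeps the
// compiler from optimising the loop away.
unsigned long long busy_sum(unsigned long long n)
{
    volatile unsigned long long total = 0;
    for (unsigned long long i = 1; i <= n; ++i)
        total = total + i;
    return total;
}
```

A call such as cpu_seconds([] { busy_sum(100000000ULL); }) would then give the CPU time of the loop in seconds. For very short workloads the result may well be 0, because the clock's tick is coarser than the work being measured.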
Here's a little C++11 stopwatch I like to roll out when I need to time something:
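#include <chrono>
#include <ctime>

// Accumulating stopwatch, parameterised on a clock type T that provides
// time_point, duration and a static now().
template <typename T> class basic_stopwatch
{
    typedef T clock;
    typename clock::time_point p;  // start of the current interval
    typename clock::duration   d;  // total time accumulated so far

public:
    void tick()  { p = clock::now(); }       // start an interval
    void tock()  { d += clock::now() - p; }  // stop it and accumulate
    // Value-initialisation yields zero for chrono durations and for
    // std::clock_t alike (clock_t has no ::zero() member).
    void reset() { d = typename clock::duration(); }

    // Accumulated time converted to the std::chrono duration type S.
    template <typename S> unsigned long long int report() const
    {
        return std::chrono::duration_cast<S>(d).count();
    }

    unsigned long long int report_ms() const
    {
        return report<std::chrono::milliseconds>();
    }

    basic_stopwatch() : p(), d() { }
};

// Adapter exposing std::clock() (CPU time) through the same interface.
struct c_clock
{
    typedef std::clock_t time_point;
    typedef std::clock_t duration;
    static time_point now() { return std::clock(); }
};

// std::clock_t is not a std::chrono duration, so report_ms() is specialised
// to convert ticks by hand; report<S>() cannot be used with c_clock.
// "inline" keeps the specialisation safe to place in a header.
template <> inline unsigned long long int basic_stopwatch<c_clock>::report_ms() const
{
    return static_cast<unsigned long long int>(1000. * double(d) / double(CLOCKS_PER_SEC));
}

typedef basic_stopwatch<std::chrono::high_resolution_clock> stopwatch;
typedef basic_stopwatch<c_clock> cstopwatch;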
Usage:
stopwatch sw;
sw.tick();
run_long_code();
sw.tock();
std::cout << "This took " << sw.report_ms() << "ms.\n";
On any decent implementation, the default high_resolution_clock should give very accurate timing information. Note that it measures elapsed wall-clock time, not CPU time.
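One caveat: the standard does not guarantee that high_resolution_clock is steady (monotonic), and on some implementations it is an alias for system_clock. A common workaround, sketched below, is to fall back to steady_clock when it is not. The names interval_clock and elapsed_us are invented for this example.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

// Use high_resolution_clock when it is monotonic, otherwise steady_clock.
typedef std::conditional<std::chrono::high_resolution_clock::is_steady,
                         std::chrono::high_resolution_clock,
                         std::chrono::steady_clock>::type interval_clock;

static_assert(interval_clock::is_steady, "interval_clock must be monotonic");

// Hypothetical helper: wall-clock microseconds taken by one call to f().
template <typename F>
long long elapsed_us(F f)
{
    const interval_clock::time_point start = interval_clock::now();
    f();
    const interval_clock::time_point stop = interval_clock::now();
    assert(stop >= start);  // a steady clock never goes backwards
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

The same typedef could be used as the template argument of a stopwatch class instead of naming high_resolution_clock directly.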
It is quite difficult to work out the exact amount of CPU time a block of code takes. The usual approach is to design worst-case, average-case and best-case input data as test cases, and then to time your real code against those test cases. No tool can tell you the FLOPS without detailed input data and test conditions.
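To make the best/average/worst idea concrete, here is a small sketch that times one routine on three kinds of input. Insertion sort is only a stand-in for "your real code", chosen because its cases are easy to construct; all names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical code under test: a plain insertion sort.
void insertion_sort(std::vector<int>& v)
{
    for (std::size_t i = 1; i < v.size(); ++i) {
        const int key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) { v[j] = v[j - 1]; --j; }
        v[j] = key;
    }
}

// Sorts a private copy of the input and returns the elapsed microseconds.
// The copy is made by the by-value parameter, outside the timed region.
long long time_sort_us(std::vector<int> input)
{
    const auto start = std::chrono::steady_clock::now();
    insertion_sort(input);
    const auto stop = std::chrono::steady_clock::now();
    assert(std::is_sorted(input.begin(), input.end()));
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

struct case_timings { long long best_us, average_us, worst_us; };

// Builds best (already sorted), average (shuffled) and worst (reversed)
// inputs of size n and times each of them once.
case_timings time_cases(std::size_t n)
{
    std::vector<int> best(n);
    std::iota(best.begin(), best.end(), 0);
    std::vector<int> worst(best.rbegin(), best.rend());
    std::vector<int> average(best);
    std::mt19937 rng(42);  // fixed seed so runs are repeatable
    std::shuffle(average.begin(), average.end(), rng);
    case_timings t = { time_sort_us(best), time_sort_us(average), time_sort_us(worst) };
    return t;
}
```

In practice you would probably repeat each measurement several times and look at the minimum or median, since a single run is easily disturbed by whatever else the machine is doing.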
There are pieces of software called profilers which do exactly what you want.
An example for Windows is AMD CodeAnalyst; for POSIX systems there is gprof.
Measuring the number of CPU instructions is pretty useless.
Performance is relative to the bottleneck: depending on the problem at hand, the bottleneck might be the network, disk I/O, memory or the CPU.
For just a friendly competition, I would suggest timing, which of course implies providing test cases that are big enough to give meaningful measurements.
On Unix, you can use gettimeofday for relatively precise measures.
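A minimal sketch of the gettimeofday approach follows; it is POSIX-only, and the helper names wall_clock_us and elapsed_wall_us are made up for this example. Keep in mind that gettimeofday reports wall-clock time, so the result can be off if the system clock is adjusted during the measurement, and that POSIX.1-2008 marks the function as obsolescent in favour of clock_gettime.

```cpp
#include <cassert>
#include <sys/time.h>  // POSIX only: gettimeofday, timeval

// Current wall-clock time in microseconds since the Unix epoch.
long long wall_clock_us()
{
    timeval tv;
    const int rc = gettimeofday(&tv, nullptr);
    assert(rc == 0);  // gettimeofday returns 0 on success
    (void)rc;         // silence the unused warning when NDEBUG is set
    return static_cast<long long>(tv.tv_sec) * 1000000LL + tv.tv_usec;
}

// Hypothetical helper: wall-clock microseconds taken by one call to f().
template <typename F>
long long elapsed_wall_us(F f)
{
    const long long start = wall_clock_us();
    f();
    return wall_clock_us() - start;
}
```

Usage would look like elapsed_wall_us([] { run_long_code(); }), where run_long_code stands for the code being timed.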