I am interested in measuring the execution time of a free function or a member function (template or not). Call TheFunc the function in question, its call being
You can do it the MATLAB way. It's very old-school, but simple is often good:
tic();
a = f(c);
toc(); //print to stdout, or
auto elapsed = toc(); //store in variable
tic() and toc() can work on a global variable. If that's not sufficient, you can create local variables with some macro magic:
tic(A);
a = f(c);
toc(A);
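One possible way to back tic() and toc() in C++ is sketched below. Everything here is an assumption for illustration: the tictoc namespace, the TIC/TOC macro names, and the choice of std::chrono::steady_clock are not from the answer above, and this toc() returns seconds instead of printing.

```cpp
#include <chrono>

// Hypothetical helpers; the names mirror MATLAB and are not a standard API.
namespace tictoc {
using clock_type = std::chrono::steady_clock;

// Single shared start point used by tic() and toc().
inline clock_type::time_point& start_point() {
    static clock_type::time_point t = clock_type::now();
    return t;
}

// Record the current time as the new start point.
inline void tic() { start_point() = clock_type::now(); }

// Return the seconds elapsed since the last tic().
inline double toc() {
    return std::chrono::duration<double>(clock_type::now() - start_point()).count();
}
} // namespace tictoc

// Macro variants that keep the start point in a named local variable,
// so several measurements can be nested or interleaved.
#define TIC(name) const auto tic_##name = std::chrono::steady_clock::now()
#define TOC(name) \
    std::chrono::duration<double>(std::chrono::steady_clock::now() - tic_##name).count()
```

With these, `tictoc::tic(); a = f(c); auto elapsed = tictoc::toc();` stores the elapsed seconds, and `TIC(A); ... TOC(A)` does the same with a local start point.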
With a variadic template, you can write:
#include <ctime>
#include <utility>
template <typename F, typename ... Ts>
double Time_function(F&& f, Ts&&...args)
{
std::clock_t start = std::clock();
std::forward<F>(f)(std::forward<Ts>(args)...);
return static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
}
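One caveat with Time_function: std::clock is specified to report processor time, so on most platforms time spent sleeping or blocked on I/O is not counted. If wall-clock time is what you want, a variant along these lines might help. The names time_function_wall and nap are made up for this sketch.

```cpp
#include <chrono>
#include <thread>
#include <utility>

// Hypothetical wall-clock variant of Time_function: returns elapsed seconds
// measured with the monotonic std::chrono::steady_clock.
template <typename F, typename... Ts>
double time_function_wall(F&& f, Ts&&... args)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)(std::forward<Ts>(args)...);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Example workload that blocks instead of computing.
inline void nap(int ms)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
```

Calling `time_function_wall(nap, 20)` should report at least the time slept, whereas a std::clock based measurement of the same call would typically be close to zero on POSIX systems.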
I'm a fan of using RAII wrappers for this type of stuff.
The following example is a little verbose but it's more flexible in that it works with arbitrary scopes instead of being limited to a single function call:
#include <ctime>
#include <map>
#include <string>
class timing_context {
public:
std::map<std::string, double> timings;
};
class timer {
public:
timer(timing_context& ctx, std::string name)
: ctx(ctx),
name(name),
start(std::clock()) {}
~timer() {
ctx.timings[name] = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
}
timing_context& ctx;
std::string name;
std::clock_t start;
};
timing_context ctx;
int main() {
timer total(ctx, "total");
{
timer t(ctx, "foo");
// Do foo
}
{
timer t(ctx, "bar");
// Do bar
}
// Access ctx.timings ("total" is only written when its timer is destroyed at the end of main)
}
The downside is that you might end up with a lot of scopes that only serve to destroy the timing object.
This might or might not satisfy your requirements as your request was a little vague but it illustrates how using RAII semantics can make for some really nice reusable and clean code. It can probably be modified to look a lot better too!
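If the extra scopes bother you, one possible modification is to give the timer an explicit stop() and keep the destructor as a fallback. This is only a sketch: stoppable_timer is a hypothetical name, and I swapped std::clock for std::chrono::steady_clock here.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Same idea as the context above: elapsed seconds stored per label.
struct timing_context {
    std::map<std::string, double> timings;
};

// Hypothetical timer that can be stopped explicitly, without a closing brace.
class stoppable_timer {
public:
    stoppable_timer(timing_context& ctx, std::string name)
        : ctx_(ctx), name_(std::move(name)),
          start_(std::chrono::steady_clock::now()) {}

    // Record the elapsed time once; later calls are no-ops.
    void stop() {
        if (stopped_) return;
        stopped_ = true;
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        ctx_.timings[name_] = elapsed.count();
    }

    // Falls back to RAII behaviour if stop() was never called.
    ~stoppable_timer() { stop(); }

    stoppable_timer(const stoppable_timer&) = delete;
    stoppable_timer& operator=(const stoppable_timer&) = delete;

private:
    timing_context& ctx_;
    std::string name_;
    std::chrono::steady_clock::time_point start_;
    bool stopped_ = false;
};
```

Usage would look like `stoppable_timer t(ctx, "foo"); /* do foo */ t.stop();`, after which ctx.timings["foo"] is already available in the same scope.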
I really like boost::timer::auto_cpu_timer, and when I cannot use Boost I simply hack my own:
#include <cmath>
#include <string>
#include <utility>
#include <chrono>
#include <iostream>
class AutoProfiler {
public:
AutoProfiler(std::string name)
: m_name(std::move(name)),
m_beg(std::chrono::high_resolution_clock::now()) { }
~AutoProfiler() {
auto end = std::chrono::high_resolution_clock::now();
auto dur = std::chrono::duration_cast<std::chrono::microseconds>(end - m_beg);
std::cout << m_name << " : " << dur.count() << " musec\n";
}
private:
std::string m_name;
std::chrono::time_point<std::chrono::high_resolution_clock> m_beg;
};
void foo(std::size_t N) {
long double x {1.234e5};
for(std::size_t k = 0; k < N; k++) {
x += std::sqrt(x);
}
}
int main() {
{
AutoProfiler p("N = 10");
foo(10);
}
{
AutoProfiler p("N = 1,000,000");
foo(1000000);
}
}
This timer works thanks to RAII. When you construct the object within a scope, it stores the time point at that moment. When you leave the scope (that is, at the corresponding }), the destructor first takes the current time point, then calculates the number of ticks elapsed (which you can convert to a human-readable duration), and finally prints it to the screen.
Of course, boost::timer::auto_cpu_timer is much more elaborate than my simple implementation, but I often find my implementation more than sufficient for my purposes.
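On the point about converting ticks to a human-readable duration, here is a rough sketch of what that could look like. The helper name pretty and the choice of units are my own assumptions; values are truncated, not rounded.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: format a duration with a unit picked by magnitude.
// duration_cast truncates toward zero, so 1500 us is shown as "1 ms".
inline std::string pretty(std::chrono::nanoseconds d) {
    using namespace std::chrono;
    if (d < microseconds(1))
        return std::to_string(d.count()) + " ns";
    if (d < milliseconds(1))
        return std::to_string(duration_cast<microseconds>(d).count()) + " us";
    if (d < seconds(1))
        return std::to_string(duration_cast<milliseconds>(d).count()) + " ms";
    return std::to_string(duration_cast<seconds>(d).count()) + " s";
}
```

In the destructor above you could then print `pretty(end - m_beg)` instead of a fixed microsecond count, provided the clock's duration converts losslessly to nanoseconds (it does on the common implementations I know of).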
Sample run on my computer:
$ g++ -o example example.cpp -std=c++14 -Wall -Wextra
$ ./example
N = 10 : 0 musec
N = 1,000,000 : 10103 musec
I really liked the implementation suggested by @Jarod42. I modified it a little bit to offer some flexibility on the desired "units" of the output.
It defaults to returning the number of elapsed microseconds (a signed integer, typically a 64-bit type), but you can request the output in any duration type of your choice.
I think it is a more flexible approach than the one I suggested earlier because now I can do other stuff like taking the measurements and storing them in a container (as I do in the example).
Thanks to @Jarod42 for the inspiration.
#include <cmath>
#include <string>
#include <chrono>
#include <numeric>
#include <utility>
#include <vector>
#include <iostream>
template<typename Duration = std::chrono::microseconds,
typename F,
typename ... Args>
typename Duration::rep profile(F&& fun, Args&&... args) {
const auto beg = std::chrono::high_resolution_clock::now();
std::forward<F>(fun)(std::forward<Args>(args)...);
const auto end = std::chrono::high_resolution_clock::now();
return std::chrono::duration_cast<Duration>(end - beg).count();
}
void foo(std::size_t N) {
long double x {1.234e5};
for(std::size_t k = 0; k < N; k++) {
x += std::sqrt(x);
}
}
int main() {
std::size_t N { 1000000 };
// profile in default mode (microseconds)
std::cout << "foo(1E6) takes " << profile(foo, N) << " microseconds" << std::endl;
// profile in custom mode (e.g, milliseconds)
std::cout << "foo(1E6) takes " << profile<std::chrono::milliseconds>(foo, N) << " milliseconds" << std::endl;
// To create an average of `M` runs we can create a vector to hold
// `M` values of the type used by the clock representation, fill
// them with the samples, and take the average
std::size_t M {100};
std::vector<std::chrono::microseconds::rep> samples(M);
for(auto & sample : samples) {
sample = profile(foo, N);
}
// start from a long double so the sum is not accumulated in an int
auto avg = std::accumulate(samples.begin(), samples.end(), 0.0L) / static_cast<long double>(M);
std::cout << "average of " << M << " runs: " << avg << " microseconds" << std::endl;
}
Output (compiled with g++ example.cpp -std=c++14 -Wall -Wextra -O3):
foo(1E6) takes 10073 microseconds
foo(1E6) takes 10 milliseconds
average of 100 runs: 10068.6 microseconds
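Since the question also mentions member functions: profile calls fun(args...) directly, so a pointer to member cannot be passed as is. One way around this in C++14 is to wrap the call in a lambda. The sketch below repeats the helper so it compiles on its own (using steady_clock here, which is my substitution), and Accumulator and time_member are hypothetical names.

```cpp
#include <chrono>
#include <cstddef>
#include <utility>

// Copy of the helper above, with steady_clock swapped in for this sketch.
template <typename Duration = std::chrono::microseconds,
          typename F, typename... Args>
typename Duration::rep profile(F&& fun, Args&&... args) {
    const auto beg = std::chrono::steady_clock::now();
    std::forward<F>(fun)(std::forward<Args>(args)...);
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<Duration>(end - beg).count();
}

// Hypothetical class with a member function worth timing.
struct Accumulator {
    long long total = 0;
    void add_up_to(std::size_t n) {
        for (std::size_t k = 0; k < n; ++k) total += static_cast<long long>(k);
    }
};

// Time a member call by capturing the object in a lambda.
inline long long time_member(Accumulator& acc, std::size_t n) {
    return profile([&acc](std::size_t count) { acc.add_up_to(count); }, n);
}
```

With C++17 you could instead replace the call inside profile with std::invoke, which accepts pointers to members directly.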