Benchmarking programs in Rust

礼貌的吻别

How is it possible to benchmark programs in Rust? For example, how would I get execution time of program in seconds?

8 answers

轻奢々

2020-12-08 00:28
It might be worth noting 2 years later (to help any future Rust programmers who stumble on this page) that there are now tools to benchmark Rust code as a part of one's test suite.

(From the guide link below) Using the #[bench] attribute, one can use the standard Rust tooling to benchmark methods in their code.
```
// Requires a nightly toolchain: the `test` crate is unstable.
#![feature(test)]

extern crate test;
use test::Bencher;

#[bench]
fn bench_xor_1000_ints(b: &mut Bencher) {
    b.iter(|| {
        // use `test::black_box` to prevent compiler optimizations from disregarding
        // unused values
        test::black_box((0..1000u32).fold(0, |old, new| old ^ new));
    });
}
```
For the command cargo bench this outputs something like:
```
running 1 test
test bench_xor_1000_ints ... bench:       375 ns/iter (+/- 148)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
```
Links:
- The Rust Book (section on benchmark tests)
- "The Nightly Book" (section on the test crate)
- test::Bencher docs

臣服心动

2020-12-08 00:28
Currently, the Rust standard library has no interface to any of the following Linux functions:
- clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts)
- getrusage
- times (manpage: man 2 times)
The available ways to measure the CPU time and hotspots of a Rust program on Linux are:
- /usr/bin/time program
- perf stat program
- perf record --freq 100000 program; perf report
- valgrind --tool=callgrind program; kcachegrind callgrind.out.*
The output of perf report and callgrind depends on debugging information being available in the program; without it, the reports may not be usable.

遇见更好的自我

2020-12-08 00:33
There are several ways to benchmark your Rust program. For most real benchmarks, you should use a proper benchmarking framework as they help with a couple of things that are easy to screw up (including statistical analysis). Please also read the "Why writing benchmarks is hard" section at the very bottom!

Quick and easy: Instant and Duration from the standard library

To quickly check how long a piece of code runs, you can use the types in std::time. The module is fairly minimal, but it is fine for simple time measurements. You should use Instant instead of SystemTime as the former is a monotonically increasing clock and the latter is not. Example (Playground):
```
use std::time::Instant;

let before = Instant::now();
workload();
println!("Elapsed time: {:.2?}", before.elapsed());
```
The precision of std's Instant is unfortunately not specified in the documentation, but on all major operating systems it uses the best precision that the platform can provide (typically around 20 ns).

If std::time does not offer enough features for your case, you could take a look at chrono. However, for measuring durations, it's unlikely you need that external crate.
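
Since the question asks for the execution time in seconds, here is a minimal sketch that converts the measured Duration to fractional seconds with Duration::as_secs_f64. The workload function and the time_secs helper are hypothetical names made up for this example, and std::hint::black_box assumes Rust 1.66 or later.

```rust
use std::hint::black_box;
use std::time::Instant;

// Hypothetical workload used only for illustration.
fn workload() -> u64 {
    (0..1_000_000u64).fold(0, |acc, x| acc ^ x)
}

// Runs `f` once and returns its result together with the elapsed
// wall-clock time in seconds.
fn time_secs<T>(f: impl FnOnce() -> T) -> (T, f64) {
    let before = Instant::now();
    let result = f();
    (result, before.elapsed().as_secs_f64())
}

fn main() {
    let (result, secs) = time_secs(workload);
    // Use the result so the optimizer is less likely to discard the workload.
    black_box(result);
    println!("Elapsed time: {:.6} s", secs);
}
```

Returning the result alongside the time keeps the measured value in use, which matters for the optimizer pitfalls described further down.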

Using a benchmarking framework

Using frameworks is often a good idea, because they try to prevent you from making common mistakes.

Rust's built-in benchmarking framework (nightly only)

Rust has a convenient built-in benchmarking feature, which is unfortunately still unstable as of 2019-07. You have to add the #[bench] attribute to your function and make it accept one &mut test::Bencher argument:
```
#![feature(test)]

extern crate test;
use test::Bencher;

#[bench]
fn bench_workload(b: &mut Bencher) {
    b.iter(|| workload());
}
```
Executing cargo bench will print:
```
running 1 test
test bench_workload ... bench:      78,534 ns/iter (+/- 3,606)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured; 0 filtered out
```
Criterion

The crate criterion is a framework that runs on stable, but it is a bit more complicated than the built-in solution. It does more sophisticated statistical analysis, offers a richer API, produces more information and can even automatically generate plots.

See the "Quickstart" section for more information on how to use Criterion.

Why writing benchmarks is hard

There are many pitfalls when writing benchmarks. A single mistake can make your benchmark results meaningless. Here is a list of important but commonly forgotten points:
- Compile with optimizations: rustc -O or cargo build --release. When you are executing your benchmarks with cargo bench, Cargo will automatically enable optimizations. This step is important as there are often large performance differences between optimized and unoptimized Rust code.
- Repeat the workload: only running your workload once is almost always useless. There are many things that can influence your timing: overall system load, the operating system doing stuff, CPU throttling, file system caches, and so on. So repeat your workload as often as possible. For example, Criterion runs every benchmark for at least 5 seconds (even if the workload only takes a few nanoseconds). All measured times can then be analyzed, with mean and standard deviation being the standard tools.
- Make sure your benchmark isn't completely removed: benchmarks are very artificial by nature. Usually, the result of your workload is not inspected as you only want to measure the duration. However, this means that a good optimizer could remove your whole benchmark because it does not have side-effects (well, apart from the passage of time). So to trick the optimizer you have to somehow use your result value so that your workload cannot be removed. An easy way is to print the result. A better solution is something like black_box. This function basically hides a value from LLVM in that LLVM cannot know what will happen with the value. Nothing happens, but LLVM doesn't know. That is the point.
  
  Good benchmarking frameworks use a black box in several situations. For example, the closure given to the iter method (for both the built-in and the Criterion Bencher) can return a value. That value is automatically passed into a black_box.
- Beware of constant values: similarly to the point above, if you specify constant values in a benchmark, the optimizer might generate code specifically for that value. In extreme cases, your whole workload could be constant-folded into a single constant, meaning that your benchmark is useless. Pass all constant values through black_box to avoid LLVM optimizing too aggressively.
- Beware of measurement overhead: measuring a duration takes time itself. That is usually only tens of nanoseconds, but can influence your measured times. So for all workloads that are faster than a few tens of nanoseconds, you should not measure each execution time individually. You could execute your workload 100 times and measure how long all 100 executions took. Dividing that by 100 gives you the average single time. The benchmarking frameworks mentioned above also use this trick. Criterion also has a few methods for measuring very short workloads that have side effects (like mutating something).
- Many other things: sadly, I cannot list all difficulties here. If you want to write serious benchmarks, please read more online resources.
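
The points above on repetition, dead-code elimination, constant values, and measurement overhead can be sketched with the standard library alone. This is a rough illustration, not a replacement for a framework: workload, sample, and mean_and_std_dev, as well as the batch and sample counts, are made up for the example. It assumes Rust 1.66 or later, where std::hint::black_box is stable, and note that black_box is only a best-effort hint to the optimizer.

```rust
use std::hint::black_box;
use std::time::Instant;

// Hypothetical workload for illustration: XOR-fold of 0..n.
fn workload(n: u64) -> u64 {
    (0..n).fold(0, |acc, x| acc ^ x)
}

// Times `batch` back-to-back runs with a single pair of clock readings and
// returns the average nanoseconds per run, so clock overhead is amortized.
fn sample(batch: u32, n: u64) -> f64 {
    let start = Instant::now();
    for _ in 0..batch {
        // black_box on the input hinders constant folding; black_box on the
        // output keeps the result "used" so the call is not removed.
        black_box(workload(black_box(n)));
    }
    start.elapsed().as_nanos() as f64 / f64::from(batch)
}

// Mean and population standard deviation of the samples.
fn mean_and_std_dev(samples: &[f64]) -> (f64, f64) {
    let len = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / len;
    let variance = samples.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / len;
    (mean, variance.sqrt())
}

fn main() {
    // A few warm-up batches before measuring.
    for _ in 0..3 {
        sample(100, 1_000);
    }
    let samples: Vec<f64> = (0..30).map(|_| sample(100, 1_000)).collect();
    let (mean, std_dev) = mean_and_std_dev(&samples);
    println!("{:.1} ns/iter (+/- {:.1})", mean, std_dev);
}
```

Build it with cargo build --release (or rustc -O); in a debug build the numbers say little about optimized code.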

离开以前

2020-12-08 00:36
I created a small crate for this (measure_time), which logs or prints the time until end of scope.
```
#[macro_use]
extern crate measure_time;
fn main() {
    print_time!("measure function");
    do_stuff();
}
```
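
The general idea of timing "until end of scope" can be sketched with a guard value whose Drop implementation reports the elapsed time. This is not the measure_time crate's actual implementation; ScopeTimer is a hypothetical name used only for illustration, and it relies on nothing beyond the standard library.

```rust
use std::time::{Duration, Instant};

// Hypothetical guard type: records the start time on creation and prints the
// elapsed time when it goes out of scope.
struct ScopeTimer {
    label: &'static str,
    start: Instant,
}

impl ScopeTimer {
    fn new(label: &'static str) -> Self {
        ScopeTimer { label, start: Instant::now() }
    }

    fn elapsed(&self) -> Duration {
        self.start.elapsed()
    }
}

impl Drop for ScopeTimer {
    fn drop(&mut self) {
        println!("{} took {:.2?}", self.label, self.elapsed());
    }
}

fn main() {
    // Bind to a named variable: `let _ = ...` would drop the guard at once.
    let _timer = ScopeTimer::new("measure function");
    let sum: u64 = (0..1_000_000u64).sum();
    println!("sum = {}", sum);
} // `_timer` is dropped here and prints the elapsed time
```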

梦谈多话

2020-12-08 00:41

For measuring time without adding third-party dependencies, you can use std::time::Instant:
```
use std::time::Instant;

fn main() {
    let now = Instant::now();

    {
        my_function_to_measure();
    }

    let elapsed = now.elapsed();
    // `Duration` implements `Debug` but not `Display`, so format it with `{:?}`.
    println!("Elapsed: {:.2?}", elapsed);
}
```

名媛妹妹

2020-12-08 00:44
A quick way to find out the execution time of a program, regardless of implementation language, is to run time prog on the command line. For example:
```
~$ time sleep 4

real    0m4.002s
user    0m0.000s
sys     0m0.000s
```
The most interesting measurement is usually user, the CPU time the program itself spent in user mode, which is largely independent of what else is going on in the system (sleep is a pretty boring program to benchmark). real measures the wall-clock time that elapsed, and sys measures the CPU time the OS kernel spent working on behalf of the program.