Regardless of the mechanism for timing your function (and the answers here seem fine), there is a very simple trick to eliminate the overhead of the benchmarking code itself, i.e. the overhead of the loop, the timer readings, and the method call:
Simply call your benchmarking code with an empty Func first, i.e.
void EmptyFunc() {}
This will give you a baseline of the timing overhead, which you can essentially subtract from the later measurements of your actual benchmarked function.
By "essentially" I mean that there is always room for variation when timing some code, due to garbage collection and thread and process scheduling. A pragmatic approach would be, for example, to benchmark the empty function, find the average overhead (total time divided by iterations) and then subtract that number from each timing result of the real benchmarked function, but don't let it go below 0, which wouldn't make sense.
You will, of course, have to rearrange your benchmarking code a bit. Ideally you'll want to use the exact same code to benchmark the empty function and the real benchmarked function, so I suggest you move the timing loop into another function, or at least keep the two loops completely alike.
In summary: by doing this, the actual timing mechanism suddenly becomes a lot less important.
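For illustration, here is a minimal sketch of how the whole thing might be wired up, using Stopwatch. The Measure and FunctionUnderTest names and the iteration count are made up for the example, and I'm using an Action delegate since the empty function returns nothing:

using System;
using System.Diagnostics;
using System.Linq;

static class OverheadAwareBenchmark
{
    // The timing loop lives in one place so that the empty run and the real
    // run go through exactly the same code.
    static long[] Measure(Action action, int iterations)
    {
        var results = new long[iterations];
        var stopwatch = new Stopwatch();
        for (int i = 0; i < iterations; i++)
        {
            stopwatch.Restart();
            action();
            results[i] = stopwatch.ElapsedTicks;  // raw cost, overhead included
        }
        return results;
    }

    static void EmptyFunc() { }

    static void FunctionUnderTest()
    {
        // ...the code you actually want to benchmark goes here...
    }

    static void Main()
    {
        const int iterations = 10000;

        // Baseline run: measures only the loop, timer readings and delegate call.
        double averageOverhead = Measure(EmptyFunc, iterations).Average();

        // Real run: subtract the average overhead from each result,
        // clamped so a result never goes below 0.
        double[] corrected = Measure(FunctionUnderTest, iterations)
            .Select(t => Math.Max(0.0, t - averageOverhead))
            .ToArray();

        Console.WriteLine("Average overhead:  {0:F1} ticks", averageOverhead);
        Console.WriteLine("Average corrected: {0:F1} ticks", corrected.Average());
    }
}

Whether you report raw ticks or convert them to time via Stopwatch.Frequency doesn't really matter here; the point is simply that both runs share the same Measure loop, so whatever overhead it introduces is captured by the EmptyFunc run and removed from the real results.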