Sorry, it's a long one, but I'm just explaining my train of thought as I analyze this. Questions at the end.
I have an understanding of what goes into measuring runni
I think your first code sample is the best approach.
It is small, clean and simple, and it doesn't use any major abstractions inside the test loop that might introduce hidden overhead.
Using the Stopwatch class is a good choice, as it simplifies the code you would otherwise have to write to get high-resolution timings.
One thing you might consider is adding an option to run the test a small number of times untimed before entering the timing loop. This warms up any caches, buffers, connections, handles, sockets, thread pool threads, etc. that the test routine may exercise.
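To make the warm-up idea concrete, here is a rough sketch. It assumes C# and System.Diagnostics.Stopwatch. The Benchmark class, the Measure method and its parameter names are hypothetical, made up for illustration, and are not taken from your code.

```csharp
using System;
using System.Diagnostics;

public static class Benchmark
{
    // Hypothetical helper: runs the test untimed warmupIterations times,
    // then times timedIterations runs and returns the total elapsed time.
    public static TimeSpan Measure(Action test, int warmupIterations, int timedIterations)
    {
        // Untimed warm-up pass: lets JIT compilation, caches, connections,
        // thread pool threads, etc. settle before measurement starts.
        for (int i = 0; i < warmupIterations; i++)
        {
            test();
        }

        // Timed pass: only these iterations contribute to the result.
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < timedIterations; i++)
        {
            test();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed;
    }
}
```

Passing 0 for warmupIterations gives you the plain timing loop, so the warm-up stays optional. Note that calling the test through an Action delegate adds a small per-call cost of its own. For very short test routines you may prefer to keep the routine inlined in the loop, as in your first sample, and just add the untimed loop in front of it.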
HTH.