I'm working on a project which is in serious need of some performance tuning.
How do I write a test that fails if my optimizations do not improve the speed of the program?
For the tuning itself, you can compare the old and new code directly, but don't keep both copies around; that sounds like a nightmare to manage. You'd also only ever be comparing one version with another. It's possible that a change in functionality slows a function down, and that the slowdown is perfectly acceptable to the users.
Personally, I've never seen performance criteria of the type 'must be faster than the last version', because it is so hard to measure.
You say 'in serious need of performance tuning'. Where? Which queries? Which functions? Who says so: the business, or the users? What is acceptable performance? 3 seconds? 2 seconds? 50 milliseconds?
The starting point for any performance analysis is to define the pass/fail criteria. Once you have this, you CAN automate the performance tests.
For reliability, you can use a (simple) statistical approach. For example, run the same query under the same conditions 100 times. If 95% of them return in under n seconds, that is a pass.
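Here's a minimal sketch of that idea in Python. The `run_query` stub, the 50 ms threshold, and the 100-run sample size are all assumptions you would replace with whatever your own pass/fail criteria say:

    import time
    import statistics

    def run_query():
        """Hypothetical stand-in for the code path or query under test."""
        time.sleep(0.01)

    def test_query_meets_sla(runs=100, threshold_seconds=0.05):
        """Pass if 95% of runs finish under the agreed threshold."""
        durations = []
        for _ in range(runs):
            start = time.perf_counter()
            run_query()
            durations.append(time.perf_counter() - start)

        # 95th-percentile latency: 95% of the runs were at least this fast.
        p95 = statistics.quantiles(durations, n=100)[94]
        assert p95 < threshold_seconds, (
            f"95th percentile was {p95:.3f}s, expected under {threshold_seconds}s"
        )

Using a percentile rather than the mean keeps one slow outlier (a GC pause, a cold cache) from failing the whole build.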
Personally, I would do this at integration time, from either a standard machine or the integration server itself. Record the values for each test somewhere (CruiseControl has some nice features for this sort of thing). If you do this, you can see how performance progresses over time, and with each build. You can even make a graph. Managers like graphs.
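If your CI tool doesn't track numbers for you, even appending a row per build to a CSV is enough to graph later. A rough sketch, assuming your CI server exposes a build number via an environment variable like `BUILD_NUMBER` (the file name and column layout here are just illustrations):

    import csv
    import datetime
    import os

    def record_result(test_name, p95_seconds, results_file="perf_results.csv"):
        """Append one row per test per build so trends can be graphed later."""
        is_new = not os.path.exists(results_file)
        with open(results_file, "a", newline="") as f:
            writer = csv.writer(f)
            if is_new:
                writer.writerow(["timestamp", "build", "test", "p95_seconds"])
            writer.writerow([
                datetime.datetime.now().isoformat(timespec="seconds"),
                os.environ.get("BUILD_NUMBER", "local"),  # set by many CI servers
                test_name,
                f"{p95_seconds:.4f}",
            ])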
Having a stable environment is always hard when doing performance testing, whether or not the tests are automated. You'll have that particular problem no matter how you develop (TDD, Waterfall, etc.).