I'd like to estimate the big-O performance of some methods in a library through benchmarks. I don't need precision -- it suffices to show that something grows, say, linearly rather than quadratically.
If you are happy to estimate this empirically, you can measure how long it takes to perform exponentially increasing numbers of operations. Using the ratio between the timings you can estimate which complexity class it falls into.
e.g. if going from 1,000 operations to 10,000 operations (10x the work) takes roughly 1x, 10x, or 100x as long, that suggests O(1), O(n), or O(n^2) respectively (test the larger size first). You need to use a realistic number of operations to see what the order is over the range you actually care about.
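As a concrete illustration, here is a minimal sketch of that ratio check in Python; the operation function is just a placeholder workload standing in for whatever you are benchmarking, not anything from the library in question:

import time

def operation(n):
    # Placeholder workload standing in for the method under test.
    sorted(range(n, 0, -1))

def measure(n):
    start = time.perf_counter()
    operation(n)
    return time.perf_counter() - start

t_large = measure(10_000_000)   # test the larger size first
t_small = measure(1_000_000)
print(f"10x the work took {t_large / t_small:.1f}x longer")
# ~1x suggests O(1), ~10x suggests O(n), ~11-13x suggests O(n log n), ~100x suggests O(n^2)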
It is only an estimate though, as time complexity is defined for an idealized machine, and it is something which should be proven mathematically rather than measured.
e.g. many people have tried to show empirically that π is a fraction: when they measured the ratio of circumference to diameter for circles they had made, it always came out as a fraction. Eventually it was generally accepted that π is not a fraction.
I actually know beforehand the big-oh of most of the methods that will be tested. My main intention is to provide performance regression testing for them.
This requirement is key. You want to detect outliers with minimal data (because testing should be fast, dammit), and in my experience fitting curves to numerical evaluations of complex recurrences, linear regression and the like will overfit. I think your initial idea is a good one.
What I would do to implement it is prepare a list of expected complexity functions g1, g2, ..., and for data f, test how close to constant f/gi + gi/f is for each i. With a least squares cost function, this is just computing the variance of that quantity for each i and reporting the smallest. Eyeball the variances at the end and manually inspect unusually poor fits.
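A minimal sketch of that idea in Python (the sizes, timings, and candidate set below are made up for illustration, and the normalization step is my own addition so the variances are comparable across candidates):

import numpy as np

ns = np.array([100, 200, 400, 800, 1600, 3200], dtype=float)
times = np.array([0.8, 1.7, 3.9, 8.2, 17.5, 36.0])  # hypothetical measurements f(n)

candidates = {
    "n": ns,
    "n log n": ns * np.log(ns),
    "n^2": ns ** 2,
}

variances = {}
for name, g in candidates.items():
    r = times / g + g / times  # constant whenever f is a constant multiple of g
    variances[name] = np.var(r / np.mean(r))  # normalize so candidates are comparable

print(variances)
print("best fit:", min(variances, key=variances.get))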
For an empirical analysis of the complexity of the program, you would run (and time) the algorithm given 10, 50, 100, 500, 1000, etc. input elements. You can then graph the results and determine the best-fit function order from the most common basic types: constant, logarithmic, linear, n log n, quadratic, cubic, higher-order polynomial, exponential. This is a normal part of load testing, which verifies first that the algorithm behaves as theorized, and second that it meets real-world performance expectations despite its theoretical complexity (a logarithmic-time algorithm in which each step takes 5 minutes is going to lose all but the absolute highest-cardinality tests to a quadratic-time algorithm in which each step takes a few milliseconds).
EDIT: Breaking it down, the algorithm is very simple:
Define a list, N, of the various cardinalities for which you want to evaluate performance (10, 100, 1000, 10000, etc.).
For each element X in N:
Create a suitable set of test data that has X elements.
Start a stopwatch, or determine and store the current system time.
Run the algorithm over the X-element test set.
Stop the stopwatch, or determine the system time again.
The difference between start and stop times is your algorithm's run time over X elements.
Repeat for each X in N.
Plot the results: given X elements (x-axis), the algorithm takes T time (y-axis). The basic function that most closely tracks the growth of T as X increases is your big-O approximation. As Raphael stated, this approximation is exactly that; it will not give you fine distinctions such as constant coefficients, which could make the difference between an N^2 algorithm and a 2N^2 algorithm (both are technically O(N^2), but given the same number of elements one will run twice as fast).
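A minimal sketch of that loop, with make_test_data and algorithm as placeholders for the real data generator and method under test:

import time

def make_test_data(x):
    # Placeholder: build a suitable X-element input for the algorithm.
    return list(range(x, 0, -1))

def algorithm(data):
    # Placeholder for the method under test.
    return sorted(data)

N = [10, 100, 1000, 10000, 100000]
results = []
for x in N:
    data = make_test_data(x)
    start = time.perf_counter()            # start the stopwatch
    algorithm(data)
    elapsed = time.perf_counter() - start  # stop the stopwatch
    results.append((x, elapsed))
    print(f"{x:>7} elements: {elapsed:.6f} s")
# Plot `results` (X on the x-axis, T on the y-axis) and compare against the basic orders.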
Wanted to share my experiments as well. Nothing new from the theoretical standpoint, but it's a fully functional Python module that can easily be extended.
Main points:
It's based on the SciPy curve_fit function, which allows fitting an arbitrary function to a given set of points by minimizing the sum of squared differences;
Since the tests increase the problem size exponentially, points closer to the start end up carrying a bigger weight, which does not help to identify the correct approximation, so it seems to me that simple linear interpolation to redistribute the points evenly does help;
The set of approximations we are trying to fit is fully under our control; I've added the following ones:
import numpy as np  # needed for the logarithmic candidates below

def fn_linear(x, k, c):
    return k * x + c

def fn_squared(x, k, c):
    return k * x ** 2 + c

def fn_pow3(x, k, c):
    return k * x ** 3 + c

def fn_log(x, k, c):
    return k * np.log10(x) + c

def fn_nlogn(x, k, c):
    return k * x * np.log10(x) + c
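For context, a rough sketch of how those candidates could be compared with scipy.optimize.curve_fit, reusing the fn_* definitions above; the xs/ys measurements here are hypothetical, and the full module in the gist below is more complete:

import numpy as np
from scipy.optimize import curve_fit

xs = np.array([10, 100, 1000, 10000, 100000], dtype=float)
ys = np.array([0.001, 0.012, 0.15, 1.8, 21.0])  # hypothetical timings

# Optionally resample onto an evenly spaced grid first (see the interpolation note above):
# grid = np.linspace(xs.min(), xs.max(), 50); ys = np.interp(grid, xs, ys); xs = grid

candidates = {"linear": fn_linear, "squared": fn_squared, "cubic": fn_pow3,
              "log": fn_log, "nlogn": fn_nlogn}

errors = {}
for name, fn in candidates.items():
    params, _ = curve_fit(fn, xs, ys, maxfev=10000)
    errors[name] = np.sum((ys - fn(xs, *params)) ** 2)  # sum of squared residuals

print(errors)
print("best fit:", min(errors, key=errors.get))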
Here is a fully functional Python module to play with: https://gist.github.com/gubenkoved/d9876ccf3ceb935e81f45c8208931fa4, and some pictures it produces (please note -- 4 graphs per sample with different axis scales).