I'm not aware of any software for this, or previous work done on it. And, fundamentally, I don't think you can get answers of the form "O(whatever)" that are trustworthy. Your measurements are noisy, you might be trying to distinguish n*log(n) operations from n*sqrt(n) operations, and unlike a nice clean mathematical analysis, all of the dropped constants are still floating around messing with you.
That said, the process I would go through if I wanted to come up with a best estimate:
- Making sure to record as much information as possible along the way, I'd run the thing I want to measure on as many inputs (and input sizes) as I could before I got bored, probably overnight, taking repeated measurements for each input and size (see the first sketch after this list).
- Shovel the resulting input-size-versus-time data into a trial copy of Eureqa and see what pops out.
- If I'm not satisfied, get more data, continue to shovel it into Eureqa and see if the situation is improving.
- Assuming Eureqa doesn't give an answer I like before I get bored of it consuming all of my CPU time and power, I'd switch over to Bayesian methods.
- Using something like pymc, I'd attempt to model the data with a bunch of likely-looking complexity functions (n, n^2, n^3, n*log(n), n^2*log(n), n^2*log(n)^2, etc., etc.); see the second sketch after this list.
- Compare the DIC (deviance information criterion; smaller is better) of each model, looking for the best few.
- Plot the best few, look for spots where data and model disagree.
- Collect more data near disagreements. Recompute the models.
- Repeat the modelling, comparison, plotting, and extra-data-collection steps until bored.
- Finally, collect some new data points at larger input sizes and see which model(s) best predict those points (see the last sketch after this list).
- Choose to believe that one of them is true enough.
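To make the measurement step concrete, here's a minimal sketch of the kind of harness I mean, assuming the routine under test is a placeholder called `work(n)` (a name I've made up) and that dumping (size, seconds) pairs to a CSV is good enough for the later analysis:

```python
import csv
import random
import time

def work(n):
    # Stand-in for the routine being measured; swap in the real thing.
    return sorted(random.random() for _ in range(n))

def measure(sizes, repeats=20, path="timings.csv"):
    """Record one (n, seconds) row per repetition, for each input size."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["n", "seconds"])
        for n in sizes:
            for _ in range(repeats):
                start = time.perf_counter()
                work(n)
                writer.writerow([n, time.perf_counter() - start])

if __name__ == "__main__":
    # Spread sizes over a few orders of magnitude; leave it running overnight.
    measure([2 ** k for k in range(8, 18)])
```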
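For the Bayesian modelling step, something along these lines is what I'd start from. A caveat: current PyMC doesn't report DIC directly, so this sketch leans on ArviZ's `az.compare` (LOO/WAIC by default) as the "smaller/better" model-comparison stand-in; the candidate list, the priors, and the `fit_models` helper are my own choices, not anything canonical.

```python
import numpy as np
import pymc as pm
import arviz as az

# Candidate complexity shapes; each maps input size n -> expected growth.
CANDIDATES = {
    "n":          lambda n: n,
    "n*log(n)":   lambda n: n * np.log(n),
    "n^2":        lambda n: n ** 2,
    "n^2*log(n)": lambda n: n ** 2 * np.log(n),
}

def fit_models(sizes, seconds):
    """Fit seconds ~ a*f(n) + b + noise for each candidate f; return traces."""
    sizes = np.asarray(sizes, dtype=float)
    seconds = np.asarray(seconds, dtype=float)
    traces = {}
    for name, f in CANDIDATES.items():
        x = f(sizes)
        x = x / x.max()  # rescale so one set of priors works for every shape
        with pm.Model():
            a = pm.HalfNormal("a", sigma=10)
            b = pm.HalfNormal("b", sigma=10)
            noise = pm.HalfNormal("noise", sigma=1)
            pm.Normal("t", mu=a * x + b, sigma=noise, observed=seconds)
            traces[name] = pm.sample(
                1000, tune=1000, progressbar=False,
                idata_kwargs={"log_likelihood": True},
            )
    return traces

# Rank the candidates; the best few are the ones worth plotting.
# print(az.compare(fit_models(sizes, seconds)))
```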
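And for the final check at larger input sizes, a cheap stand-in that skips the full posterior machinery: fit `a*f(n) + b` by least squares on the data you already have, then score each candidate on the new, larger-size measurements. The helper name is mine; the idea is just "whichever shape extrapolates best wins".

```python
import numpy as np

def extrapolation_error(train_sizes, train_times, test_sizes, test_times, f):
    """Fit a*f(n)+b on the existing data, then return the RMS prediction
    error on held-out measurements taken at larger sizes, for one candidate f."""
    X = np.column_stack([f(np.asarray(train_sizes, float)),
                         np.ones(len(train_sizes))])
    coef, *_ = np.linalg.lstsq(X, np.asarray(train_times, float), rcond=None)
    pred = coef[0] * f(np.asarray(test_sizes, float)) + coef[1]
    return np.sqrt(np.mean((pred - np.asarray(test_times, float)) ** 2))

# Example: compare two shapes on the held-out large-size data.
# err_nlogn = extrapolation_error(ns, ts, big_ns, big_ts, lambda n: n * np.log(n))
# err_n2    = extrapolation_error(ns, ts, big_ns, big_ts, lambda n: n ** 2)
```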