I want to test which data structure is best by measuring the run-time performance of my algorithm. How do I do that?
For example, I already have a hashmap<
First of all, take a look at my reply to this question; it contains a portable (Windows/Linux) function to get the time in milliseconds.
Next, do something like this:
int64 start_time = GetTimeMs64();
const int NUM_TIMES = 100000; /* Choose this so the loop takes at least half a minute to run */
for (int i = 0; i < NUM_TIMES; ++i) {
    /* Code you want to time goes here. */
}
/* Average time per iteration, in milliseconds. */
double milliseconds = (GetTimeMs64() - start_time) / (double)NUM_TIMES;
All done! (Note that I haven't tried to compile it)
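If you don't want to copy the function from that other answer, std::chrono (C++11) can stand in for it. This is just a sketch of a GetTimeMs64 equivalent I wrote to match the snippet above; it is not the implementation from the linked reply:

#include <chrono>
#include <cstdint>

/* Sketch: portable millisecond clock built on std::chrono.
 * steady_clock is monotonic, which is what you want when timing code. */
int64_t GetTimeMs64()
{
    using namespace std::chrono;
    return duration_cast<milliseconds>(
        steady_clock::now().time_since_epoch()).count();
}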
If you really want to decide between data structures, you need to devise a test that at least approximates how you will use the structure. Some structures perform better for small amounts of data, some perform better for large amounts of data. Some perform better when you do a lot of reads, some perform better when you are doing a lot of inserts/writes. Some perform better with data that is almost all the same, and some perform better with data that is very distinct. Some algorithms perform better if you only do them once, others perform better if you do them a million times, because they have better locality of code, making cache hits more likely. What I mean to say is, good luck :)
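To make that concrete, here is a rough, untested sketch (the names are mine) of a harness that runs the same mixed insert/lookup workload against two candidate containers, so you can vary the element count and the read/write ratio yourself:

#include <chrono>
#include <cstdio>
#include <map>
#include <random>
#include <unordered_map>

/* Times one workload: 'inserts' writes followed by 'lookups' reads.
 * Returns elapsed milliseconds. The container is a template parameter
 * so the identical workload runs against each candidate structure. */
template <typename Map>
double time_workload(int inserts, int lookups)
{
    using namespace std::chrono;
    std::mt19937 rng(12345);                  /* fixed seed: same keys every run */
    std::uniform_int_distribution<int> key(0, inserts - 1);

    auto start = steady_clock::now();

    Map m;
    for (int i = 0; i < inserts; ++i)
        m[key(rng)] = i;                      /* write-heavy part */

    long long checksum = 0;
    for (int i = 0; i < lookups; ++i) {
        auto it = m.find(key(rng));           /* read-heavy part */
        if (it != m.end())
            checksum += it->second;           /* keep the result alive so the
                                                 compiler can't drop the loop */
    }

    auto end = steady_clock::now();
    if (checksum == -1) std::puts("");        /* never true; just observes checksum */
    return duration<double, std::milli>(end - start).count();
}

int main()
{
    const int inserts = 100000, lookups = 1000000;
    std::printf("unordered_map: %.2f ms\n",
                time_workload<std::unordered_map<int, int>>(inserts, lookups));
    std::printf("map:           %.2f ms\n",
                time_workload<std::map<int, int>>(inserts, lookups));
}

Change the inserts/lookups split (and the key distribution) to match how your program actually uses the structure; the ranking can flip between workloads.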
You might consider graphing the data and deriving its (mathematical) function. Run a few trials with, say, 10, 100, 1000, 10000, and 100000 iterations. Using the number of iterations as your x variable and the resulting time as your y variable, plot a graph. From this you can determine the function that describes the performance of the code, using regression (also known as curve fitting in software) - I use Graphical Analysis for this.
Repeat the trials with other data structures/code, run the same graphing procedure, then compare your graphs and functions.
You can also use this data to determine the effective Big O time-complexity for each data structure.
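As a rough, untested sketch of that idea in code (my own, not from Graphical Analysis): time the operation at several sizes and fit the slope of log(time) against log(n). A slope near 1 suggests linear growth, near 2 quadratic, and so on. The timed body here is just a stand-in; swap in your own code:

#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

/* Time one run of whatever you are measuring at problem size n.
 * The body below (summing a vector) is only a placeholder. */
static double time_ms(int n)
{
    using namespace std::chrono;
    std::vector<int> v(n, 1);
    auto start = steady_clock::now();
    long long sum = 0;
    for (int x : v) sum += x;
    auto end = steady_clock::now();
    if (sum < 0) std::puts("");               /* keeps 'sum' observable */
    return duration<double, std::milli>(end - start).count();
}

int main()
{
    const int sizes[] = {10, 100, 1000, 10000, 100000};

    /* Least-squares slope of log(time) vs log(n): an estimate of the
     * exponent k in time ~ n^k, i.e. the effective big-O order. */
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    int m = 0;
    for (int n : sizes) {
        double x = std::log((double)n);
        double y = std::log(time_ms(n) + 1e-9); /* avoid log(0) on tiny times */
        sx += x; sy += y; sxx += x * x; sxy += x * y;
        ++m;
    }
    double slope = (m * sxy - sx * sy) / (m * sxx - sx * sx);
    std::printf("estimated exponent: %.2f\n", slope);
}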
You would adjust the program to perform thousands or millions of hashmap lookups, ideally chosen randomly, and time that.
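For example (a sketch, not tested): generate the random keys up front so the random-number generation isn't part of what you measure, then time only the lookup loop:

#include <chrono>
#include <cstdio>
#include <random>
#include <unordered_map>
#include <vector>

int main()
{
    const int num_keys = 100000;
    const int num_lookups = 5000000;

    std::unordered_map<int, int> table;
    for (int i = 0; i < num_keys; ++i)
        table[i] = i * 2;

    /* Pick the lookup keys ahead of time, so the RNG cost is not timed. */
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> pick(0, num_keys - 1);
    std::vector<int> keys(num_lookups);
    for (int& k : keys) k = pick(rng);

    using namespace std::chrono;
    auto start = steady_clock::now();
    long long checksum = 0;
    for (int k : keys)
        checksum += table.find(k)->second;    /* every key exists in the table */
    auto end = steady_clock::now();

    std::printf("%d lookups took %.2f ms (checksum %lld)\n",
                num_lookups,
                duration<double, std::milli>(end - start).count(),
                checksum);
}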