This question on getting random values from a finite set got me thinking...
It's fairly common for people to want to retrieve X unique values from a set of Y values.
The worst case for this algorithm is clearly when you're choosing the full set, i.e. all N of N items (X = Y = N). This is equivalent to asking: on average, how many times must I roll an N-sided die before each side has come up at least once?
Answer: N * H_N, where H_N is the Nth harmonic number, a value famously approximated by log(N).
This means the algorithm in question takes O(N log N) draws on average.
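To make the formula concrete, here is a minimal sketch that computes N times the Nth harmonic number exactly. The function name `expected_draws` is my own, and it assumes each draw is uniform and independent (i.e. with replacement), which is what the die analogy describes.

```python
from fractions import Fraction


def expected_draws(n: int) -> Fraction:
    """Expected number of uniform draws (with replacement) needed to see
    all n distinct values at least once: n times the nth harmonic number."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # nth harmonic number: 1/1 + 1/2 + ... + 1/n, kept exact with Fraction
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return n * harmonic


if __name__ == "__main__":
    for n in (1, 2, 6, 52):
        print(n, float(expected_draws(n)))
```

Using `Fraction` keeps the result exact for small N; for large N a float sum (or the log approximation) would be the more practical choice.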
As a fun example, if you roll an ordinary 6-sided die until you see one of each number, it will take on average 6 * H_6 = 14.7 rolls.
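If you'd rather check the die example empirically, a rough simulation along these lines should land close to 14.7. The helper names (`rolls_until_all_seen`, `average_rolls`), the trial count, and the seed are arbitrary choices of mine, not part of the original question.

```python
import random


def rolls_until_all_seen(sides: int, rng: random.Random) -> int:
    """Roll a fair die with the given number of sides until every face
    has appeared at least once; return how many rolls that took."""
    seen = set()
    rolls = 0
    while len(seen) < sides:
        seen.add(rng.randrange(sides))  # faces numbered 0 .. sides-1
        rolls += 1
    return rolls


def average_rolls(sides: int, trials: int, seed: int = 0) -> float:
    """Average number of rolls over many independent trials."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    total = sum(rolls_until_all_seen(sides, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print(average_rolls(6, 20_000))
```

The average will wobble from seed to seed, since any single trial can take far more rolls than the mean.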