This question on getting random values from a finite set got me thinking...
It's fairly common for people to want to retrieve X unique values from a set of Y values.
Most people forget that checking whether a number has already been drawn also takes time.
The number of tries necessary can, as described earlier, be evaluated from:
T(n,m) = n(H(n)-H(n-m)) ⪅ n(ln(n)-ln(n-m))
which goes to n*ln(n) for interesting values of m (m close to n).
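To get a feel for the numbers, here is a minimal Python sketch of the formula above. The function names `harmonic`, `expected_tries` and `log_approx` are made up for illustration, and H(k) is assumed to be the k-th harmonic number with H(0) = 0.

```python
import math


def harmonic(k: int) -> float:
    """k-th harmonic number H(k) = 1 + 1/2 + ... + 1/k, with H(0) = 0."""
    return sum(1.0 / i for i in range(1, k + 1))


def expected_tries(n: int, m: int) -> float:
    """Expected number of draws T(n, m) = n * (H(n) - H(n - m))."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    return n * (harmonic(n) - harmonic(n - m))


def log_approx(n: int, m: int) -> float:
    """Logarithmic estimate n * (ln(n) - ln(n - m)); only defined for m < n."""
    if not 0 <= m < n:
        raise ValueError("need 0 <= m < n")
    return n * (math.log(n) - math.log(n - m))


if __name__ == "__main__":
    n = 1000
    for m in (10, 500, 900, 999):
        print(m, expected_tries(n, m), log_approx(n, m))
```

The logarithmic estimate sits slightly above the exact value, and it cannot be evaluated at m = n, where the exact value is simply n*H(n).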
However, for each of these 'tries' you will have to do a lookup. This might be a simple O(n) run-through, or something like a binary tree with O(ln(n)) lookups. This gives you a total performance of n^2*ln(n) or n*(ln(n))^2, respectively.
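The retry loop itself might look like the sketch below. It is only an illustration, not a reference implementation: `pick_unique` is a hypothetical name, and a plain Python list stands in for the simple O(n) run-through lookup.

```python
import random


def pick_unique(n: int, m: int, rng: random.Random):
    """Draw m distinct values from range(n), retrying on duplicates.

    Returns the chosen values and the number of tries that were needed.
    """
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    chosen = []  # list membership test is a linear scan
    tries = 0
    while len(chosen) < m:
        tries += 1
        candidate = rng.randrange(n)
        if candidate not in chosen:  # the lookup paid on every try
            chosen.append(candidate)
    return chosen, tries


if __name__ == "__main__":
    values, tries = pick_unique(1000, 900, random.Random())
    print(len(values), tries)
```

Swapping the list for a `set` makes each lookup cheap on average, but that is a hash table rather than the binary tree mentioned above, so its cost profile is different again.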
For smaller values of m (m < n/2), you can get a very good approximation for T(n,m) using the HA inequality (harmonic mean ≤ arithmetic mean), yielding the formula:
2*m*n/(2*n-m+1)
As m goes to n, this gives a lower bound of O(n) tries (about 2n) and a total performance of O(n^2) or O(n*ln(n)).
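A quick way to see how tight the bound is: the sketch below compares 2*m*n/(2*n-m+1) with the exact expectation, written here as the equivalent sum of n/(n-k) for k = 0..m-1. Both function names are hypothetical.

```python
def ha_lower_bound(n: int, m: int) -> float:
    """Lower bound on the expected tries: 2*m*n / (2*n - m + 1)."""
    return 2 * m * n / (2 * n - m + 1)


def exact_tries(n: int, m: int) -> float:
    """Exact expectation n * (H(n) - H(n - m)), written as a direct sum."""
    return sum(n / (n - k) for k in range(m))


if __name__ == "__main__":
    n = 1000
    for m in (10, 250, 500, 900, 1000):
        bound = ha_lower_bound(n, m)
        exact = exact_tries(n, m)
        print(m, bound, exact, bound / exact)
```

The ratio printed in the last column shows where the bound stops being a usable estimate as m approaches n.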
All the results are, however, far better than I would ever have expected, which shows that the algorithm might actually be just fine in many non-critical cases where you can accept occasional longer running times (when you are unlucky).