What is the big-O complexity of naive random selection from a finite set?

终归单人心 2021-02-05 17:25

This question on getting random values from a finite set got me thinking...

It's fairly common for people to want to retrieve X unique values from a set of Y values.

8 answers
  •  [愿得一人]
    2021-02-05 18:13

    If you're willing to make the assumption that your random number generator will always find a unique value before cycling back to a previously seen value for a given draw, this algorithm is O(m^2), where m is the number of unique values you are drawing.

    So, if you are drawing m values from a set of n values, the 1st value takes at most 1 draw. The 2nd takes at most 2 (you see the 1st value again, then a new one), the 3rd at most 3, ..., and the mth at most m. In total you need at most 1 + 2 + 3 + ... + m = [m*(m+1)]/2 = (m^2 + m)/2 draws. This is O(m^2).
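
The count above refers to a draw-until-unique loop. The question's own code is not shown here, so the Python sketch below is only an assumed version of that naive approach. The name `naive_pick` and the `draws` counter are hypothetical, added to make the counting visible.

```python
import random

def naive_pick(values, m, rng=random):
    """Pick m distinct items by redrawing whenever a repeat comes up.

    Returns (picked, draws), where draws counts every call to rng.choice.
    Assumes values is a sequence of hashable items with no duplicates.
    """
    if m > len(values):
        raise ValueError("cannot pick more unique items than the set holds")
    picked = []
    seen = set()
    draws = 0
    while len(picked) < m:
        candidate = rng.choice(values)
        draws += 1
        if candidate not in seen:  # a repeat is discarded and we draw again
            seen.add(candidate)
            picked.append(candidate)
    return picked, draws
```

Calling `naive_pick(list(range(10)), 4)` returns four distinct values together with the number of draws it took, which is at least 4 and has no fixed upper bound unless the assumption in this answer is made.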

    Without this assumption, I'm not sure how you can even guarantee the algorithm will complete. It's quite possible (especially with a pseudo-random number generator, which may have a cycle) that you will keep seeing the same values over and over and never reach another unique value.
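
If guaranteed termination matters, one common alternative is a partial Fisher-Yates shuffle. This is a different technique from the naive loop analysed in this answer, shown only for contrast: it makes exactly m draws no matter what the generator returns. A minimal sketch, with the hypothetical name `pick_by_partial_shuffle`:

```python
import random

def pick_by_partial_shuffle(values, m, rng=random):
    """Pick m distinct items using exactly m random draws."""
    pool = list(values)  # copy so the caller's data is left untouched
    if m > len(pool):
        raise ValueError("cannot pick more unique items than the set holds")
    for i in range(m):
        # Choose an index from the part of the pool not yet selected,
        # then swap that item into position i.
        j = rng.randrange(i, len(pool))
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:m]
```

The trade-off is the O(n) copy of the input. In Python, the standard library's `random.sample(population, k)` covers the same use case.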

    ==EDIT==

    For the average case:

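    For your first value, you will make exactly 1 draw. For your 2nd value, you expect to make roughly 1 (the successful draw) + 1/n (the "partial" draw which represents your chance of drawing a repeat). For your 3rd value, you expect to make roughly 1 (the successful draw) + 2/n (the "partial" draw...). ... For your mth value, you expect to make roughly 1 + (m-1)/n draws. These figures allow for at most one repeat per value, which is a close approximation as long as the number of values already drawn is small relative to n.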

    Thus, you will make 1 + (1 + 1/n) + (1 + 2/n) + ... + (1 + (m-1)/n) draws altogether in the average case.

    This equals the sum from i=0 to (m-1) of [1 + i/n]. Let's denote that sum(1 + i/n, i, 0, m-1).

    Then:

    sum(1 + i/n, i, 0, m-1) = sum(1, i, 0, m-1) + sum(i/n, i, 0, m-1)
                            = m + sum(i/n, i, 0, m-1)
                            = m + (1/n) * sum(i, i, 0, m-1)
                            = m + (1/n)*[(m-1)*m]/2
                            = (m^2)/(2n) - (m)/(2n) + m 
    

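    Since m can be at most n, the (m^2)/(2n) term is never larger than m/2, so the linear term dominates and this estimate is O(m), where m is the number to be drawn and n is the size of the list. (When m is close to n the true average is higher, because several repeats in a row become likely.)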
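
The closed form m + m(m-1)/(2n) can be checked numerically. The sketch below compares it with a simulation of the draw-until-unique loop and with the exact expectation, which is the sum of n/(n-i) for i from 0 to m-1 (each value takes a geometrically distributed number of draws with success probability (n-i)/n). All function names here are hypothetical and the simulation is only illustrative.

```python
import random

def estimate_draws(m, n):
    """The closed form derived above: m + m(m-1)/(2n)."""
    return m + m * (m - 1) / (2 * n)

def exact_expected_draws(m, n):
    """Exact expected draws: sum of n/(n-i) for i = 0..m-1 (requires m <= n)."""
    return sum(n / (n - i) for i in range(m))

def simulate_draws(m, n, trials=10_000, seed=0):
    """Average number of draws needed to see m distinct values out of n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        while len(seen) < m:
            seen.add(rng.randrange(n))  # adding a repeat leaves the set unchanged
            total += 1
    return total / trials
```

For small m relative to n the three numbers should sit close together. As m approaches n, the exact value and the simulation pull away from the closed form, since drawing all n values is the coupon collector's problem and takes about n ln n draws on average.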
