A weighted version of random.choice

闹比i 2020-11-21 06:29

I needed to write a weighted version of random.choice (each element in the list has a different probability for being selected). This is what I came up with:



        
25 answers
  •  花落未央
    2020-11-21 07:01

    It depends on how many times you want to sample the distribution.

    Suppose you want to sample the distribution K times. Then the time complexity of calling np.random.choice() once per sample is O(K(n + log(n))), where n is the number of items in the distribution.
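
    To make that cost concrete, here is a small sketch. The sizes and the names `one_at_a_time` and `batched` are illustrative assumptions, not from the original post. As far as I know, every call to np.random.choice() with `p` has to process the whole probability vector again, so K separate calls repeat the O(n) work, while a single call with `size=K` should only need to process it once.

```python
import numpy as np

n, k = 10**4, 100  # smaller sizes than in the post, chosen only for illustration

a = np.arange(1, n + 1)      # items to sample from
p = np.full(n, 1.0 / n)      # uniform probabilities that sum to 1

# K separate calls: the probability vector is reprocessed on every call.
one_at_a_time = np.array([np.random.choice(a, p=p) for _ in range(k)])

# One call with size=k: the probability vector is handed over only once.
batched = np.random.choice(a, size=k, p=p)
```

    If all K samples are needed at the same time, the batched call may be enough on its own. Precomputing the cumulative distribution yourself matters mostly when the samples are requested one by one over time.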

    In my case, I needed to sample the same distribution on the order of 10^3 times, where n is on the order of 10^6. I used the code below, which precomputes the cumulative distribution once in O(n) and then draws each sample in O(log(n)). The overall time complexity is O(n + K*log(n)).

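    import numpy as np
    
    n, k = 10**6, 10**3
    
    # Create a dummy distribution: n items, each with probability 1/n
    a = np.arange(1, n + 1)
    p = np.full(n, 1.0 / n)
    
    # Precompute the cumulative distribution once: O(n)
    cdf = p.cumsum()
    cdf /= cdf[-1]  # guard against rounding so the last entry is exactly 1.0
    for _ in range(k):
        x = np.random.uniform()                  # uniform draw in [0, 1)
        idx = cdf.searchsorted(x, side='right')  # binary search: O(log(n))
        sampled_element = a[idx]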
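
    The same idea can be wrapped in a reusable function and vectorized, so that all K uniform draws are searched in one call instead of a Python loop. This is only a sketch: the function name `sample_from_cdf`, its parameters, and the example weights are hypothetical, and scaling the uniform draws by `cdf[-1]` is a choice made here so that unnormalized weights also work.

```python
import numpy as np

def sample_from_cdf(values, cdf, size, rng):
    """Draw `size` items from `values` using a precomputed cumulative sum of weights."""
    # Uniform draws in [0, cdf[-1]); scaling avoids having to normalize the weights.
    u = rng.random(size) * cdf[-1]
    # For each draw, find the first position whose cumulative weight exceeds it.
    idx = np.searchsorted(cdf, u, side="right")
    # Defensive clamp in case rounding ever pushes an index past the last item.
    idx = np.minimum(idx, len(values) - 1)
    return values[idx]

rng = np.random.default_rng(0)
values = np.array(["a", "b", "c"])
weights = np.array([1.0, 0.0, 3.0])   # "b" has zero weight and should never be drawn

cdf = np.cumsum(weights)              # computed once: O(n)
samples = sample_from_cdf(values, cdf, 1000, rng)  # O(K*log(n)) for K draws
```

    With `side="right"`, a draw that lands exactly on a cumulative boundary moves on to the next item, which is why the zero-weight entry is skipped.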
    
