Select random element from a set, faster than linear time (Haskell)

清酒与你 2021-01-17 11:38

I'd like to create this function, which selects a random element from a Set:

randElem :: (RandomGen g) => Set a -> g -> (a, g)

Si

8 Answers
  • 2021-01-17 12:11

    Data.Map has an indexing function (elemAt), so use this:

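    import qualified Data.Map as M
    import Data.Map(member, size, empty)
    import System.Random
    
    -- A set is represented as a map from elements to unit.
    type Set a = M.Map a ()
    
    insert :: (Ord a) => a -> Set a -> Set a
    insert a = M.insert a ()
    
    fromList :: Ord a => [a] -> Set a
    fromList = M.fromList . flip zip (repeat ())
    
    -- Element at a zero-based index in ascending order, in O(log n).
    elemAt :: Int -> Set a -> a
    elemAt i = fst . M.elemAt i
    
    -- Picks a uniformly random index, then looks it up.
    -- Note: this calls error on an empty set, since there is no valid index.
    randElem :: (RandomGen g) => Set a -> g -> (a, g)
    randElem s g = (elemAt n s, g')
        where (n, g') = randomR (0, size s - 1) g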
    

    This gives you something quite compatible with Data.Set (in both interface and performance) that also has an O(log n) indexing function and the randElem function you requested.

    Note that randElem is O(log n) (and it's probably the fastest implementation you can get with this complexity), and all the other functions have the same complexity as in Data.Set. Let me know if you need any other specific functions from the Set API and I will add them.

  • 2021-01-17 12:11

    If you had access to the internals of Data.Set, which is just a size-balanced binary tree, you could recurse over the tree, at each node selecting one of the branches (or the node's own element) with probability proportional to its size. This is quite straightforward and performs very well in terms of memory management and allocations, as there is no extra bookkeeping to do. On the other hand, you have to invoke the RNG O(log n) times.
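
    The descent described above can be sketched roughly as follows. Data.Set does not export its constructors, so this uses a hypothetical size-annotated `Tree` type as a stand-in; `fromSortedList` and `randElemTree` are names made up for this illustration, and the result is wrapped in `Maybe` to cover the empty tree (an assumption, not part of the question's signature).

```haskell
import System.Random (RandomGen, randomR, mkStdGen)  -- mkStdGen only for trying it out

-- Hypothetical stand-in for Data.Set's internal tree: every node caches
-- the number of elements in the subtree rooted at it.
data Tree a = Tip | Bin Int (Tree a) a (Tree a)

size :: Tree a -> Int
size Tip           = 0
size (Bin n _ _ _) = n

-- Demo helper: build a balanced tree from an already sorted list.
fromSortedList :: [a] -> Tree a
fromSortedList [] = Tip
fromSortedList xs = Bin (length xs) (fromSortedList ls) x (fromSortedList rs)
  where (ls, x : rs) = splitAt (length xs `div` 2) xs

-- At each node, draw k from [0, n - 1]: go left with probability
-- size l / n, stop at the node with probability 1 / n, go right otherwise.
-- One RNG call per level, so O(depth) calls in total.
randElemTree :: RandomGen g => Tree a -> g -> Maybe (a, g)
randElemTree Tip _ = Nothing
randElemTree (Bin n l x r) g
  | k < size l  = randElemTree l g'
  | k == size l = Just (x, g')
  | otherwise   = randElemTree r g'
  where (k, g') = randomR (0, n - 1) g
```

    For example, `fst <$> randElemTree (fromSortedList "abcde") (mkStdGen 42)` yields one of the five characters. Each element ends up with probability 1/n, because the chance of entering a subtree is proportional to its size.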

    A variant is to use Jonas’ suggestion: first take the size, select the index of the random element based on that, and then look it up with an elemAt function on Data.Set (which had yet to be added at the time of writing).
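
    A minimal sketch of that variant, assuming a containers version in which Data.Set exports `elemAt` (as far as I know, more recent releases do). Returning `Maybe` for the empty set is my own choice here and differs from the signature in the question.

```haskell
import qualified Data.Set as Set
import System.Random (RandomGen, randomR, mkStdGen)  -- mkStdGen only for trying it out

-- Pick a uniformly random index, then look it up with Set.elemAt.
-- This needs a single RNG call and one O(log n) lookup.
randElem :: RandomGen g => Set.Set a -> g -> Maybe (a, g)
randElem s g
  | Set.null s = Nothing                      -- no valid index exists
  | otherwise  = Just (Set.elemAt n s, g')
  where (n, g') = randomR (0, Set.size s - 1) g  -- lazy: only forced when non-empty
```

    Compared with descending the tree by hand, this calls the RNG once instead of once per level, at the cost of depending on the indexing function being available.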
