I am in need of a Python (2.7) object that functions like a set (fast insertion, deletion, and membership checking) but has the ability to return a random value. Previous qu
Here's a solution from scratch, which adds and pops in constant time. I also included some extra set functions for demonstrative purposes.
from random import randint
class RandomSet(object):
"""
Implements a set in which elements can be
added and drawn uniformly and randomly in
constant time.
"""
def __init__(self, seq=None):
self.dict = {}
self.list = []
if seq is not None:
for x in seq:
self.add(x)
def add(self, x):
if x not in self.dict:
self.dict[x] = len(self.list)
self.list.append(x)
def pop(self, x=None):
if x is None:
i = randint(0,len(self.list)-1)
x = self.list[i]
else:
i = self.dict[x]
self.list[i] = self.list[-1]
self.dict[self.list[-1]] = i
self.list.pop()
self.dict.pop(x)
return x
def __contains__(self, x):
return x in self.dict
def __iter__(self):
return iter(self.list)
def __repr__(self):
return "{" + ", ".join(str(x) for x in self.list) + "}"
def __len__(self):
return len(self.list)
Yes, I'd implement an "ordered set" in much the same way you did - and use a list as an internal data structure.
However, I'd inherit straight from "set" and just keep track of the added items in an internal list (as you did) - and leave the methods I don't use alone.
Maybe add a "sync" method to update the internal list whenever the set is updated by set-specific operations, like the *_update methods.
That if using an "ordered dict" does not cover your use cases. (I just found that trying to cast ordered_dict keys to a regular set is not optmized, so if you need set operations on your data that is not an option)
If you don't mind only supporting comparable elements, then you could use blist.sortedset
.
Can't we implement a new class inheriting from set
with some (hackish) modifications that enable us to retrieve a random element from the list with O(1) lookup time? Btw, on Python 2.x you should inherit from object
, i.e. use class randomSet(object)
. Also PEP8 is something to consider for you :-)
Edit: For getting some ideas of what hackish solutions might be capable of, this thread is worth reading: http://python.6.n6.nabble.com/Get-item-from-set-td1530758.html
I think the best way to do this would be to use the MutableSet abstract base class in collections
. Inherit from MutableSet
, and then define add
, discard
, __len__,
__iter__
, and __contains__
; also rewrite __init__
to optionally accept a sequence, just like the set
constructor does. MutableSet
provides built-in definitions of all other set
methods based on those methods. That way you get the full set
interface cheaply. (And if you do this, addIterable
is defined for you, under the name extend
.)
discard
in the standard set
interface appears to be what you have called delete
here. So rename delete
to discard
. Also, instead of having a separate popRandom
method, you could just define popRandom
like so:
def popRandom(self):
item = self.getRandom()
self.discard(item)
return item
That way you don't have to maintain two separate item removal methods.
Finally, in your item removal method (delete
now, discard
according to the standard set interface), you don't need an if statement. Instead of testing whether index == len(self.list) - 1
, simply swap the final item in the list with the item at the index of the list to be popped, and make the necessary change to the reverse-indexing dictionary. Then pop the last item from the list and remove it from the dictionary. This works whether index == len(self.list) - 1
or not:
def discard(self, item):
if item in self.dict:
index = self.dict[item]
self.list[index], self.list[-1] = self.list[-1], self.list[index]
self.dict[self.list[index]] = index
del self.list[-1] # or in one line:
del self.dict[item] # del self.dict[self.list.pop()]
One approach you could take is to derive a new class from set
which salts itself with random objects of a type derived from int
.
You can then use pop
to select a random element, and if it is not of the salt type, reinsert and return it, but if it is of the salt type, insert a new, randomly-generated salt object (and pop to select a new object).
This will tend to alter the order in which objects are selected. On average, the number of attempts will depend on the proportion of salting elements, i.e. amortised O(k) performance.