I tried using random.randint(0, 100)
, but some numbers were the same. Is there a method/module to create a list unique random numbers?
Note: The fol
import random
sourcelist=[]
resultlist=[]
for x in range(100):
sourcelist.append(x)
for y in sourcelist:
resultlist.insert(random.randint(0,len(resultlist)),y)
print (resultlist)
If the list of N numbers from 1 to N is randomly generated, then yes, there is a possibility that some numbers may be repeated.
If you want a list of numbers from 1 to N in a random order, fill an array with integers from 1 to N, and then use a Fisher-Yates shuffle or Python's random.shuffle().
You can use Numpy library for quick answer as shown below -
Given code snippet lists down 6 unique numbers between the range of 0 to 5. You can adjust the parameters for your comfort.
import numpy as np
import random
a = np.linspace( 0, 5, 6 )
random.shuffle(a)
print(a)
Output
[ 2. 1. 5. 3. 4. 0.]
It doesn't put any constraints as we see in random.sample as referred here.
Hope this helps a bit.
The answer provided here works very well with respect to time as well as memory but a bit more complicated as it uses advanced python constructs such as yield. The simpler answer works well in practice but, the issue with that answer is that it may generate many spurious integers before actually constructing the required set. Try it out with populationSize = 1000, sampleSize = 999. In theory, there is a chance that it doesn't terminate.
The answer below addresses both issues, as it is deterministic and somewhat efficient though currently not as efficient as the other two.
def randomSample(populationSize, sampleSize):
populationStr = str(populationSize)
dTree, samples = {}, []
for i in range(sampleSize):
val, dTree = getElem(populationStr, dTree, '')
samples.append(int(val))
return samples, dTree
where the functions getElem, percolateUp are as defined below
import random
def getElem(populationStr, dTree, key):
msd = int(populationStr[0])
if not key in dTree.keys():
dTree[key] = range(msd + 1)
idx = random.randint(0, len(dTree[key]) - 1)
key = key + str(dTree[key][idx])
if len(populationStr) == 1:
dTree[key[:-1]].pop(idx)
return key, (percolateUp(dTree, key[:-1]))
newPopulation = populationStr[1:]
if int(key[-1]) != msd:
newPopulation = str(10**(len(newPopulation)) - 1)
return getElem(newPopulation, dTree, key)
def percolateUp(dTree, key):
while (dTree[key] == []):
dTree[key[:-1]].remove( int(key[-1]) )
key = key[:-1]
return dTree
Finally, the timing on average was about 15ms for a large value of n as shown below,
In [3]: n = 10000000000000000000000000000000
In [4]: %time l,t = randomSample(n, 5)
Wall time: 15 ms
In [5]: l
Out[5]:
[10000000000000000000000000000000L,
5731058186417515132221063394952L,
85813091721736310254927217189L,
6349042316505875821781301073204L,
2356846126709988590164624736328L]
The problem with the set based approaches ("if random value in return values, try again") is that their runtime is undetermined due to collisions (which require another "try again" iteration), especially when a large amount of random values are returned from the range.
An alternative that isn't prone to this non-deterministic runtime is the following:
import bisect
import random
def fast_sample(low, high, num):
""" Samples :param num: integer numbers in range of
[:param low:, :param high:) without replacement
by maintaining a list of ranges of values that
are permitted.
This list of ranges is used to map a random number
of a contiguous a range (`r_n`) to a permissible
number `r` (from `ranges`).
"""
ranges = [high]
high_ = high - 1
while len(ranges) - 1 < num:
# generate a random number from an ever decreasing
# contiguous range (which we'll map to the true
# random number).
# consider an example with low=0, high=10,
# part way through this loop with:
#
# ranges = [0, 2, 3, 7, 9, 10]
#
# r_n :-> r
# 0 :-> 1
# 1 :-> 4
# 2 :-> 5
# 3 :-> 6
# 4 :-> 8
r_n = random.randint(low, high_)
range_index = bisect.bisect_left(ranges, r_n)
r = r_n + range_index
for i in xrange(range_index, len(ranges)):
if ranges[i] <= r:
# as many "gaps" we iterate over, as much
# is the true random value (`r`) shifted.
r = r_n + i + 1
elif ranges[i] > r_n:
break
# mark `r` as another "gap" of the original
# [low, high) range.
ranges.insert(i, r)
# Fewer values possible.
high_ -= 1
# `ranges` happens to contain the result.
return ranges[:-1]
If you wish to ensure that the numbers being added are unique, you could use a Set object
if using 2.7 or greater, or import the sets module if not.
As others have mentioned, this means the numbers are not truly random.