How do I create a list of random numbers without duplicates?

灰色年华 2020-11-22 13:30

I tried using random.randint(0, 100), but some numbers were the same. Is there a method or module that creates a list of unique random numbers?

Note: The fol

17 answers
  • 2020-11-22 13:37

    You can first create a list of the numbers from a to b, where a and b are respectively the smallest and greatest numbers you want, then shuffle it with the Fisher-Yates algorithm or with Python's random.shuffle.
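
    As a minimal sketch of that idea, the following shuffles the numbers a..b by hand with Fisher-Yates. The function name fisher_yates_shuffle and the bounds 10 and 20 are made up for illustration; in practice random.shuffle does the same job.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Shuffle a list in place with the Fisher-Yates algorithm."""
    # Walk from the last index down to 1, swapping each element with a
    # uniformly chosen element at or before it.
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return items

a, b = 10, 20  # hypothetical bounds, both included
numbers = list(range(a, b + 1))
fisher_yates_shuffle(numbers)
first_five = numbers[:5]  # five distinct values taken from a..b
```

    Every value appears exactly once in the shuffled list, so any prefix of it is a set of unique random numbers.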

  • 2020-11-22 13:37

    Linear Congruential Pseudo-random Number Generator

    O(1) Memory

    O(k) Operations

    This problem can be solved with a simple Linear Congruential Generator (LCG). It needs constant memory (8 integers) and at most 2*(sequence length) computations.

    The other solutions here use more memory and more compute. If you only need a few random sequences, this method will be significantly cheaper. For ranges of size N, if you want to generate on the order of N unique k-sequences or more, I recommend the accepted solution, the builtin random.sample(range(N), k), as it has been optimized in Python for speed.

    Code

    # Return a randomized "range" using a Linear Congruential Generator
    # to produce the number sequence. Parameters are the same as for
    # Python's builtin "range".
    #   Memory  -- storage for 8 integers, regardless of parameters.
    #   Compute -- at most 2*"maximum" steps required to generate sequence.
    #
    import random
    
    def random_range(start, stop=None, step=None):
        # Set default values the same way "range" does.
        if stop is None: start, stop = 0, start
        if step is None: step = 1
        # Compute the number of numbers in this range, rounding up so that
        # a partial last step (as in range(0, 10, 3)) is still counted.
        maximum = max(0, -(-(stop - start) // step))
        # An empty range yields nothing.
        if maximum == 0: return
        # Seed range with a random integer below "maximum".
        value = random.randrange(maximum)
        # 
        # Construct an offset, multiplier, and modulus for a linear
        # congruential generator. These generators are cyclic and
        # non-repeating when they maintain the properties:
        # 
        #   1) "modulus" and "offset" are relatively prime.
        #   2) ["multiplier" - 1] is divisible by all prime factors of "modulus".
        #   3) ["multiplier" - 1] is divisible by 4 if "modulus" is divisible by 4.
        # 
        offset = random.randint(0,maximum) * 2 + 1      # Pick a random odd-valued offset.
        multiplier = 4*(maximum//4) + 1                 # Pick a multiplier 1 greater than a multiple of 4.
        modulus = 1 << (maximum - 1).bit_length()       # Smallest power of 2 that is >= maximum (exact integer math).
        # Track how many random numbers have been returned.
        found = 0
        while found < maximum:
            # If this is a valid value, yield it in generator fashion.
            if value < maximum:
                found += 1
                # Map the position in the standard range onto the desired range.
                yield start + value*step
            # Calculate the next value in the sequence.
            value = (value*multiplier + offset) % modulus
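
    The three properties listed in the comments are what make the generator visit every value below the modulus exactly once before repeating. A small sketch to check that claim on one concrete case; the helper name lcg_cycle and the parameter values are chosen here for illustration, following the rules random_range uses for a range of 13 numbers:

```python
def lcg_cycle(multiplier, offset, modulus, seed=0):
    """Return the values an LCG visits before it comes back to its seed."""
    start = seed % modulus
    values = []
    value = start
    while True:
        values.append(value)
        value = (value * multiplier + offset) % modulus
        if value == start:
            return values

# Odd offset, multiplier = 4*(13//4) + 1 = 13, and the smallest power of
# two that is >= 13 as the modulus.
cycle = lcg_cycle(multiplier=13, offset=7, modulus=16)

# random_range skips values that fall outside the requested range.
in_range = [v for v in cycle if v < 13]
```

    The cycle contains all 16 residues once each, so filtering it down to values below 13 leaves a permutation of 0..12 after at most 16 steps, which is where the "at most 2*maximum" bound comes from.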
    

    Usage

    The function "random_range" takes the same arguments as "range" and is consumed like any other generator. An example:

    # Show off random range.
    print()
    for v in range(3,6):
        v = 2**v
        l = list(random_range(v))
        print("Need",v,"found",len(set(l)),"(min,max)",(min(l),max(l)))
        print("",l)
        print()
    

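    Sample Results (from a run that also covered sizes 9, 17 and 33 and reported the cycle count, which the code above does not print)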

    Required 8 cycles to generate a sequence of 8 values.
    Need 8 found 8 (min,max) (0, 7)
     [1, 0, 7, 6, 5, 4, 3, 2]
    
    Required 16 cycles to generate a sequence of 9 values.
    Need 9 found 9 (min,max) (0, 8)
     [3, 5, 8, 7, 2, 6, 0, 1, 4]
    
    Required 16 cycles to generate a sequence of 16 values.
    Need 16 found 16 (min,max) (0, 15)
     [5, 14, 11, 8, 3, 2, 13, 1, 0, 6, 9, 4, 7, 12, 10, 15]
    
    Required 32 cycles to generate a sequence of 17 values.
    Need 17 found 17 (min,max) (0, 16)
     [12, 6, 16, 15, 10, 3, 14, 5, 11, 13, 0, 1, 4, 8, 7, 2, ...]
    
    Required 32 cycles to generate a sequence of 32 values.
    Need 32 found 32 (min,max) (0, 31)
     [19, 15, 1, 6, 10, 7, 0, 28, 23, 24, 31, 17, 22, 20, 9, ...]
    
    Required 64 cycles to generate a sequence of 33 values.
    Need 33 found 33 (min,max) (0, 32)
     [11, 13, 0, 8, 2, 9, 27, 6, 29, 16, 15, 10, 3, 14, 5, 24, ...]
    
  • 2020-11-22 13:42

    This will return a list of 10 numbers selected from the range 0 to 99, without duplicates.

    import random
    random.sample(range(100), 10)
    

    With reference to your specific code example, you probably want to read all the lines from the file once and then select random lines from the saved list in memory. For example:

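    all_lines = f1.readlines()  # f1 is the already-open file from the question; read it once
    for i in range(50):
        lines = random.sample(all_lines, 40)  # 40 distinct lines, drawn afresh on each iteration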
    

    This way, you only need to actually read from the file once, before your loop. It's much more efficient to do this than to seek back to the start of the file and call f1.readlines() again for each loop iteration.
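
    A self-contained sketch of the same pattern, using an in-memory io.StringIO object as a stand-in for the real file (the 100 numbered lines are made up for illustration):

```python
import io
import random

# Stand-in for the open file from the question: 100 numbered lines.
f1 = io.StringIO("".join(f"line {n}\n" for n in range(100)))

# The only read; everything below works from the list in memory.
all_lines = f1.readlines()

# 50 independent draws of 40 distinct lines each.
batches = [random.sample(all_lines, 40) for _ in range(50)]
```

    Lines never repeat within a batch, but the same line can show up in several batches because each call to random.sample starts from the full list.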

  • 2020-11-22 13:43

    If you need to sample from an extremely large range, you cannot pass range to random.sample:

    random.sample(range(10000000000000000000000000000000), 10)
    

    because it throws:

    OverflowError: Python int too large to convert to C ssize_t
    

    Also, if random.sample cannot produce the number of items you want because the range is too small

    random.sample(range(2), 1000)
    

    it throws:

    ValueError: Sample larger than population or is negative
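
    Both failures can be reproduced in a few lines. This is a sketch that assumes CPython 3, where the length of a range has to fit in a C ssize_t; the list name errors is only there to record what was raised.

```python
import random

errors = []
for population, k in [(range(10**31), 10), (range(2), 1000)]:
    try:
        random.sample(population, k)
    except (OverflowError, ValueError) as exc:
        # Record the exception type instead of letting it propagate.
        errors.append(type(exc).__name__)
```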
    

    This function resolves both problems:

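    import random
    
    def random_sample(count, start, stop, step=1):
        # Size of range(start, stop, step), computed with integer arithmetic
        # so that it stays exact for very large bounds and rounds up when
        # the last step is partial.
        population = max(0, -(-(stop - start) // step))
        # Never ask for more unique values than the range contains.
        n = min(count, population)
        seen = set()
        result = []
        # Rejection sampling: keep drawing until n distinct values are found.
        # This gets slow when n is close to the size of the range.
        while len(result) < n:
            value = random.randrange(start, stop, step)
            if value not in seen:
                seen.add(value)
                result.append(value)
        return result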
    

    Usage with extremely large numbers:

    print('\n'.join(map(str, random_sample(10, 2, 10000000000000000000000000000000))))
    

    Sample result:

    7822019936001013053229712669368
    6289033704329783896566642145909
    2473484300603494430244265004275
    5842266362922067540967510912174
    6775107889200427514968714189847
    9674137095837778645652621150351
    9969632214348349234653730196586
    1397846105816635294077965449171
    3911263633583030536971422042360
    9864578596169364050929858013943
    

    Usage where the range is smaller than the number of requested items:

    print(', '.join(map(str, random_sample(100000, 0, 3))))
    

    Sample result:

    2, 0, 1
    

    It also works with negative ranges and steps:

    print(', '.join(map(str, random_sample(10, 10, -10, -2))))
    print(', '.join(map(str, random_sample(10, 5, -5, -2))))
    

    Sample results:

    2, -8, 6, -2, -4, 0, 4, 10, -6, 8
    -3, 1, 5, -1, 3
    
  • 2020-11-22 13:45

    To sample integers without replacement between minval and maxval (maxval excluded) with NumPy:

    import numpy as np
    
    minval, maxval, n_samples = -50, 50, 10
    generator = np.random.default_rng(seed=0)
    samples = generator.permutation(np.arange(minval, maxval))[:n_samples]
    
    # or, if minval is 0,
    samples = generator.permutation(maxval)[:n_samples]
    

    With JAX:

    import jax
    
    minval, maxval, n_samples = -50, 50, 10
    key = jax.random.PRNGKey(seed=0)
    # jax.random.shuffle is deprecated; jax.random.permutation does the same job.
    samples = jax.random.permutation(key, jax.numpy.arange(minval, maxval))[:n_samples]
    
  • 2020-11-22 13:47

    You can use the shuffle function from the random module like this:

    import random
    
    my_list = list(range(1, 100)) # list of integers from 1 to 99
                                  # adjust these boundaries to fit your needs
    random.shuffle(my_list)
    print(my_list) # <- List of unique random numbers
    

    Note that shuffle does not return a list, as one might expect; it shuffles the list passed to it in place and returns None.
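
    A short sketch of that behavior, together with one way to get a shuffled copy when the original order has to be kept (the variable names are illustrative):

```python
import random

my_list = list(range(1, 100))
result = random.shuffle(my_list)  # shuffles my_list itself; result is None

# random.sample with k equal to the list length returns a shuffled copy
# and leaves the original list untouched.
original = list(range(1, 100))
shuffled_copy = random.sample(original, len(original))
```

    Writing my_list = random.shuffle(my_list) is therefore a bug: it replaces the list with None.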
