Efficiently generating unique pairs of integers

前端 未结 3 1315
鱼传尺愫
鱼传尺愫 2020-12-30 06:45

In MATLAB, I would like to generate n pairs of random integers in the range [1, m], where each pair is unique. For uniqueness, I consider the order

相关标签:
3条回答
  • 2020-12-30 07:01

    The following code does what you need:

    n = 10000;
    m = 500;
    my_list = unique(sort(round(rand(n,2)*m),2),'rows');
    my_list = my_list(find((my_list(:,1)==my_list(:,2))==0),:);
    %temp = my_list;    %In case you want to check what you initially generated.
    while(size(my_list,1)~=n)
        %my_list = unique([my_list;sort(round(rand(1,2)*m),2)],'rows');
        %Changed as per @jucestain's suggestion.
        my_list = unique([my_list;sort(round(rand((n-size(my_list,1)),2)*m),2)],'rows');
        my_list = my_list(find((my_list(:,1)==my_list(:,2))==0),:);
    end
    
    0 讨论(0)
  • 2020-12-30 07:06

    This is more of a general approach rather then a matlab solution.

    How about you do the following first you fill a vector like the following.

    x[n] = rand()
    x[n + 1] = x[n] + rand() %% where rand can be equal to 0.
    

    Then you do the following again

    x[n][y] = x[n][y] + rand() + 1
    

    And if

    x[n] == x[n+1]
    

    You would make sure that the same pair is not already selected.

    After you are done you can run a permutation algorithm on the matrix if you want them to be randomly spaced.

    This approach will give you all the possibility or 2 integer pairs, and it runs in O(n) where n is the height of the matrix.

    0 讨论(0)
  • 2020-12-30 07:08

    Easy, peasy, when viewed in the proper way.

    You wish to generate n pairs of integers, [p,q], such that p and q lie in the interval [1,m], and p

    How many possible pairs are there? The total number of pairs is just m*(m-1)/2. (I.e., the sum of the numbers from 1 to m-1.)

    So we could generate n random integers in the range [1,m*(m-1)/2]. Randperm does this nicely. (Older matlab releases do not allow the second argument to randperm.)

    k = randperm(m/2*(m-1),n);
    

    (Note that I've written this expression with m in a funny way, dividing by 2 in perhaps a strange place. This avoids precision problems for some values of m near the upper limits.)

    Now, if we associate each possible pair [p,q] with one of the integers in k, we can work backwards, from the integers generated in k, to a pair [p,q]. Thus the first few pairs in that list are:

    {[1,2], [1,3], [2,3], [1,4], [2,4], [3,4], ..., [m-1,m]}
    

    We can think of them as the elements in a strictly upper triangular array of size m by m, thus those elements above the main diagonal.

    q = floor(sqrt(8*(k-1) + 1)/2 + 1/2);
    p = k - q.*(q-1)/2;
    

    See that these formulas recover p and q from the unrolled elements in k. We can convince ourselves that this does indeed work, but perhaps a simple way here is just this test:

    k = 1:21;
    q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
    p = k - (q-1).*(q-2)/2;
    [k;p;q]'
    
    ans =
         1     1     2
         2     1     3
         3     2     3
         4     1     4
         5     2     4
         6     3     4
         7     1     5
         8     2     5
         9     3     5
        10     4     5
        11     1     6
        12     2     6
        13     3     6
        14     4     6
        15     5     6
        16     1     7
        17     2     7
        18     3     7
        19     4     7
        20     5     7
        21     6     7
    

    Another way of testing it is to show that all pairs get generated for a small case.

    m = 5;
    n = 10;
    k = randperm(m/2*(m-1),n);
    q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
    p = k - (q-1).*(q-2)/2;
    
    sortrows([p;q]',[2 1])
    ans =
         1     2
         1     3
         2     3
         1     4
         2     4
         3     4
         1     5
         2     5
         3     5
         4     5
    

    Yup, it looks like everything works perfectly. Now try it for some large numbers for m and n to test the time used.

    tic
    m = 1e6;
    n = 100000;
    k = randperm(m/2*(m-1),n);
    q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
    p = k - (q-1).*(q-2)/2;
    toc
    
    Elapsed time is 0.014689 seconds.
    

    This scheme will work for m as large as roughly 1e8, before it fails due to precision errors in double precision. The exact limit should be m no larger than 134217728 before m/2*(m-1) exceeds 2^53. A nice feature is that no rejection for repeat pairs need be done.

    0 讨论(0)
提交回复
热议问题