Random sequence iteration in O(1) memory?

前端 未结 6 1495
[愿得一人]
[愿得一人] 2021-02-19 07:58

Say you want to iterate over a sequence [0 to n] in a random order, visiting every element exactly once. Is there any way to do this in O(1) memory, i.e. without creati

相关标签:
6条回答
  • 2021-02-19 08:06

    I've just built a structure for this sort of thing - I generate a Heap structure (min or max, doesn't matter). But for the comparison, instead of using a key value, I use a random number. Items inserted into the heap are thus placed in random order. Then you can either return the array that forms the heap's base structure (which will be randomly ordered), or you can pop the elements off one by one, and get them back in random order. If this type of your container is used as your primary storage (rather than having an array separate from the heap), there is no additional memory complexity, since it's just an array anyhow. Time complexity is O(log N) for insertion, O(log N) for popping the top element. Shuffling is as simple as popping and reinserting each element, O(N log N).

    I even built a fancy Enumerator (it's C#, but you could do the same with a C++ Iterator) that auto-shuffles after you've iterated through to the end. That means that every time you can iterate through the list (without popping) multiple times and get a different order every time, at the cost of an O(N log N) shuffle after each full iteration. (Think like a deck of cards. After every card has gone to the discard pile, you reshuffle the deck so as to not get them in the same order next time around.)

    0 讨论(0)
  • 2021-02-19 08:14

    No there is not, think about it, somewhere the program has to remember the places it has visited. If there is an iterator that can randomly access them all, the iterators internals would have to keep track of this somehow and you would still be using the memory.

    0 讨论(0)
  • 2021-02-19 08:20

    As with most algorithmic problems, there is a time-space trade-off; this can be solved in O(1) space if you're happy to use O(n^2) time to generate all the permutations. Aside from a couple of temporary variables, the only storage this requires is the random number seed itself (or, in this case, the PRNG object), since that is sufficient to regenerate the sequence of pseudo-random numbers.

    Note that you have to give this function the same PRNG on every call, and you can't use it for any other purpose.

    #include <random>
    
    template<typename PRNG, typename INT>
    INT random_permutation_element(INT k, INT n, PRNG prng) {
      typedef std::uniform_int_distribution<INT> dis;
      INT i = 0;
      for (; i < k; ++i) dis(0, i)(prng);
      INT result = dis(0, i)(prng);
      for (++i; i < n; ++i) if (dis(0, i)(prng) <= result) ++result;
      return result;
    }
    

    Here's a quick and dirty harness. ./test 1000 3 generates 1000 complete permutations of length three; ./test 10 1000000 0 5 generates the first five elements of each of 10 permutations of length one million.

    #include <iostream>
    
    int main(int argc, char** argv) {
      std::random_device rd;
      std::mt19937 seed_gen(rd());
      int count = std::stoi(argv[1]);
      int size = std::stoi(argv[2]);
      int seglow = 0;
      int seglim = size;
      if (argc > 3) seglow = std::stoi(argv[3]);
      if (argc > 4) seglim = std::stoi(argv[4]);
      while (count-- > 0) {
        std::mt19937 prng(seed_gen());
        for (int i = seglow; i < seglim; ++i)
          std::cout << random_permutation_element(i, size, prng)
                    << (i < seglim - 1 ? ' ' : '\n');
      }
      return 0;
    }
    

    There is a faster way to do this if you're unlikely to finish any given permutation, but this way of writing it looks nicer, and is maybe easier to understand. (The other way is to generate the numbers in the opposite order, which means you can stop after you've generated k of them but you have to do it twice, first to get result and then to adjust it.)

    0 讨论(0)
  • 2021-02-19 08:28

    Well... think about that for a second. How would you 'know' which elements had been visited before?

    Short answer: you can't. (Edit Well, not unless you count stateless Pseudo-random generators, but as you have stated yourself in the command, that doesn't seem feasible for the general case)

    Depending on the actual sequence, it might, however, be feasible to 'mark' elements as visited _in-place_ thus technically requiring O(n) storage, but no extra storage for the algorithm

    Example:

    const int VISITED_BIT = 0x8000; // arbitrary example
    
    bool extract(int i) { return (i & ~VISITED_BIT); }    
    bool visited(int i) { return (i & VISITED_BIT); }    
    bool markvisited(int& i) { i |= VISITED_BIT); }
    
    int main()
    {
        std::vector<int> v = {2,3,4,5,6};
    
        int remain = v.size();
        while (remain>0)
        {
            size_t idx = rand(); // or something
            if (visited(v[idx]))
                continue;
    
            std::cout << "processing item #" << idx << ": " << extract(v[idx]) << "\n";
            markvisited(v[idx]);
            remain--;
        }
    }
    
    0 讨论(0)
  • 2021-02-19 08:30

    If you could mutate the sequence in-place, you could simply repeatedly draw a random number from 0-N, and then erase the element you visited, or swap it to the end, or such schemes.

    0 讨论(0)
  • 2021-02-19 08:31

    In theory, if you built a random number generator whose period was exactly n, and covered all values in 0..n, then running through this once would give you what you like.

    Of course, this may not be a general solution, at least if you are looking for something dynamic, since you would have to pre-create the PRNG and how you do this depends on n.

    0 讨论(0)
提交回复
热议问题