Efficiently generating random bytes of data in C++11/14

终归单人心 2021-02-12 23:23

I need to generate random bytes of data (not random numbers), i.e. uniformly distributed bits.

As such I was wondering what are the correct/efficient w

4 Answers
  • 2021-02-13 00:23

    I think the key to efficiency is to get the maximum number of values from every random number that is generated. Here's a small tweak to your original code that does that.

    #include <cstdint>
    #include <vector>
    #include <random>
    
    int main()
    {
       std::random_device rd;
       std::uniform_int_distribution<std::uint32_t> dist(0, 0xFFFFFFFFu);
       std::vector<char> data(1000);
       int offset = 0;          // how many bytes of `bits` have been used so far
       std::uint32_t bits = 0;  // the 32-bit random value currently being split up
       for (char& d : data)
       {
          if (offset == 0)
             bits = dist(rd);   // draw a fresh value only for every fourth byte
          d = static_cast<char>(bits & 0xFF);
          bits >>= 8;
          if (++offset >= 4)
             offset = 0;
       }
       return 0;
    }
    

    If getting 4 bytes from every 32 bits doesn't work, try 3 bytes from 24 bits.

  • 2021-02-13 00:24

    To answer your question: You can't.

    The standard does not allow std::uniform_int_distribution to be instantiated with char, signed char, or unsigned char. Some consider this a defect in the standard, but that is how it is currently specified.

    Instead, instantiate std::uniform_int_distribution with unsigned short, set its range to std::numeric_limits<unsigned char>::min() and std::numeric_limits<unsigned char>::max(), and assign each result to an unsigned char.
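The workaround just described might look roughly like the sketch below. The helper name make_random_bytes and its signature are made up for illustration; only the standard library calls come from the standard.

```cpp
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper (not part of any library): returns n random bytes.
// The distribution is instantiated with unsigned short because unsigned char
// is not an allowed IntType; the range is restricted to that of unsigned char.
template <class Engine>
std::vector<unsigned char> make_random_bytes(Engine& engine, std::size_t n)
{
    std::uniform_int_distribution<unsigned short> dist(
        std::numeric_limits<unsigned char>::min(),
        std::numeric_limits<unsigned char>::max());

    std::vector<unsigned char> bytes(n);
    for (unsigned char& b : bytes)
        b = static_cast<unsigned char>(dist(engine)); // result always fits in a byte
    return bytes;
}
```

A call such as `std::mt19937 engine(seed); auto bytes = make_random_bytes(engine, 1000);` would then yield 1000 bytes. Note that this draws one engine value per byte, so it favours simplicity over throughput.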

    From the standard:

    Throughout this subclause 26.5, the effect of instantiating a template:

    [...]

    e) that has a template type parameter named IntType is undefined unless the corresponding template argument is cv-unqualified and is one of short, int, long, long long, unsigned short, unsigned int, unsigned long, or unsigned long long.

    §26.5.1.1 [rand.req.genl]

    Moreover:

    You should use std::mt19937 to actually generate your random bytes. std::random_device is liable to be slow, and likely produces entropy with statistical properties (i.e. suitability for use in cryptography) that you don't need.

    That said, you will need to seed your std::mt19937. You can do this with a std::random_device and a std::seed_seq.

    Note that if you don't use a std::seed_seq to seed your std::mt19937, your std::mt19937 will be left with many, many zeroes in its internal state, and it will therefore take it quite a while to "warm up".

    For more information on "warm up", see here.
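As a rough sketch of the seeding step described above, one common approach is to fill a buffer from std::random_device and hand it to std::seed_seq. The function name make_seeded_mt19937 is an assumption for illustration, and filling state_size words is one choice among several.

```cpp
#include <algorithm>
#include <array>
#include <functional>
#include <random>

// Hypothetical helper: builds a std::mt19937 whose seed material comes from
// std::random_device, passed through std::seed_seq.
inline std::mt19937 make_seeded_mt19937()
{
    std::random_device rd;

    // Assumption: collect one word per word of engine state (624 for mt19937).
    std::array<std::random_device::result_type, std::mt19937::state_size> seed_data;
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));

    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937(seq);
}
```

The engine returned here can be used anywhere a std::mt19937 is expected, so std::random_device is consulted only once at start-up rather than for every byte.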

  • 2021-02-13 00:27

    Distributions take random bits and turn them into numbers. If you actually want random bits then you want to use an engine:

    In particular, those requirements specify the algorithmic interface for types and objects that produce sequences of bits in which each possible bit value is uniformly likely.

    A single call to a URNG object is allowed to produce and deliver many (typically 32 or more) bits, returning these bits as a single packaged value of an unsigned integer type. (N3847)

    random_device happens to be specified such that accessing uniformly distributed bits is easy:

    std::random_device engine;
    unsigned x = engine(); // sizeof(unsigned) * CHAR_BIT random bits
    

    Note that other engines may not make it quite as easy to get uniformly random bits as random_device, due to returning fewer bits than their result_type can hold or even by effectively returning fractional bits.

    If your concern is that unsigned's size is implementation defined and so random_device returns an implementation defined number of bits, you can write an adapter that either collects enough bits before giving them to you, or one that will give you just enough bits and cache the rest for your next request. (You can also do this to handle other engines which exhibit the previously mentioned issues.)
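The second kind of adapter mentioned above (hand out just enough bits and cache the rest) could be sketched as follows. The class name cached_byte_source and its members are invented for this example, and it assumes that unsigned is wider than one byte, which holds on mainstream platforms.

```cpp
#include <climits>
#include <random>

// Hypothetical adapter: returns one byte per call and caches the unused
// bytes of each std::random_device result for later calls.
class cached_byte_source
{
public:
    unsigned char operator()()
    {
        if (bytes_left_ == 0) {
            cache_ = device_();            // all bits of the result are random
            bytes_left_ = sizeof(cache_);
        }
        unsigned char byte = static_cast<unsigned char>(cache_ & UCHAR_MAX);
        cache_ >>= CHAR_BIT;               // assumes sizeof(unsigned) > 1
        --bytes_left_;
        return byte;
    }

    // Number of bytes still cached from the last device call.
    unsigned cached_bytes() const { return bytes_left_; }

private:
    std::random_device device_;
    std::random_device::result_type cache_ = 0;
    unsigned bytes_left_ = 0;
};
```

With a typical 32-bit unsigned, the device is consulted once for every four bytes requested. The same idea would need more care for engines whose range does not span the whole result type.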

  • 2021-02-13 00:30

    What you're looking for is the std::independent_bits_engine adaptor:

    #include <vector>
    #include <random>
    #include <climits>
    #include <algorithm>
    #include <functional>
    
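    // The result type is unsigned short rather than unsigned char: the standard
    // only permits unsigned short, unsigned int, unsigned long, or
    // unsigned long long here, and some compilers reject unsigned char.
    using random_bytes_engine = std::independent_bits_engine<
        std::default_random_engine, CHAR_BIT, unsigned short>;
    
    int main()
    {
        random_bytes_engine rbe; // default-seeded: same sequence on every run
        std::vector<unsigned char> data(1000);
        // Each call yields CHAR_BIT random bits, i.e. a value in [0, UCHAR_MAX].
        std::generate(begin(data), end(data), std::ref(rbe));
    }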
    

    Note that the accepted answer is not strictly correct in the general case: random engines produce unsigned values belonging to a range [min(), max()], which doesn't necessarily cover all possible values of the result type (for instance, std::minstd_rand0::min() == 1), and thus you may get random bytes that are not uniformly distributed if you use an engine directly. However, for std::random_device the range is [std::numeric_limits<result_type>::min(), std::numeric_limits<result_type>::max()], so this particular engine would also work well without the adaptor.
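The range argument above can be checked at compile time. The sketch below uses a made-up helper, yields_whole_bytes, and declares the adaptor with unsigned short as its result type (an assumption made here because the standard's list of permitted result types does not include unsigned char).

```cpp
#include <climits>
#include <limits>
#include <random>

// The adaptor from the answer, with unsigned short as the result type
// (assumption: chosen to stay within the standard's permitted types).
using random_bytes_engine = std::independent_bits_engine<
    std::default_random_engine, CHAR_BIT, unsigned short>;

// Hypothetical helper: true if Engine's output range is exactly one byte,
// i.e. [0, UCHAR_MAX].
template <class Engine>
constexpr bool yields_whole_bytes()
{
    return Engine::min() == 0 && Engine::max() == UCHAR_MAX;
}

// minstd_rand0 can never return 0, so its raw output is not a full bit pattern.
static_assert(std::minstd_rand0::min() == 1, "minstd_rand0 starts at 1");
static_assert(!yields_whole_bytes<std::minstd_rand0>(), "raw engine is not byte-ranged");

// The adaptor's range is exactly [0, UCHAR_MAX].
static_assert(yields_whole_bytes<random_bytes_engine>(), "adaptor yields bytes");

// random_device spans the whole range of its result type.
static_assert(std::random_device::min() == 0 &&
              std::random_device::max() == std::numeric_limits<unsigned>::max(),
              "random_device covers all of unsigned");
```

If these assertions compile, they confirm the point made in the text: slicing bytes out of a raw engine is only safe when its range covers every bit pattern, which the adaptor guarantees by construction.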
