What are the most secure sources of entropy to seed a random number generator? This question is language and platform independent and applies to any machine on a network.
The most secure seed is the one which has the highest level of entropy (or most number of bits that can not be predicted). Time is a bad seed generally because it has a small entropy (ie. if you know when the transaction took place you can guess the time stamp to within a few bits). Hardware entropy sources (e.g. from decay processes) are very good because they yield one bit of entropy for every bit of seed.
Usually a hardware source is impractical for most needs, so this leads you to rely on mixing a number of low quality entropy sources to produce a higher one. Typically this is done by estimating the number of bits of entropy for each sample and then gathering enough samples so that the search space for the entropy source is large enough that it is impractical for an attacker to search (128 bits is a good rule of thumb).
Some sources which you can use are: current time in microseconds (typically very low entropy of 1/2 a bit depending on resolution and how easy it is for an attacker to guess), interarrival time of UI events etc.
Operating system sources such as /dev/random and the Windows CAPI random number generator often provide a pre-mixed source of these low-entropy sources, for example the Windows generator CryptGenRandom includes:
Some PRNGs have built-in strategies to allow the mixing of entropy from low quality sources to produce high quality results. One very good generator is the Fortuna generator. It specifically uses strategies which limit the risk if any of the entropy sources are compromised.
The answer is /dev/random
on a Linux machine. This is very close to a "real" random number generator, where as /dev/urandom
can be generated by a PRNG if the entropy pool runs dry. The following quote is taken from the Linux kernel's random.c This entire file is a beautiful read, plenty of comments. The code its self was adopted from from PGP. Its beauty is not bounded by the constraints of C, which is marked by global structs wrapped by accessors. It is a simply awe inspiring design.
This routine gathers environmental noise from device drivers, etc., and returns good random numbers, suitable for cryptographic use. Besides the obvious cryptographic uses, these numbers are also good for seeding TCP sequence numbers, and other places where it is desirable to have numbers which are not only random, but hard to predict by an attacker.
Theory of operation
Computers are very predictable devices. Hence it is extremely hard
to produce truly random numbers on a computer --- as opposed to
pseudo-random numbers, which can easily generated by using a
algorithm. Unfortunately, it is very easy for attackers to guess the sequence of pseudo-random number generators, and for some
applications this is not acceptable. So instead, we must try to gather "environmental noise" from the computer's environment, which must be hard for outside attackers to observe, and use that to generate random numbers. In a Unix environment, this is best done from inside the kernel.Sources of randomness from the environment include inter-keyboard
timings, inter-interrupt timings from some interrupts, and other events which are both (a) non-deterministic and (b) hard for an outside observer to measure. Randomness from these sources are added to an "entropy pool", which is mixed using a CRC-like function. This is not cryptographically strong, but it is adequate assuming the randomness is not chosen maliciously, and it is fast enough that the overhead of doing it on every interrupt is very reasonable. As random bytes are mixed into the entropy pool, the routines keep an estimate of how many bits of randomness have been stored into the random number generator's internal state.When random bytes are desired, they are obtained by taking the SHA
hash of the contents of the "entropy pool". The SHA hash avoids exposing the internal state of the entropy pool. It is believed to be computationally infeasible to derive any useful information about the input of SHA from its output. Even if it is possible to analyze SHA in some clever way, as long as the amount of data returned from the generator is less than the inherent entropy in
the pool, the output data is totally unpredictable. For this reason, the routine decreases its internal estimate of how many bits of "true randomness" are contained in the entropy pool as it outputs random numbers. If this estimate goes to zero, the routine can still generate random numbers; however, an attacker may (at least in theory) be able to infer the future output of the generator from prior outputs. This requires successful cryptanalysis of SHA, which is not believed to be feasible, but there is a remote possibility. Nonetheless, these numbers should be useful for the vast majority of purposes....
Use random.org they claim to offer true random numbers to anyone on the Internet and they also have an HTTP API which you can use. They offer both free and paid services.
disclaimer: i am not in any way affiliated with random.org
With Linux, the answer is /dev/random (in Windows I think the equivallent is called CryptGenRand).
However, in a cloud environment /dev/random can be severely depleted and might not have enough entropy to answer you request.
To solve that problem, our company is developping a true random number generator appliance that can provide good random numbers (of quantum origin) to thousands of servers and VM simultaneously. If the appliance is installed in the LAN of your cloud datacenter, all you would need is our deamon running in your machine. This deamon monitors /dev/random entropy level and when entropy is needed makes a request to the appliance (over the network) and puts the received random data in the kernel's entropy pool.
If you want to know more about our solution, please visit our website (www.sqrtech.com) or contact us at info@sqrtech.com.
Julien
James is correct. In addition, there is hardware that you can purchase that will give you random data. Not sure where I saw it, but I think I read that some sound cards come with such hardware.
You can also use a site like http://www.random.org/
Your most secure methods will come from nature. That is to say, something that happens outside of your computer system and beyond our ability to predict it's patterns.
For instance, many researchers into Cryptographically secure PRNGs will use radioactive decay as a model, others might look into fractals, and so forth. There are existing means of creating true RNGs
One of my favorite ways of implementing a PRNG is from user interaction with a computer. For instance, this post was not something that could be pre-determined by forward-engineering from my past series of posts. Where I left my mouse on my screen is very random, the trail it made is also random. Seeing from user-interactions is. Abuse from the means of providing specific input such that specific numbers are generated could be mitigated by using a 'swarm' of user inputs and calculating it's 'vector', as long as you do not have every user in your system as an Eve, you should be fine. This is not suitable for many applications, as your pool of numbers is directly proportional to user input. Implementing this may have it's own issues.
People interested in RNG have already done things such as:
Secure seeds come from nature.
edit: Based on what you're looking at doing, I might suggest using an aggregation of your cloud server's EDG.