Saw Tools

Randomness in computing: PRNG, CSPRNG, and true randomness

From predictable pseudo-randomness to quantum entropy: understanding random number generators, their cryptographic guarantees, their pitfalls, and their legal implications.

What is randomness, really?

The word "random" conceals three very different realities depending on the discipline using it. In philosophy, randomness refers to the absence of deterministic causation — a position compatible with quantum indeterminism but difficult to reconcile with classical physics. In mathematics and statistics, a process is called random if its probability distribution satisfies certain properties (uniformity, independence, absence of autocorrelation) — an operational definition, not an ontological one. In computer science, the question is more pragmatic: how do you produce a sequence of bits that is sufficiently unpredictable for the intended use, whether that's a video game or an encryption key?

The fundamental paradox of computing is this: a computer is a deterministic machine. Starting from the same initial state, it always produces exactly the same results. It cannot, by itself, generate "true" randomness in the philosophical sense. What it can do is convincingly simulate randomness (PRNG), or exploit genuinely unpredictable physical phenomena to approximate it (CSPRNG fed by hardware entropy sources, or true quantum randomness).

This distinction is far from academic. In 2012, security researchers showed that 0.2% of RSA public keys on the internet shared a common prime factor — evidence that two separate machines had used the same weak random seed when generating their keys (Heninger et al., "Mining Your Ps and Qs," USENIX Security 2012). The result: those keys, supposed to be unbreakable, could be factored in seconds. Poorly understood randomness has real, measurable consequences.

Statistical randomness vs. cryptographic unpredictability

A crucial subtlety: a sequence can be statistically random (uniform, without detectable correlation) without being cryptographically secure. Why? Because cryptographic security demands unpredictability even for an adversary who knows the algorithm being used. A PRNG can produce a perfectly uniform distribution while being entirely predictable if the adversary knows its internal state. This distinction is at the heart of the difference between a PRNG and a CSPRNG.

PRNG: pseudo-random number generators

A pseudo-random number generator (PRNG) is a deterministic algorithm that, starting from an initial value called a seed, produces a sequence of numbers that statistically resembles randomness. The principle is straightforward: an internal state is initialized with the seed, then a transition function is applied repeatedly to produce outputs and update the state.

The Linear Congruential Generator (LCG)

The oldest and simplest PRNG is the Linear Congruential Generator (LCG), whose formula is:

X(n+1) = (a * X(n) + c) mod m

Where a, c, and m are carefully chosen constants (the POSIX standard values are a = 1103515245, c = 12345, m = 2³¹). This generator is fast and simple to implement, but catastrophically predictable: knowing a single output is enough to reconstruct the complete state and predict all subsequent values. It is the implementation behind C's rand() function on most Unix systems — perfectly suited for simulations or games, absolutely forbidden for any cryptographic application.

The Mersenne Twister

Invented in 1997 by Matsumoto and Nishimura, the Mersenne Twister (MT19937) is the most widely used PRNG in the world. It has an astronomically long period of 2¹⁹⁹³⁷ − 1 (approximately 10⁶⁰⁰¹ values before repeating), passes all standard statistical tests, and generates uniformly distributed numbers on 32 or 64 bits. It is the default implementation of random.random() in Python, java.util.Random in Java, and most simulation environments.

However, the Mersenne Twister has an internal state of 624 32-bit integers (2.5 KB). By observing 624 consecutive outputs, an adversary can reconstruct the complete state and predict all future outputs — and even reconstruct past values. This attack is documented and publicly implemented (see the mersenne-twister-predictor library on GitHub), making MT completely unsuitable for cryptography. Python's documentation states this explicitly: "The underlying Mersenne Twister algorithm is not suitable for all purposes."

Math.random() in JavaScript

JavaScript's Math.random() returns a floating-point number between 0 (inclusive) and 1 (exclusive). Its specification mandates no specific algorithm — each JS engine is free to implement whatever it wants. In practice, V8 (Chrome/Node.js) has used an algorithm called xorshift128+ since 2015, with a state of only 128 bits. Researchers showed in 2017 (Fröhlich, Grossman) that by observing a reasonable number of outputs through the API, it is possible to reconstruct the internal state and predict all future values. Math.random() is convenient for games, animations, and simulations — but using it to generate tokens, keys, or unique identifiers is a serious security flaw.

Period and statistical properties

The period of a PRNG is the length of the sequence before it repeats. For a standard LCG, it is at best around 2³¹ ≈ 2 billion. For the Mersenne Twister, 2¹⁹⁹³⁷. In cryptography, period is not the decisive criterion — unpredictability is. A PRNG can have a colossal period while remaining completely predictable. Statistical tests like the Diehard battery or NIST SP 800-22 evaluate statistical quality, not cryptographic security.

CSPRNG: cryptographically secure pseudo-random number generators

A cryptographically secure pseudo-random number generator (CSPRNG) is a PRNG that satisfies additional security properties formalized within the framework of computational complexity theory.

The three fundamental properties

A CSPRNG must satisfy:

  1. Indistinguishability (next-bit test): no polynomial-time algorithm should be able to predict the next bit with probability significantly greater than 50%, even knowing all previous outputs. This property was formalized by Blum and Micali's "next-bit test" (1982).
  2. Forward secrecy: if the internal state is compromised at time T, the adversary must not be able to compute outputs generated before T. This requires one-way functions in the generation chain.
  3. Backward secrecy: if the internal state is compromised at T, it must quickly become "unusable" through a reseeding mechanism drawing from new entropy sources.

Fortuna

Designed by Bruce Schneier and Niels Ferguson (authors of Cryptography Engineering), Fortuna is a reference CSPRNG design. It maintains 32 entropy pools, performs periodic reseedsings, and uses AES-256 in counter mode for generation. Its design allows it to absorb entropy from multiple sources simultaneously (interrupt timings, mouse movements, disk accesses) without ever being "drained" by an attacker who controls a single source. Fortuna underpins the random number generator in FreeBSD and macOS.

ChaCha20 and AES-CTR

Modern CSPRNG implementations are built on stream ciphers. ChaCha20, designed by Daniel J. Bernstein, is particularly favored: fast on processors without hardware AES instructions, resistant to timing side-channel attacks, and used in Linux since kernel 4.8 as the primary algorithm for its random number generator. AES-CTR (AES in counter mode) is used on platforms with hardware AES-NI (Intel, AMD, recent ARM) for optimal performance.

Standard APIs for developers

In practice, no developer should implement their own CSPRNG. Platforms provide standardized APIs:

  • JavaScript (browser and Node.js): crypto.getRandomValues(typedArray) — fills the array with cryptographically random bytes. In Node.js, crypto.randomBytes(n) or crypto.randomInt(min, max) are equivalent. These functions delegate to the operating system's underlying CSPRNG.
  • Python: the secrets module (introduced in Python 3.6) exposes secrets.token_bytes(), secrets.token_hex(), secrets.randbelow(n) — all based on os.urandom().
  • Linux / Unix: /dev/urandom provides a stream of cryptographically secure pseudo-random bytes from the kernel's entropy pool. /dev/random was the historical "blocking" interface — it would block when estimated entropy was low — but since Linux 5.6, both are equivalent.
  • Windows: BCryptGenRandom() (Windows CNG) or the older CryptGenRandom() — both FIPS 140-2 certified.

The golden rule: for any cryptographic use, use exclusively the platform's CSPRNG APIs. Never attempt to "fix" an ordinary PRNG by seeding it differently, or apply hash functions to Math.random() output to "secure" it.

True randomness: hardware entropy sources

Even the best CSPRNGs are, ultimately, deterministic algorithms seeded with an entropy seed. The quality of that seed is therefore critical. Where does real entropy come from in a computing system?

Classical hardware entropy sources

Modern operating systems collect entropy from multiple physical sources:

  • Thermal noise: electronic fluctuations caused by thermal agitation of electrons in resistors and transistors produce Johnson-Nyquist noise — measurable and genuinely unpredictable.
  • Hardware interrupt timings: the nanosecond-level variations between two keyboard, mouse, network, or disk interrupts are difficult to predict and constitute a valuable entropy source. Linux collects these via its entropy pool.
  • Voltage and current fluctuations: variations in power supply, CPU cycles, and memory accesses introduce measurable unpredictability at the hardware level.
  • Dedicated hardware accelerators: since Sandy Bridge (2011), Intel processors include the RDRAND instruction, which directly accesses a hardware random number generator (HRNG) based on thermal noise in the circuits. AMD followed with RDSEED/RDRAND on Ryzen and EPYC processors.

The Intel RDRAND controversy

Intel's RDRAND instruction generated notable controversy in the security community. In 2013, following Snowden's revelations about the NSA, questions were raised about whether the instruction could be backdoored — specifically, whether the NSA might have introduced a bias or trapdoor into Intel's hardware generator. The practical response from Linus Torvalds and the Linux kernel team was pragmatic: RDRAND is used as one entropy source among several in the pool, never as the sole source. The Linux kernel XORs RDRAND output with other entropy sources, so a backdoor in RDRAND would not compromise overall security as long as the other sources remain uncompromised. The general lesson: never trust a single entropy source, however hardware-based it may be.

Quantum randomness

Quantum mechanics is, by nature, fundamentally non-deterministic. Certain quantum phenomena are therefore true randomness sources in the physical sense:

  • Radioactive decay: the exact moment of decay of a radioactive nucleus is unpredictable in principle, governed only by quantum probability. Commercial generators (such as those from ID Quantique) exploit this phenomenon.
  • Photons and beam splitters: a photon hitting a semi-reflective beam splitter has exactly a 50% chance of being transmitted or reflected — a purely quantum decision. This is the basis of the most common QRNGs (Quantum Random Number Generators).
  • Quantum vacuum fluctuations: even in "empty" space, the energy fluctuations predicted by quantum electrodynamics produce measurable noise at the nanometer scale, exploitable as an entropy source.

ANU QRNG: the Australian National University publishes a truly random number generation service (quantum.anu.edu.au) based on measuring quantum fluctuations of the electromagnetic vacuum. It exposes a public REST API, long used to feed applications requiring genuine randomness.

Cloudflare LavaRand: Cloudflare deployed a spectacular setup in their San Francisco office — a row of lava lamps continuously filmed by cameras. The motion of the liquid is chaotic, unpredictable, and captured in real time to feed their global entropy pool. This is not Cloudflare's only entropy source (they also use double-pendulum systems and a radioactive generator), but LavaRand has become symbolic of the creativity possible in entropy collection.

When to use what: a practical decision guide

The choice between PRNG, CSPRNG, and true randomness depends entirely on context. Using a CSPRNG where a PRNG would suffice wastes resources. Using a PRNG where a CSPRNG is required is a security vulnerability. Here is the decision grid.

Cases where a standard PRNG is sufficient

  • Video games and simulations: procedural terrain generation, adversarial AI, random visual effects. Reproducibility (same seed → same map) is often a feature, not a flaw. The Mersenne Twister is the standard choice.
  • Monte Carlo simulations: numerical computation, financial modeling, physical simulations. Statistical quality matters more than unpredictability. Look for PRNGs with good equidistribution properties (such as PCG64 or SFC64, used by NumPy since version 1.17).
  • Unit tests with a fixed seed: generating reproducible test data. Ideally document the seed used.
  • Non-sensitive permutation and shuffling: a Fisher-Yates shuffle for a local music playlist.

Cases where a CSPRNG is mandatory

  • Cryptographic key generation (AES, RSA, ECC): any predictability destroys the security of the entire system. Use exclusively crypto.getRandomValues(), os.urandom(), or BCryptGenRandom() depending on the platform.
  • Authentication tokens, sessions, security cookies: a predictable token allows an attacker to impersonate a user. RFC 4086 (IETF) explicitly recommends CSPRNG usage for nonce and session identifier generation.
  • Cryptographic nonces (IVs for AES-CBC/GCM, nonces for ChaCha20-Poly1305): reusing a nonce with the same key is often catastrophic. For AES-GCM, a nonce collision lets an attacker recover the authentication key. A CSPRNG ensures negligible collision probability.
  • Password generation: our password generator uses crypto.getRandomValues() precisely for this reason.
  • QR codes containing sensitive data: if the encoded content of a QR code includes a secret token or a unique access URL, the token generation must use a CSPRNG.
  • Draws and lotteries with economic or legal value.

Cases where true randomness adds value

  • Quantum key distribution (QKD) protocols: security relies on the laws of physics, not computational complexity. True quantum randomness is required.
  • Certified national lotteries: some countries require a certified hardware generator for official draws.
  • Entropy seeding in virtualized environments: virtual machines can suffer from entropy starvation at boot time. Services like ID Quantique QRNGs or VirtIO-RNG are used to supplement their pool.

Statistical tests for randomness

How do you evaluate the quality of a random generator? Several statistical test batteries have been developed for this purpose.

The Diehard battery

Developed by George Marsaglia in the 1990s, Diehard is one of the first systematic statistical test batteries for PRNGs. It includes bit frequency tests, overlapping sequence tests, permutation tests, and more. A PRNG that fails Diehard is clearly defective. One that passes is statistically sound — but that says nothing about its cryptographic security.

NIST SP 800-22

The NIST SP 800-22 test suite (National Institute of Standards and Technology, Special Publication 800-22) is the reference standard for evaluating generators used in cryptographic contexts. It includes 15 statistical tests: monobit frequency, block frequency, runs, longest run, binary matrix rank, discrete Fourier transform, non-overlapping and overlapping template matching, Maurer's universal statistical, approximate entropy, cumulative sums, random excursions. Passing NIST SP 800-22 is necessary but not sufficient for FIPS 140-3 certification.

TestU01

Developed by Pierre L'Ecuyer and Richard Simard (University of Montreal), TestU01 is considered the most rigorous test battery currently available. It includes the "BigCrush" suite (approximately 100 intensive statistical tests). The Mersenne Twister fails certain BigCrush tests — confirming that even the best classical PRNGs are not statistically perfect in every dimension.

The ent utility

ent (Fourmilab) is a lightweight utility that computes Shannon entropy, chi-square, arithmetic mean, Monte Carlo π estimation, and serial correlation for a data stream. Useful for a quick check, but far less rigorous than NIST SP 800-22 or TestU01.

What "passing the tests" means — and doesn't mean

Passing a statistical test battery means the generator's output resembles uniform randomness along the tested dimensions. It does not prove cryptographic unpredictability. A generator that fully exposes its internal state in its outputs can pass all statistical tests while being trivially predictable. Statistical tests evaluate distributional quality; cryptographic security evaluates computational unpredictability. These are orthogonal properties.

Classic programming pitfalls

Misusing random generators is one of the most frequent sources of cryptographic vulnerabilities. Here are the most common mistakes, with their corrections.

Seeding with time() — the guessable seed

A classic error in C/C++ and old PHP scripts is initializing a PRNG with the system clock value:

// DANGEROUS — predictable seed
srand(time(NULL));
int token = rand();

// PHP — equally problematic
mt_srand(time());
$token = mt_rand();

The problem: time() has one-second resolution. An attacker who knows approximately when the token was generated needs to test only a small number of seeds (a few thousand if they know the time to the minute). For authentication tokens or password reset links, this is an exploitable vulnerability in practice. Many "predictable password reset link" CVEs are direct consequences of this mistake.

Modulo bias

To generate a random number in the range [0, n), the obvious approach is the modulo operator:

// BIASED if RAND_MAX+1 is not a multiple of n
int random_in_range = rand() % n;

The problem: if RAND_MAX + 1 is not divisible by n, values in [0, (RAND_MAX + 1) % n) appear with slightly higher probability. For n = 3 and RAND_MAX = 32767, values 0 and 1 appear 10923 times versus 10922 for 2 — a bias of 0.009%. Negligible in a game, but in cryptography this bias can be exploited through cumulative statistical attacks.

The fix is rejection sampling (discarding values in the biased zone):

// JavaScript — unbiased with crypto.getRandomValues()
function randomInt(min, max) {
    const range = max - min;
    const bytesNeeded = Math.ceil(Math.log2(range) / 8) + 1;
    const maxValid = Math.floor(256 ** bytesNeeded / range) * range;
    let value;
    do {
        const buf = new Uint8Array(bytesNeeded);
        crypto.getRandomValues(buf);
        value = buf.reduce((acc, b) => acc * 256 + b, 0);
    } while (value >= maxValid);
    return min + (value % range);
}

// Python — secrets.randbelow() implements rejection automatically
import secrets
result = secrets.randbelow(n)  # [0, n) with no bias

Nonce reuse in cryptography

A nonce ("number used once") must be used exactly once per key and per context. In AES-GCM, reusing a nonce with the same key is catastrophic: an attacker observing two ciphertexts encrypted with the same nonce can derive the XOR of the two plaintexts, and with a bit more data, recover the GCM authentication key. This type of attack compromised the Sony PlayStation 3 in 2010 (ECDSA nonce reuse allowed derivation of the private key). The rule: generate nonces with a CSPRNG, and when in doubt, use AES-GCM-SIV or ChaCha20-Poly1305 with synthetic nonces (derived from context rather than generated randomly).

Omitting characters from an alphabet

When generating a password, a naive implementation may construct an incomplete character set:

// Note: this alphabet omits certain special characters
// The entropy per character is log2(62) ≈ 5.95 bits
const chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
// Adding "!@#$%^&*()" (10 chars) gives log2(72) ≈ 6.17 bits
// A gain of 0.22 bits per character, roughly 3.5 bits over 16 chars — not negligible

Our password generator addresses this by letting you enable or disable each character category, with an unbiased selection method applied throughout.

Randomness and legal compliance

Official draws and lotteries

Legal requirements for draws and lotteries vary significantly by jurisdiction. In the United States, state lottery commissions typically require certified hardware random number generators for official draws, with independent auditing of the generation process. In the European Union, national gambling authorities impose similar requirements, often specifying FIPS 140-2 or higher certification for the RNG. For commercial promotions with prizes in the UK, the ASA and CAP Code require transparent, auditable random selection that can be verified after the fact. In all jurisdictions, the general principle is the same: the random source must be a CSPRNG or HRNG, the seed and outputs must be logged, and an independent party must be able to verify the process. Our random number generator is appropriate for informal draws; for legally binding lotteries, consult a certified auditing service.

GDPR and pseudonymization

The GDPR recognizes pseudonymization and anonymization as risk-reduction techniques. However, the Article 29 Working Party (now EDPB) clarified in Opinion 05/2014 that pseudonymization through simple deterministic hashing does not constitute anonymization: if the input space is small (social security numbers, dates of birth), the hash can be reversed by brute force. Adding a cryptographically strong random salt (generated by a CSPRNG) transforms pseudonymization into an operation that is irreversible without the key. In practice: always use a CSPRNG to generate hashing salts in systems processing personal data.

Algorithmic fairness and machine learning

In machine learning, the quality of randomness comes into play at multiple levels: initialization of neural network weights (Xavier/He initialization), shuffling of training data, sampling (bootstrap, bagging). Poorly initialized or biased PRNGs can introduce artificial correlations in training data and affect prediction fairness. Reproducibility requires a documented, fixed seed; robustness requires testing with multiple different seeds to ensure results are not seed-dependent.

Practical testing: the ent tool

To test the quality of your system's random stream, you can use the ent tool:

# Linux/macOS — test 1 MB of /dev/urandom
dd if=/dev/urandom bs=1M count=1 | ent

# Expected output for a good source:
# Entropy = 7.999982 bits per byte.
# Chi-square: 253.43 (> 10%: good)
# Arithmetic mean: 127.51 (ideal: 127.5)
# Monte Carlo Pi: 3.14159... (close to π)
# Serial correlation: 0.000012 (close to 0)

An entropy result close to 8 bits per byte indicates a near-uniform distribution — a sign of a good generator. As explained earlier, this result does not prove cryptographic security, only statistical quality.

The future: quantum randomness goes mainstream

The democratization of true quantum randomness is underway. Several convergent trends are making QRNG accessible to a wide audience of developers.

QRNG APIs for developers

The ANU Quantum Random Numbers Server (quantum.anu.edu.au) exposes a public REST API returning truly random numbers generated from quantum vacuum fluctuations. ID Quantique, a Swiss pioneer in the field, offers PCIe and USB hardware generators as well as a cloud API (Quantis). These services allow developers to feed a CSPRNG with certified quantum entropy — without requiring local quantum hardware.

Integration into browsers and operating systems

The W3C WebCrypto API specification, which exposes crypto.getRandomValues(), already relies on the system's hardware entropy sources (RDRAND, RDSEED). The expected evolution is better traceability of entropy quality — allowing applications to know whether their generator is drawing from thermal noise, a QRNG, or only software timings.

Post-quantum cryptography and randomness quality

The post-quantum algorithms standardized by NIST in 2024 (ML-KEM/Kyber, ML-DSA/Dilithium, SLH-DSA/SPHINCS+) impose specific requirements on the quality of randomness used for key generation. ML-KEM, for example, requires noise vectors generated from centered binomial distributions drawn from random bytes — an operation where an incorrect implementation (biased source bytes) can compromise the scheme's security. Post-quantum cryptography amplifies the consequences of poor randomness quality, making understanding these mechanisms even more critical for the years ahead.

Summary: choosing the right generator

Randomness in computing is not a binary subject. There is a continuum from fast but predictable PRNGs to physically non-deterministic quantum generators, with CSPRNGs providing computational guarantees sufficient for the vast majority of cryptographic applications.

The fundamental rule: use the level of randomness appropriate to the task. For games and simulations, a good PRNG like PCG64 or the Mersenne Twister is more than sufficient. For anything security-related — passwords, keys, tokens, nonces — use the platform's native CSPRNG, which combines hardware entropy with cryptographically proven algorithms. True quantum randomness provides additional guarantees useful for certain advanced protocols, but is rarely necessary in practice for application security.

What you must never do: use Math.random() to generate secrets, seed a PRNG with time(), or apply naive modulo to a generator's output. These mistakes, documented in hundreds of CVEs, have real consequences for system security.

To put these concepts into practice, our random number generator relies on crypto.getRandomValues(), guarantees the absence of modulo bias, and can be used for informal draws. For secure password generation based on the same CSPRNG principles, see our password generator. And if you need to encode or transmit random tokens safely, our Base64 encoder/decoder handles the encoding step.

Frequently asked questions

Why is Math.random() unsuitable for generating a secret key?

Math.random() is a non-cryptographic PRNG: its internal state (often 64 bits in V8) can be reconstructed by observing a handful of consecutive outputs. An attacker who observes a few generated values can predict all future ones — and reconstruct past values too. For any secret key or security token, use crypto.getRandomValues() exclusively, which delegates to the operating system's CSPRNG.

How do you generate a random number in a range without modulo bias?

The operation rand() % n introduces a bias whenever the generator's range is not a multiple of n. The fix is rejection sampling: discard values in the biased zone and draw again until you land in the unbiased region. In JavaScript, loop over crypto.getRandomValues() with a rejection check. In Python, secrets.randbelow(n) implements this correction automatically and is the recommended approach.

What is the difference between rolling a physical die and using a software generator?

A physical die draws its randomness from chaotic Newtonian physics: initial velocity, angle, surface friction — unpredictable in practice. A software PRNG is entirely deterministic: the same seed always yields the same sequence. A CSPRNG is computationally indistinguishable from true randomness for any adversary with bounded resources, but remains deterministic from its internal state. Only a hardware generator (thermal noise, quantum) is fundamentally non-deterministic.

Is quantum randomness truly more random than a well-implemented CSPRNG?

In practical cryptography, a well-seeded CSPRNG is computationally indistinguishable from quantum randomness for any polynomial-time adversary. Quantum randomness provides a stronger philosophical guarantee (fundamental physical indeterminism) and is useful for seeding CSPRNGs or for QKD protocols. For the vast majority of applications, the difference between a quality CSPRNG and a QRNG is purely theoretical.

How can you run a legally recognized random draw or lottery?

Requirements vary by jurisdiction. Generally, a draw with legal or monetary value requires a certified auditable tool with full logging (seed, algorithm, results, timestamp) and an independent auditor or notary to verify the process. The random source must be a CSPRNG or hardware RNG. Our random number generator is appropriate for informal draws; for legally binding lotteries, consult a certified auditing service or legal professional in your jurisdiction.