How random is random?
And why should you care?
Questions
In my first Computer Science seminar, I was tasked with creating a simple guessing game in Python, making use of random numbers. I got as far as:
import random
x = random.randint(1, 10)
print(x)
At which point, I started to wonder how exactly the script was deciding which number to give me. Programs are deterministic, aren't they? A quick Google search will sort this out, I thought. Yet, after two hours, the Wikipedia rabbit hole was generating more open tabs than answers. As the seminar ended, and the emptying of the room forced me unceremoniously from my daze, I managed to come to one unsatisfying conclusion:
Generating random numbers is complicated.
For an "efficient" (read: lazy) student like me, the prospect of understanding something so complex begs a much more important question:
Why should I care how random numbers are generated?
The answer to that, at least, is clear. Random numbers underpin modern cryptography. Every encrypted message, private key, and secure channel depends on randomness at its foundation. Without randomness, using an algorithm to encrypt your private data is about as worthwhile as swapping your bike’s D-lock for a sign that says “stealing is against the law”. In central London.
I am assuming here that you do not want your bike or your data stolen, but this isn’t a philosophy essay, so I’m allowed to do that.
Unpredictability
“The root of trust in cryptographic systems lies in the unpredictability of the random values used in key generation.”
If the starting value (the seed) of a given algorithm is predictable, an attacker can reconstruct any output generated from it; secret keys, private data, etc. The assumption of unpredictability is critical. Great, let’s have some of these unpredictable numbers then, job done.
Wrong. Job not done. Job not even nearly done. In actuality, all of our critical data is protected by numbers that are, in principle, predictable, with quite some variation in how difficult predicting them is. In the worst case, this could take an attacker seconds. In the best case, it would still take an attacker seconds, just enough of them to add up to several trillion years.
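To see why a predictable seed is fatal, here is a toy sketch in Python (not a real attack, and the timestamp seed is a made-up example): if an attacker can guess the seed, they can replay the entire "random" output stream.

```python
import random

# Hypothetical scenario: a key is derived from a PRNG seeded with a
# predictable value, e.g. a coarse timestamp the attacker can guess.
guessable_seed = 1700000000  # made-up "seconds since epoch" value

victim = random.Random(guessable_seed)
secret_key = [victim.randint(0, 255) for _ in range(16)]

# The attacker simply replays the same seed...
attacker = random.Random(guessable_seed)
recovered_key = [attacker.randint(0, 255) for _ in range(16)]

print(secret_key == recovered_key)  # True: the "secret" is fully reconstructed
```

The unpredictability of everything downstream is capped by the unpredictability of the seed; the algorithm itself adds none.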
You may be wondering where your system lands on this scale. If so, you may be relieved to read that you can measure how bad of a job your system’s Random Number Generator (RNG) is doing:
Within strict bounds
If you’re running Linux
On open-source hardware
with a significant investment of your time, effort, and money. Does that sound good? No, probably not.
Is there another option? Yes; place your trust in historically untrustworthy hardware vendors, under pressure from intelligence organisations run by governments with unintelligible intentions and morally questionable methods (there's a lot more to this, but that's as much as I could throw into one sentence).
If I have been at all clear, I hope that by this point we can agree that:
We need random (i.e. unpredictable) numbers for our security systems to work.
Not all RNGs produce equally unpredictable output.
An encryption algorithm will, at best, not make that output any worse.
Entropy
“Entropy is the quantitative measure of uncertainty in a system. For a random variable, high entropy means less predictability; low entropy implies structure or bias that an adversary can exploit.”
If algorithms can’t give us true randomness, where do we turn? This is where True Random Number Generators (TRNGs) come in. Unlike pseudo-random (i.e. not random unless given random input) algorithms, TRNGs rely on physical processes. Such processes are well understood in physics as being unpredictable; messy, noisy, chaotic phenomena. Examples include thermal fluctuations (from thermodynamics, a notably dangerous field), avalanche noise (very spooky and confusing quantum stuff), and electrical instability in hardware (not just wonky wiring).
For the purposes of encryption, predicting the outcomes of these processes is infeasible - the resources required are well above anyone's computational limits. By measuring, or sampling, these processes, a TRNG provides the entropy (the raw unpredictability) that cyber security systems depend on.
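You can tap into this kind of entropy from everyday code. A minimal sketch, assuming a reasonably modern OS: Python's os.urandom() reads from a kernel entropy pool that is itself fed by physical noise sources (interrupt timing jitter and, where available, dedicated hardware RNGs).

```python
import os

# os.urandom() returns bytes drawn from the kernel's entropy pool,
# which mixes in physical noise sampled by the operating system.
seed_material = os.urandom(32)  # 256 bits of (hopefully) full entropy

print(len(seed_material), seed_material.hex())
```

How much *actual* entropy is behind those 32 bytes is exactly the question the rest of this piece worries about.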
Entropy, in this context, is simply a measure of uncertainty. A coin flip with a perfectly balanced coin has one bit of entropy. A deterministic algorithm, by contrast, has zero. In cryptography, we want every bit of a key to be as close as possible to that ideal "unpredictable" state. Why not just use TRNG output directly? We need a lot of unpredictable numbers. Measuring physical processes and turning the results into bits is a lot slower (or more expensive) than mathematically expanding given bits into more bits. This is why we use TRNGs to provide the unpredictable seeds (initial random input) that algorithms can then expand into streams of random-looking bits.
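The coin-flip numbers above come straight from Shannon's entropy formula. A small sketch, in case you want to play with the bias yourself:

```python
import math

def shannon_entropy(p_heads: float) -> float:
    """Entropy, in bits, of a single coin flip with bias p_heads."""
    if p_heads in (0.0, 1.0):
        return 0.0  # the outcome is certain: zero entropy
    p_tails = 1.0 - p_heads
    return -(p_heads * math.log2(p_heads) + p_tails * math.log2(p_tails))

print(shannon_entropy(0.5))  # fair coin: 1.0 bit
print(shannon_entropy(0.9))  # biased coin: ~0.47 bits
print(shannon_entropy(1.0))  # deterministic "flip": 0.0 bits
```

A key built from biased bits has less entropy than its length suggests, which is precisely the structure an adversary can exploit.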
But no TRNG is flawless. They can fail, drift, or be tampered with. Standards like NIST SP 800-90B emphasise not just statistical testing of the output, but also continuous health checks and auditability. Physics and maths papers can show that a process generates unpredictable output. Institutions can provide rigorous standards for hardware implementation and measurement of those outputs. Open source communities can post schematics and documentation showing a product's alignment with these criteria. But none of them can prove that the TRNG device you have in your hand has any relation to the claims made about it. You might, however, be able to prove those claims yourself, if you can tell what's going on in the image below. It took me longer than I'd care to admit, but, being a student, I had nothing better to do.
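To make "continuous health checks" concrete, here is a simplified sketch of one of the health tests NIST SP 800-90B actually specifies, the Repetition Count Test, which flags a source that has got stuck emitting the same value (in the real standard, the cutoff is derived from the source's claimed entropy per sample; the value here is illustrative):

```python
def repetition_count_test(samples, cutoff=32):
    """Simplified Repetition Count Test (after NIST SP 800-90B):
    fail if any sample value repeats `cutoff` times in a row,
    which suggests the noise source is stuck or broken."""
    count = 0
    last = None
    for s in samples:
        if s == last:
            count += 1
            if count >= cutoff:
                return False  # source looks stuck: raise the alarm
        else:
            last, count = s, 1
    return True

print(repetition_count_test(bytes(range(64))))   # healthy-looking: True
print(repetition_count_test(b"\x00" * 64))       # stuck at zero: False
```

Tests like this can only catch gross failures; they cannot prove a device honest, which is the gap the rest of this paragraph is about.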
Answers
So how random is random? From the concept of randomness, we must extract a quantifiable measure: entropy. From the theoretical entropy source, we must implement a physical source, bounding our measurements by the uncertainty inherent in real-world devices. Once we have a device, we must analyse its output to show that it functions within expected levels of unpredictability, so we can know that it can be relied on, and keep making sure that it remains reliable. After all of that, we need to decide how much we care. How much quality are we willing to trade for quantity? What are our aims, how sensitive is the data we are trying to protect, and how likely are we to face cyber attacks?
When we say that our private data is protected, what we really mean is that we trust that somewhere, deep inside our machines, enough unpredictability has been captured by the TRNG to keep the system afloat, given the weighty assumption that every other part of the system built on top of the device functions correctly.
Determining where to place your trust is another question entirely; and in my estimation, this question is just as difficult to answer, and much less worth asking. So, that’s what I’ll be doing next.
Subscribe for regular updates on my confusion and despair.


