Part 2: Technical Overview

The Building Blocks

Randomness

Assigning beacon chain duties unpredictably is an important defence against some attacks.
The beacon chain maintains a RANDAO to accumulate randomness.
Duties such as proposing blocks, committee assignments, and sync committee participation are assigned based on the RANDAO, with a limited lookahead period.
Block proposers verifiably contribute randomness to the RANDAO via BLS signatures over the epoch number.
Validators are able to bias the RANDAO to a small extent but this is not significant problem in practice.

Introduction

An element of randomness is an important part of a permissionless blockchain protocol, both for security and for fairness.

A protocol that is fully predictable could work well in a benign environment. But we must assume that our protocols will come under attack, and predictability provides attackers with opportunities - just as the bad guys in crime thrillers often take advantage of their victims' predictable routines.

An attacker with advance knowledge of which validators will be active in different roles has a significant foothold for mounting an attack. For example, to selectively mount denial of service attacks against future proposers, or to bribe members of a particular committee, or to register especially advantageous validator numbers for themselves allowing them to take over a future committee, or simply to censor transactions.¹

To quote from a paper by Brown-Cohen et al,²

Intuitively, it is good for protocols to be unpredictable in the sense that miners do not learn that they are eligible to mine a block until shortly before it is due to be mined. Many attacks, such as double-spending, or selfish-mining, can become much more profitable if miners know in advance when they become eligible to mine.

Unpredictability, arising from randomness, is an excellent first line of defence against many attacks.

Unpredictability in Proof of Work comes from the process used to mine blocks. A block is valid only if it satisfies a certain condition, and the only way to satisfy that condition is through trial and error. Miners make a random guess, test it, and try again if it's not correct - this is the "work" in Proof of Work. Only if the guess is correct is the block valid and the miner gets to extend the chain. As I write, the difficulty of the Ethereum PoW chain is around 12.5 Peta hashes. That means that mining an Ethereum block requires $1.25 \times 10^{16}$ guesses on average. This is similar to the odds of rolling 21 dice until they all come up six on the same roll. It is fabulously unlikely, yet somewhere on the Ethereum network somebody manages to do it every 13 seconds or so. Since the process is uniform – nobody is better at guessing (rolling dice) than anyone else – it provides fairness. Every Giga hash per second is equivalent to every other Giga hash per second (although there are other sources of unfairness in Proof of Work). And since guessing is random it provides unpredictability, which mitigates the attacks mentioned above.

Randomness³ in Ethereum's Proof of Stake protocol is used to bring unpredictability to the selection of block proposers, and to the membership of the committees that attest to blocks and sign sync data.

In this section we will look at the way that randomness is introduced into the beacon chain, some of the ways in which it is used, and finally some of the issues with the current scheme.

The RANDAO

The beacon chain design has always used a RANDAO⁴ mechanism to provide its in-protocol randomness. A RANDAO is simply an accumulator that incrementally gathers randomness from contributors. So, with each block, the proposer mixes in a random contribution to the existing RANDAO value.

To unpack that a little, the beacon chain maintains a RANDAO value. Every block included in the chain contains a verifiable random value provided by the validator that proposed it, its randao_reveal. As each block is processed the beacon chain's RANDAO value is mixed with the randao_reveal from the block. Thus, over time, the RANDAO accumulates randomness from all the block proposers.

If $R_n$ is the RANDAO value after $n$ contributions, and $r_n$ is the $n$ th randao_reveal, then the following holds. Here we are mixing in the new contribution using the xor function, $\oplus$ . Alternatives might be to use a sum or a hash, but xor is simple and has useful properties.

R_n = r_n \oplus R_{n-1}

We can think of a RANDAO as being like a deck of cards that's passed round the table, each person shuffling it in turn: the deck gets repeatedly re-randomised. Even if one contributor's randomness is weak, the cumulative result has a high level of entropy.

We can imagine the RANDAO as a deck of cards that accumulates randomness over time as each participant shuffles the deck in turn.

Current and past RANDAO values are stored in the beacon state in the randao_mixes field. The current value is updated by process_randao with every block that the beacon chain processes. If there is no block in a slot then the RANDAO is not updated. In addition to the RANDAO's current value, EPOCHS_PER_HISTORICAL_VECTOR (minus one) past values of the RANDAO at the ends of epochs are also stored in the state. These can be used to recalculate past committee assignments, which allows historical attestations to be slashed even months later.

Source of randomness

With every block the proposer includes a field randao_reveal that is its contribution to be mixed in to the RANDAO.

This contribution needs to satisfy two properties: it should be unpredictable by any other node, yet it should be verifiable by all nodes.

"Verifiable" means that, although random (read pseudo-random), the RANDAO contribution value must not be arbitrary. The proposer must not be able to pick and choose its contribution, otherwise it will just choose a value that advantages itself in some way. There must be a single valid contribution that the proposer can make in any given block, and all the other nodes must be able to verify that contribution.

The old: hash onions

Early ideas for verifiable randomness had each validator pre-committing to a "hash onion". Before joining the beacon chain a validator would generate a random number. When registering its initial deposit the validator would include the result of repeatedly cryptographically hashing that number a large number (thousands) of times as a commitment. Then when proposing a block the randao_reveal would be the pre-image of that commitment: one layer would be "peeled off the onion". Since a cryptographic hash is not invertible, only the proposer could calculate this value, but it's easily verifiable by everyone. Then the reveal gets stored as the new commitment and so on. This scheme is viable, but has complexities and edge cases – for example if a proposer's block gets orphaned everybody (except the beacon chain) can now see the reveal – that make it clunky.

The new: BLS signatures

A natural alternative to the hash onion became available when we moved to using BLS signatures in the protocol. With the BLS scheme every validator already has a closely guarded random value: the secret key that it signs blocks and attestations with. As far as anyone knows the signatures produced are uniformly random. Therefore, BLS signatures generated by block proposers are an excellent source of the randomness that we need for updating the RANDAO.

Using signatures rather than the hash onion both simplifies the protocol and provides for multi-party (distributed) validators. The aggregation properties of BLS signatures allow signatures from multiple validators to be combined as a threshold signature so that they can effectively act as a single validator. This valuable property would be very difficult with the hash onion approach.

For these reasons we now use a BLS signature as the entropy contribution to the RANDAO, that is, the randao_reveal.

Where does the entropy come from?

Evidently the predominant source of randomness in the Ethereum 2 protocol is the secret keys of the validators. If every validator key is generated uniformly randomly and independently then each contributes 256 bits of entropy to the overall pool. However, keys are sometimes not independently generated⁵. EIP-2333 provides a way to derive multiple validator keys from a single entropy seed, and large stakers are likely to have done this. Thus, the total entropy from $N$ validator keys will be less than $N \times 256$ bits, but we don't know how much less.

Some other sources of entropy for the RANDAO are noted in EIP-4399.

Missed or orphaned block proposals directly affect the RANDAO's output. Network conditions, node faults, or maintenance downtime can all lead to missed block proposals that have a degree of randomness.
The total number of active validators in an epoch affects the selection of proposers which in turn affects participation in the RANDAO. Thus, deposits and exits (both voluntary and forced) contribute entropy.
A validator's effective balance affects its likelihood of being selected to propose a block. Thus, changes in effective balances (perhaps due to one or more validators being offline for a period of time) add entropy.

Updating the RANDAO

When a validator proposes a block, it includes a field randao_reveal which has BLSSignature type. This is the proposer's signature over the epoch number, using it's normal signing secret key.

The randao_reveal is computed by the proposer as follows, the privkey input being the validator's random secret key.

def get_epoch_signature(state: BeaconState, block: BeaconBlock, privkey: int) -> BLSSignature:
    domain = get_domain(state, DOMAIN_RANDAO, compute_epoch_at_slot(block.slot))
    signing_root = compute_signing_root(compute_epoch_at_slot(block.slot), domain)
    return bls.Sign(privkey, signing_root)

When a block is processed, the randao_reveal is mixed into the RANDAO like this:

def process_randao(state: BeaconState, body: BeaconBlockBody) -> None:
    epoch = get_current_epoch(state)
    # Verify RANDAO reveal
    proposer = state.validators[get_beacon_proposer_index(state)]
    signing_root = compute_signing_root(epoch, get_domain(state, DOMAIN_RANDAO))
    assert bls.Verify(proposer.pubkey, signing_root, body.randao_reveal)
    # Mix in RANDAO reveal
    mix = xor(get_randao_mix(state, epoch), hash(body.randao_reveal))
    state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR] = mix

Two things are going on in the processing of the randao_reveal signature.

First, the signature is verified using the proposer's public key before being mixed in. This means that the proposer has almost no choice about what it contributes to the RANDAO: it either contributes a single verifiable value – the correct signature over the epoch number – or it withholds its block and contributes nothing. (Equivalently, a block with an incorrect reveal is invalid.)

Second, the hash of the signature is mixed in to the beacon state's RANDAO using xor. The combination of using the epoch number as the signed quantity and using xor to mix it in leads to a subtle, albeit tiny, improvement in attack-resistance of the RANDAO.

What's really happening when the RANDAO is shuffled. The signature over the epoch number is the RANDAO reveal that the proposer includes in its block. This is hashed then mixed in to the existing RANDAO with an xor operation.

Justin Drake explains in his notes:

Using xor in process_randao is (slightly) more secure than using hash. To illustrate why, imagine an attacker can grind randomness in the current epoch such that two of his validators are the last proposers, in a different order, in two resulting samplings of the next epochs. The commutativity of xor makes those two samplings equivalent, hence reducing the attacker's grinding opportunity for the next epoch versus hash (which is not commutative). The strict security improvement may simplify the derivation of RANDAO security formal lower bounds.

We will see shortly that it can be advantageous to an attacker to have control of the last slots of an epoch. Justin's point is that, under the current scheme, the attacker having validators $V_0, V_1$ in the two last slots of an epoch is equivalent to it having $V_1, V_0$ with respect to the randao_reveals. This fractionally reduces an attackers choices when it comes to influencing the RANDAO. If we used hash rather than xor, or if we signed over the slot number rather than the epoch number, these orderings would result in different outcomes from each other, giving an attacker more choice and therefore more power.

Lookahead

We started this section with a discussion of unpredictability. Ideally, it should not be possible to predict the duties for any block proposer or committee member until the moment they become active. However, in practice, proposers and committee members need a little advance notice of their duties to allow them to join the right p2p network subnets and do whatever other preparation they need to do.

The RANDAO seed at the end of epoch $N$ is used to compute validator duties for the whole of epoch $N+2$ . This interval is controlled by MIN_SEED_LOOKAHEAD via the get_seed() function. Thus validators have at least one full epoch to prepare themselves for any duties, but no more than two.

Under normal circumstances, then, an attacker is not able to predict the duty assignments more than two epochs in advance. However, if an attacker has a large proportion of the stake or is, for example, able to mount a DoS attack against block proposers for a while, then it might be possible for the attacker to predict the output of the RANDAO further ahead than MIN_SEED_LOOKAHEAD would normally allow. The attacker might then use this foreknowledge to strategically exit validators or make deposits⁶ in order to gain control of a committee, or a large number of block proposal slots.

It's certainly not an easy attack. Nonetheless it's easy to defend against, so we might as well do so.

To prevent this, we assume a maximum feasible lookahead that an attacker might achieve, MAX_SEED_LOOKAHEAD and delay all activations and exits by this amount, which allows time for new randomness to come in via block proposals from honest validators, making irrelevant any manipulation by the entering or exiting validators. With MAX_SEED_LOOKAHEAD set to 4, if only 10% of validators are online and honest, then the chance that an attacker can succeed in forecasting the seed beyond (MAX_SEED_LOOKAHEAD - MIN_SEED_LOOKAHEAD) = 3 epochs is $0.9^{3\times 32}$ , which is about 1 in 25,000.

The RANDAO value at the end of epoch $N$ is used to set duties for epoch $N+2$ , which is controlled by MIN_SEED_LOOKAHEAD. A validator exiting in epoch $N+1$ remains active until at least the end of epoch $N+5$ (depending on the exit queue). This is controlled by MAX_SEED_LOOKAHEAD.

Single Secret Leader Election

As currently implemented, both the minimum and maximum lookaheads smell a little of engineering hackery. In a perfect design only the block proposer would know ahead of time that it has been chosen to propose in that slot. Once its block is revealed then the rest of the network would be able to verify that, yes, this was indeed the chosen proposer. This feature is called Single Secret Leader Election. We do not yet have it in the Ethereum protocol, and I shall write about it elsewhere. Meanwhile, some good progress is being made towards making it practical.

RANDAO biasability

The RANDAO value for an epoch is set at the end of the previous epoch, and duty assignments for the entire epoch (proposals and committee memberships) depend on that value. (Actually – due to MIN_SEED_LOOKAHEAD – on the RANDAO value at the end of the last-but-one epoch, but we'll overlook that in what follows.)

Future duty assignments for validators – block proposers, committee members, sync committee duty – are calculated based on the state of the RANDAO at the end of each epoch.

Thus, when a validator happens to be assigned to propose a block in the last slot of an epoch, it gains a small amount of control over the assignments for the next epoch. This is because it can choose to reveal its block, which mixes in its RANDAO reveal, or it can choose (at a cost) to withhold its block and keep the existing RANDAO value, knowing that there will be no subsequent RANDAO change before duties are calculated. In this way, a validator is able to exert a little influence over the proposer and committee assignments in the next epoch. This is called "one bit of influence" over the RANDAO as the validator has a choice of two outcomes.

The last proposer in an epoch has a choice. It can propose its block as usual, updating the RANDAO, resulting in a set of duty assignments $A$ . Or it can withhold its block, leaving the RANDAO as-is, resulting in a set of duty assignments $B$ . If outcome $B$ gives the owner of the validator sufficient advantage to compensate for having missed a proposal, then it is an opportunity to "cheat".

If an attacker gets a string of proposals at the end of an epoch then it has more power. Having $k$ consecutive proposals at the end of an epoch gives the attacker $2^k$ choices for the ultimate value of the RANDAO that will be used to compute future validator duties. In this scenario the attacker has " $k$ bits of influence" over the RANDAO.

Biasability analyses

This section is fully optional. I got a bit carried away with the maths and it's fine to skip to the next section.

To make discussion of RANDAO biasability more concrete I shall try to quantify what it means in practice with a couple of examples. In each case the entity "cheating" or "attacking" has control over a proportion of the stake $r$ , either directly or through some sort of collusion, and we will assume that the remaining validators are all acting independently and correctly. We will also assume, of course, that individual randao_reveals are uniformly random.

In the first example, I will try to gain control of the RANDAO by permanently acquiring proposals in the last slots of an epoch. In the second example I will try to improve my expected number of block proposals by biasing the RANDAO when I get the opportunity to do so. In both cases I will be selectively making and withholding proposals having computed the best outcome: a process of "grinding" the RANDAO.

These examples are intended only as illustrations. They are not academic studies, and there are lots of loose ends. It's very likely I've messed something up: probability is hard. I'd be very interested if anyone wanted to make them more rigorous and complete. Some related work, more simulation based, was previously done by Runtime Verification.

RANDAO takeover

If I control a proportion $r$ of the total stake, how much can I boost my influence over the protocol by manipulating the RANDAO?

The ability to influence the RANDAO depends on controlling a consecutive string of block proposals at the end of an epoch. We shall call this property "having a tail", and the tail will have a length $k$ from 0 to a maximum of 32, an entire epoch.

Our question can be framed like this: if I have a tail of length $k$ in one epoch, what is my expected length of tail in the next epoch? With a tail of length $k$ I have $2^k$ opportunities to reshuffle the RANDAO by selectively making or withholding block proposals. Can I grind through the possibilities to increase my tail length next time, and eventually take over the whole epoch?

In the absence of any manipulation, my probability of having a tail of length exactly $k$ in any given epoch is $(1-r)r^k$ for $k < 32$ , and $r^{32}$ when $k = 32$ . This is the chance that I make $k$ proposals in the tail positions preceded by a proposal that I did not make.

q_k = \begin{cases} (1-r)r^k & 0 \leq k < 32 \\ r^k & k = 32 \end{cases}

So the expected tail length for someone controlling a proportion $r$ of the stake is,

E(r) = \sum_{n=1}^{32} n q_n = \sum_{n=1}^{31} n (1-r) r^n + 32 r^{32}

The bottom axis is $r$ , and the side axis is my expected proposals tail length $E(r)$ assuming no RANDAO manipulation.

Now we will calculate $E^{(k)}(r)$ , the expected length of tail I can achieve in the next epoch by using my previous tail of length $k$ to grind the options.

Consider the case where I have a tail of length $k = 1$ in some epoch. This gives me two options: I can publish my RANDAO contribution or I can withhold my RANDAO contribution (by withholding my block). My strategy is to choose the longest tail for the next epoch that I can gain via either of these options.

The probability, $p^{(1)}_j$ , of gaining a tail of exactly length $j$ as a result of having a tail of length 1 is,

p^{(1)}_j = 2\sum_{i=0}^{j-1}q_{j}q_{i} + q_{j}q_{j} = q_j \left( 2\sum_{i=0}^{j-1}q_i + q_j \right)

We can think about this as follows. With $k = 1$ we get two attempts, therefore $q$ appears twice in each product. To calculate $p^{(1)}_j$ we need the sum over the all the combinations of the probability of getting a tail of length exactly $j$ (that is, $q_j$ ) multiplied by the probability of getting a tail of $j$ or less (that is, not getting a tail longer than $j$ , otherwise we would have chosen that length instead of $j$ ).

Visually, calculating $p^{(1)}_2$ looks like the sum of the values in the shaded area of the next diagram.

The probability that we get a maximum tail length of exactly two with two attempts is the sum of the terms in the shaded areas. Despite the overlap, each term is included only once.

This example with tail length $k = 1$ results in a two-dimensional square since we have two possibilities to try. One way to calculate $p^{(1)}_j$ is to take the difference between the sum of all the products in the square side $j + 1$ and the sum of all the products in the square side $j$ .

Thinking of it like this helps us to generalise to the cases when $k > 1$ . In those cases we are dealing with a hyper-cube of dimension $2^k$ ; each element is the product of $2^k$ values of $q$ . To calculate $p^{(k)}_j$ we can find the difference between the sum of all the products in the $2^k$ -dimensional cube side $j + 1$ and the sum of all the products in the $2^k$ -dimensional cube side $j$ . This is tedious to write down and involves a mind-boggling number of calculations even for quite small $k$ , but see my example code for an efficient a way to calculate it.

Now, finally, we can calculate the expected tail length in the next epoch given that we have a tail of length $k$ in this epoch.

E^{(k)}(r) = \sum_{n=1}^{32} n p^{(k)}_n

Graphing this for various values of $k$ we get the following. Note that the solid, $k = 0$ , line is the same as $E(r)$ above - the expected tail with no manipulation. That is, $E^{(0)}(r) = E(r)$ as you'd expect.

The bottom axis is $r$ , and the side axis is my subsequent expected proposals tail length, $E^{(k)}(r)$ given various values of tail length $k$ that I can play with. Note that $E^{(0)}(r) = E(r)$ from the graph above.

We see that, if I end up with any length of tail in an epoch, I can always grind my RANDAO contributions to improve my expected length of tail in the next epoch when compared with not grinding the RANDAO. And the longer the tail I have, the better the tail I can expect to have in the next epoch. These results are not surprising.

The important question is, under what circumstances can I use this ability in order to indefinitely increase my expected tail length, so that I can eventually gain full control of the RANDAO?

To investigate this, consider the following graph. Here, for each $k$ line we have plotted $E^{(k)}(r) - k$ . This allows us to see whether our expected tail in the next epoch is greater or less than our current tail. If $E^{(k)}(r) - k$ is negative then I can expect to have fewer proposals in the next epoch than I have in this one.

The bottom axis is $r$ , and the side axis is my subsequent expected proposals tail length minus my current tail length, $E^{(k)}(r) - k$ for various values of $k$ .

We can see that for $r$ less than around 0.5, especially as $k$ grows, we expect our tail length to shrink rather than grow, despite our best RANDAO grinding efforts. However, for $r$ greater than 0.5, we expect our tail length to grow as a result of our grinding, whatever tail length we start with.

For completeness, we shouldn't only look at expectations, but also at probabilities. The following graph shows the probability that if I have a tail of length $k$ then I will have a tail of length less than $k$ in the next epoch. As $k$ increases you can see that a step function is forming: for a proportion of stake less than about 50% it becomes practically certain that my tail will decrease in length from one epoch to the next despite my best efforts to grow it; conversely, for a proportion of stake greater than a little over 50% it becomes practically certain that I can maintain or grow my tail of block proposals.

The bottom axis is $r$ , and the side axis is the probability that my best tail length in the next epoch is less than my current tail length for various values of tail length $k$ .

Discussion of RANDAO takeover

What can we conclude from this? If I control less than about half the stake, then I cannot expect to be able to climb the ladder of increasing tail length: with high probability the length of tail I have will decrease rather than increase. Whereas, if I have more than half the stake, my expected length of tail increases each epoch, so I am likely to be able to eventually take over the RANDAO completely. With high enough $r$ , the $2^k$ options I have for grinding the RANDAO overwhelm the probability of losing tail proposals. For large values of $k$ it will not be practical to grind through all of these options, but we need to arrive at only one good combination in order to succeed so we might not need to do the full calculation.

The good news is that, if attackers control more than half the stake, they have more interesting attacks available, such as taking over the LMD fork choice rule. So we generally assume in the protocol that any attacker has less than half the stake, in which case the RANDAO takeover attack appears to be infeasible.

As a final observation, we have ignored cases where two or more of the tail proposals are from the same validator. As discussed above, these proposals would each result in the same RANDAO contribution and reduce my grinding options. However, with a large number of validators in the system this is a reasonable approximation to make.

Code for calculating the length of tail with cheating

Here is the code for generating the data for the graphs above. The length of tail goes up to $k = 12$ . Feel free to increase that, although it gets quite compute intensive. Twelve is enough to see the general picture.

def prob_tail_eq(r, k):
    return (1 - r) * r**k if k < N else r**k

# The sum of the products of all the q_i in the hypercube of side j and dim k
# Recursive is cooler, but written iteratively so that python doesn't run out of stack
def hyper(q, j, k):
    h = 1
    for n in range(1, k + 1):
        h = sum([q[i] * h for i in range(j)])
    return h

# Smoke test.
assert abs(hyper([0.9, 0.09, 0.009, 0.0009, 0.00009, 0.00001], 6, 32) - 1.0) < 1e-12

N    = 32 # The number of slots per epoch
KMAX = 12 # The maximum length of prior tail we will consider
NINT = 20 # The number of intervals of r between 0 and 1 to generate

expected = [[] for i in range(KMAX + 1)]
prob_dec = [[] for i in range(KMAX + 1)]
rs = [i / NINT for i in range(1, NINT)]
for r in rs:
    # q[j] = the probability of having a tail of exactly j in one attempt
    q = [prob_tail_eq(r, j) for j in range(N + 1)]
    for k in range(KMAX + 1):
        h = [hyper(q, j, 2**k) for j in range(N + 2)]
        # p[j] = the probability that with a tail of k I can achieve a tail of j in the next epoch
        p = [h[j + 1] - h[j] for j in range(N + 1)]
        # The expected length of tail in the next epoch given r and k
        expected[k].append(sum([j * p[j] for j in range(N + 1)]))
        # The probability of a decrease in tail length to < k
        prob_dec[k].append(h[k])
print(rs)
print(expected)
print(prob_dec)

Block proposals boost

For the second worked example I will try to improve the overall number of proposals that I get among my validators. Unlike in the first example, I will not be trying to maximise my advantage at any cost. I will only manipulate the RANDAO when I can do so without any net cost to myself.

Once again, I control a proportion $r$ of the stake. I will only be considering tails of length zero or of length one - going beyond that gets quite messy, and my intuition is that for values of $r$ less than a half or so it will make little difference.

Let $q_j$ be my probability of getting exactly $j$ proposals in an epoch without any manipulation of the RANDAO (different from the $q$ in the first example, but related):

q_j = r^j{(1-r)}^{32-j}{32 \choose j}

My expected number of proposals per epoch when acting honestly is simple to compute,

E = \sum_{n=1}^{32} n q_n = 32 r

Now I will try to bias the RANDAO to give myself more proposals whenever I have the last slot of an epoch, which will happen with probability $r$ . Doing this, my expected number of proposals in the next epoch is as follows. The prime is to show that I am trying to maximise my advantage (cheat), and the subscript is to show that we are looking one epoch ahead.

E'_1 = \sum_{n=1}^{32} n ((1 - r) q_n + r p_n)

Unpacking this, the first term in the addition is the probability, $1 - r$ , that I did not have the last slot in the previous epoch (so I cannot do any biasing) combined with the usual probability $q_n$ of having $n$ proposals in an epoch.

The second term is the probability, $r$ , that I did have the last slot in the previous epoch combined with the probability $p_n$ that I get either $n$ proposals by proposing my block, or $n + 1$ proposals by withholding my block. We need the plus one to make up for the block I would be withholding at the end of the previous epoch in order to get this outcome.

p_j = \begin{cases} \sum_{i=0}^{j} q_i (q_j + q_{j+1}) & 0 \leq j < 32 \\ \sum_{i=0}^{j} q_i q_j & j = 32 \end{cases}

As before, we can illustrate this by considering the matrix of probabilities. With a tail of one I have two choices: to propose or to withhold. To achieve a net number of exactly $j$ proposals we are looking for the combinations where either of the following holds.

Proposing gives me exactly $j$ proposals and withholding gives no more than $j+1$ (that is, $\sum_{i=0}^{j+1}q_{i}q_{j}$ ). These are the elements in the horizontal bar in the diagram below.
Proposing gives me no more than $j$ proposals and withholding gives me exactly $j + 1$ (that is, $\sum_{i=0}^{j}q_{j+1}q_i$ ).⁷ These are the elements in the vertical bar in the diagram below.

Note that the $q_{j+1}q_j$ element appears in both outcomes, but must be included only once.

The probability that we get a net number of exactly two proposals with two attempts is the sum of the terms in the shaded areas. Despite the overlap, each term is included only once.

We can iterate this epoch by epoch to calculate the maximum long-term improvement in my expected number of proposals. The probability that I gain the last slot of epoch $N$ is $E'_N / 32$ .

E'_{N+1} = \sum_{n=1}^{32} n \left((1 - \frac{E'_N}{32}) q_n + \frac{E'_N}{32} p_n\right)

The solid line is $E$ , the expected number of block proposals per epoch for a proportion of the stake that does not seek to bias the RANDAO. The dashed line is $E'$ , the long-term expected number of block proposals per epoch for a proportion of the stake that coordinates to bias the RANDAO in its favour.

The maximum percentage gain in block proposals that I can acquire is shown in the following graph.

The long-term percentage increase in the expected number of proposals per epoch that can be gained by a proportion of the stake coordinating to bias the RANDAO. An entity with 25% of the stake can gain an extra 2.99% of proposals (8.24 per epoch rather than exactly 8), assuming that the remaining stakers are uncoordinated.

Code for calculating the expected number of proposals with cheating

The following Python code calculates $E'_N$ to convergence.

def fac(n):
    return n * fac(n - 1) if n else 1

def choose(n, k):
    return fac(n) / fac(k) / fac(n - k)

def prob(n, k, r):
    return r**k * (1 - r)**(n - k) * choose(n, k)

nintervals = 20
for idx in range(1, nintervals + 1):
    r = r0 = idx / nintervals
    q = [prob(32, j, r0) for j in range(33)]

    p = []
    for j in range(33):
        p.append(sum([q[i] * q[j] + (q[j + 1] * q[i] if (j < 32) else 0) for i in range(j + 1)]))

    # Iterate to convergence
    e = 0
    while (e == 0 or abs(e - e_old) > 0.000001):
        e_old = e
        e = sum([i * (q[i] * (1 - r) + p[i] * r) for i in range(33)])
        r = e / 32

    print(r0, r0 * 32, e, 100 * (e / (r0 * 32) - 1))

Discussion of proposals boost

In the above analysis we considered only the effect of using the last slot of an epoch to bias the RANDAO and saw that an entity with any amount of stake can fractionally improve its overall expected number of block proposals, assuming that everyone else is acting honestly.

The expected gain may be higher if we consider using the two last slots, or the $k$ last slots, especially if combined with the previous tail-extension attack. But I expect that for $r$ less than a half or so any further improvement will be very small.

Verifiable delay functions

We've seen that, although the RANDAO is biasable, it is not so biasable as to break the protocol: for our purposes the randomness is "good enough".

Nonetheless, it is interesting to explore how it might be improved, especially as, with The Merge, the RANDAO contents will be available to Ethereum's smart contract layer. Randomness biasability in a large lottery contract, for example, could be more of a problem than biasability in the consensus layer.

The long-term fix for biasability is to use a verifiable delay function (VDF). A VDF is guaranteed to be slow to compute its output, but that output can be verified quickly. In practice the VDF is a calculation run on a specialised hardware device that is assumed to have a performance within a small factor of the theoretical maximum performance. So, a VDF might output a result in, say, 20 seconds with the assumption that the best that any other device could do is to obtain the result in, say, 5 seconds.

The idea is that RANDAO updates would come from the output of the VDF. A proposer would have to decide whether to commit its randao_reveal before it is possible for it to compute the actual contribution: the future output of the VDF. This eliminates any opportunistic biasing of the RANDAO.

Only one VDF needs to be active at any time on the network since it can publish its result for quick verification by all the other nodes.

Although a lot of work has been done on designing and specifying VDFs there is no active plan to implement one in Ethereum at this time.