Part 2: Technical Overview

The Incentive Layer

Inactivity leak

When the beacon chain is not finalising it enters a special "inactivity leak" mode.
Attesters receive no rewards. Non-participating validators receive increasingly large penalties based on their track records.
This is designed to restore finality in the event of the permanent failure of large numbers of validators.

Introduction

If the beacon chain hasn't finalised a checkpoint for longer than MIN_EPOCHS_TO_INACTIVITY_PENALTY (4) epochs, then it enters "inactivity leak" mode¹.
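
The trigger condition can be sketched as follows. This is a minimal illustration rather than the client implementation: the function and parameter names are hypothetical, and measuring the delay from the previous epoch is an assumption modelled on the consensus specification.

```python
MIN_EPOCHS_TO_INACTIVITY_PENALTY = 4


def finality_delay(previous_epoch: int, finalized_epoch: int) -> int:
    """Number of epochs since the last finalised checkpoint (assumed definition)."""
    return previous_epoch - finalized_epoch


def is_in_inactivity_leak(previous_epoch: int, finalized_epoch: int) -> bool:
    """True when finality has been absent for longer than the threshold."""
    return finality_delay(previous_epoch, finalized_epoch) > MIN_EPOCHS_TO_INACTIVITY_PENALTY


# A delay of exactly four epochs is tolerated; five triggers the leak.
print(is_in_inactivity_leak(previous_epoch=10, finalized_epoch=6))
print(is_in_inactivity_leak(previous_epoch=11, finalized_epoch=6))
```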

The inactivity leak is a kind of emergency state in which rewards and penalties are modified as follows.

- Attesters receive no attestation rewards while attestation penalties are unchanged.
- Any validators deemed inactive have their inactivity scores raised, leading to an additional inactivity penalty that potentially grows quadratically with time. This is the inactivity leak, sometimes known as the quadratic leak.
- Proposer and sync committee rewards are unchanged.

The idea for the inactivity leak was proposed in the original Casper FFG paper. The problem it addresses is that of how to recover finality (liveness, in some sense) in the event that over one-third of validators go offline. Finality requires votes from validators representing at least 2/3 of the total stake, so it cannot be reached while more than one-third of the stake is absent.

The mechanism works as follows. When loss of finality is detected the inactivity leak gradually reduces the stakes of validators who are not making attestations until, eventually, the participating validators control 2/3 of the remaining stake. They can then begin to finalise checkpoints once again.
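
To get a feel for the timescale, here is a rough sketch that estimates how long the leak must run before the online validators hold two-thirds of the remaining stake. It rests on several simplifying assumptions: all validators start with equal balances, offline validators are completely offline, online validators lose nothing, ejection and ordinary attestation penalties are ignored, and offline balances decay according to the continuous approximation $B(t)=B_0e^{-t^2/2\alpha}$ from the Mathematics section. The function name is hypothetical.

```python
import math

INACTIVITY_PENALTY_QUOTIENT_BELLATRIX = 2 ** 24


def epochs_to_regain_finality(offline_fraction: float,
                              alpha: int = INACTIVITY_PENALTY_QUOTIENT_BELLATRIX) -> float:
    """Estimate the epochs of leak needed for online stake to reach 2/3 of the total."""
    if not 0 <= offline_fraction < 1:
        raise ValueError("offline_fraction must be in [0, 1)")
    if offline_fraction <= 1 / 3:
        return 0.0  # finality was never lost
    # Online stake (1 - f) must be at least twice the remaining offline stake f * r,
    # so the offline balances must shrink to the ratio r = (1 - f) / (2f).
    target_ratio = (1 - offline_fraction) / (2 * offline_fraction)
    # Invert r = exp(-t^2 / (2 * alpha)) for t.
    return math.sqrt(2 * alpha * math.log(1 / target_ratio))


print(round(epochs_to_regain_finality(0.4)))
```

Under these assumptions, 40% of the stake going offline takes a little over 3100 epochs to resolve. For large offline fractions the estimate is too pessimistic, since ejection of the offline validators would intervene first.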

This inactivity penalty mechanism is designed to protect the chain long-term in the face of catastrophic events (sometimes referred to as the ability to survive World War III). The result might be that the beacon chain could permanently split into two independent chains either side of a network partition, and this is assumed to be a reasonable outcome for any problem that can't be fixed in a few weeks. In this sense, the beacon chain formally prioritises availability over consistency. (You can't have both.)

In any case, it provides a powerful incentive for stakers to fix any issues they have and to get back online.

The reason why no validators receive attestation rewards during an inactivity leak is once again due to the possibility of discouragement attacks. An attacker might deliberately drive the beacon chain into an inactivity leak, perhaps by a combination of censorship and denial of service attack on other validators. This would cause the non-participants to suffer the leak, while the attacker continues to attest normally. We need to increase the cost to the attacker in this scenario, which we do by not rewarding attestations at all during an inactivity leak.

As with penalties, the amounts subtracted from validators' beacon chain accounts due to the inactivity leak are effectively burned, reducing the overall net issuance of the beacon chain.

Mathematics

Let's study the effect of the leak on a single validator's balance, assuming that during the period of the inactivity leak (non-finalisation) the validator is completely offline.

At each epoch, the offline validator will be penalised an amount proportional to $tB / \alpha$ , where $t$ is the number of epochs since the chain last finalised, $B$ is the validator's effective balance, and $\alpha$ is the prevailing inactivity penalty quotient (currently INACTIVITY_PENALTY_QUOTIENT_BELLATRIX).

The effective balance $B$ will remain constant for a while, by design, during which time the total amount of the penalty after $t$ epochs would be $t(t+1)B / 2\alpha$ : the famous "quadratic leak". If $B$ were continuously variable, the penalty would satisfy $\frac{dB}{dt}=-\frac{tB}{\alpha}$ , which can be solved to give the exponential $B(t)=B_0e^{-t^2/2\alpha}$ . The actual behaviour is somewhere between these two (piecewise quadratic) since the effective balance is neither constant nor continuously variable but decreases in a step-wise fashion.
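
For readers who want the intermediate steps, both results can be derived in a few lines. The sketch below assumes the per-epoch penalty is exactly $tB/\alpha$ (that is, a proportionality constant of one) and that the initial balance is $B_0$ at $t=0$.

```latex
% Constant effective balance: sum the per-epoch penalties
\sum_{k=1}^{t} \frac{kB}{\alpha} = \frac{t(t+1)B}{2\alpha}

% Continuously variable balance: separate variables and integrate
\begin{aligned}
\frac{dB}{dt} = -\frac{tB}{\alpha}
  \;&\Longrightarrow\; \frac{dB}{B} = -\frac{t}{\alpha}\,dt \\
\int_{B_0}^{B(t)} \frac{dB'}{B'} = -\frac{1}{\alpha}\int_{0}^{t} t'\,dt'
  \;&\Longrightarrow\; \ln\frac{B(t)}{B_0} = -\frac{t^2}{2\alpha} \\
  \;&\Longrightarrow\; B(t) = B_0 e^{-t^2/2\alpha}
\end{aligned}
```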

In the continuous approximation, the inactivity penalty quotient, $\alpha$ , is the square of the time it takes to reduce the balance of a non-participating validator to $1 / \sqrt{e}$ , or around 60.7% of its initial value. With the value of INACTIVITY_PENALTY_QUOTIENT_BELLATRIX at $2^{24}$ , this equates to 4096 epochs, or 18.2 days.
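
The two limiting models can be compared numerically. The following is a small sketch with hypothetical function names; it evaluates the fraction of the initial balance retained under the constant-effective-balance model and under the continuous model at the characteristic time $\sqrt{\alpha}$.

```python
import math

ALPHA = 2 ** 24  # INACTIVITY_PENALTY_QUOTIENT_BELLATRIX


def retained_constant_balance(t: int, alpha: int = ALPHA) -> float:
    """Fraction retained if the effective balance never changed (pure quadratic leak)."""
    return 1 - t * (t + 1) / (2 * alpha)


def retained_continuous(t: int, alpha: int = ALPHA) -> float:
    """Fraction retained if the effective balance tracked the balance continuously."""
    return math.exp(-t * t / (2 * alpha))


t_char = math.isqrt(ALPHA)  # characteristic time in epochs
print(t_char)
print(round(retained_constant_balance(t_char), 4))
print(round(retained_continuous(t_char), 4))
```

At 4096 epochs the quadratic model has removed about half of the balance while the continuous model retains about 60.7%; the real, step-wise behaviour lies between the two.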

For Phase 0 of the beacon chain, the value of INACTIVITY_PENALTY_QUOTIENT was increased by a factor of four from $2^{24}$ to $2^{26}$ , so that validators would be penalised less severely if there were non-finalisation due to implementation problems in the early days. As it happens, there were no instances of non-finalisation during the whole eleven months of Phase 0 of the beacon chain.

The value was decreased by one quarter in the Altair upgrade from $2^{26}$ (INACTIVITY_PENALTY_QUOTIENT) to $3 \cdot 2^{24}$ (INACTIVITY_PENALTY_QUOTIENT_ALTAIR), and to its final value of $2^{24}$ (INACTIVITY_PENALTY_QUOTIENT_BELLATRIX) in the Bellatrix upgrade. Decreasing the inactivity penalty quotient speeds up recovery of finalisation in the event of an inactivity leak.
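
Because the characteristic time in the continuous approximation is $\sqrt{\alpha}$, the effect of each change is easy to tabulate. The sketch below assumes an epoch of 32 slots of 12 seconds (6.4 minutes); the helper names are illustrative.

```python
import math

SECONDS_PER_EPOCH = 32 * 12  # assumption: 32 slots of 12 seconds each

QUOTIENTS = {
    "INACTIVITY_PENALTY_QUOTIENT": 2 ** 26,
    "INACTIVITY_PENALTY_QUOTIENT_ALTAIR": 3 * 2 ** 24,
    "INACTIVITY_PENALTY_QUOTIENT_BELLATRIX": 2 ** 24,
}


def characteristic_epochs(quotient: int) -> float:
    """Epochs for an offline validator's balance to fall to 1/sqrt(e) of its start."""
    return math.sqrt(quotient)


def epochs_to_days(epochs: float) -> float:
    return epochs * SECONDS_PER_EPOCH / 86_400


for name, quotient in QUOTIENTS.items():
    epochs = characteristic_epochs(quotient)
    print(f"{name}: {epochs:.0f} epochs, {epochs_to_days(epochs):.1f} days")
```

Since the time scales with the square root of the quotient, the overall reduction by a factor of four from Phase 0 to Bellatrix halves the characteristic time.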

Inactivity scores

During Phase 0, the inactivity penalty was an increasing global amount applied to all validators that did not participate in an epoch, regardless of their individual track records of participation. So a validator that was able to participate for a significant fraction of the time could still be quite severely penalised due to the growth of the inactivity penalty. Vitalik gives a simplified example: "if fully [off]line validators get leaked and lose 40% of their balance, someone who has been trying hard to stay online and succeeds at 90% of their duties would still lose 4% of their balance. Arguably this is unfair." We found during the Medalla testnet incident that keeping a validator online when all around you is chaos is not easy. We don't want to punish stakers who are honestly doing their best.

To improve this, the Altair upgrade introduced individual validator inactivity scores that are stored in the state. Validators' scores are updated each epoch as follows.

At the end of epoch $N$ , irrespective of the inactivity leak,
- decrease a validator's score by one when it made a correct and timely target vote in epoch $N-1$ , and
- increase the validator's score by INACTIVITY_SCORE_BIAS (four) otherwise.
When not in an inactivity leak,
- decrease every validator's score by INACTIVITY_SCORE_RECOVERY_RATE (sixteen).
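
The update rules above can be sketched as a single function. This is an illustration with a hypothetical name and signature, not the specification's code; the `min` calls implement the floor of zero on the score.

```python
INACTIVITY_SCORE_BIAS = 4
INACTIVITY_SCORE_RECOVERY_RATE = 16


def update_inactivity_score(score: int, timely_target_vote: bool, in_leak: bool) -> int:
    """Return a validator's inactivity score after one end-of-epoch update."""
    # Applied irrespective of the inactivity leak.
    if timely_target_vote:
        score -= min(1, score)
    else:
        score += INACTIVITY_SCORE_BIAS
    # Applied only when the chain is not in an inactivity leak.
    if not in_leak:
        score -= min(INACTIVITY_SCORE_RECOVERY_RATE, score)
    return score


print(update_inactivity_score(100, timely_target_vote=True, in_leak=False))   # net -17
print(update_inactivity_score(100, timely_target_vote=False, in_leak=False))  # net -12
print(update_inactivity_score(100, timely_target_vote=False, in_leak=True))   # net +4
```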

Graphically, the flow-chart looks like this.

How each validator's inactivity score is updated. The happy flow is right through the middle. "Active", when updating the scores at the end of epoch $N$ , means having made a correct and timely target vote in epoch $N-1$ .

Note that there is a floor of zero on the score.

When not in an inactivity leak validators' inactivity scores are reduced by INACTIVITY_SCORE_RECOVERY_RATE + 1 per epoch when they make a timely target vote, and by INACTIVITY_SCORE_RECOVERY_RATE - INACTIVITY_SCORE_BIAS when they don't. So, even for non-performing validators, scores decrease outside a leak.

When in a leak, if $p$ is the participation rate between $0$ and $1$ , and $\lambda$ is INACTIVITY_SCORE_BIAS, then the expected score after $N$ epochs is $\max (0, N((1-p)\lambda - p))$ . For $\lambda = 4$ this is $\max (0, N(4 - 5p))$ . So a validator that is participating 80% of the time or more can maintain a score that is bounded near zero. With less than 80% average participation, its score will increase unboundedly.
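
The drift formula is simple enough to check directly. The sketch below uses hypothetical helper names and treats the participation rate as a long-run average, so it gives the expected trend rather than any particular random trajectory.

```python
INACTIVITY_SCORE_BIAS = 4


def expected_leak_score(epochs: int, participation: float,
                        bias: int = INACTIVITY_SCORE_BIAS) -> float:
    """Expected inactivity score after `epochs` of leak at a given participation rate."""
    drift = (1 - participation) * bias - participation  # expected change per epoch
    return max(0.0, epochs * drift)


def break_even_participation(bias: int = INACTIVITY_SCORE_BIAS) -> float:
    """Participation rate at which the expected drift is zero."""
    return bias / (1 + bias)


print(break_even_participation())
for p in (1.0, 0.9, 0.7, 0.0):
    print(p, expected_leak_score(100, p))
```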

This is nice because, if many validators are able to participate intermittently, it indicates that whatever event has befallen the chain is potentially recoverable, unlike a permanent network partition, or a super-majority network fork, for example. The inactivity leak is intended to bring finality to irrecoverable situations, so prolonging the time to finality if it's recoverable is likely a good thing.

The following graph illustrates some scenarios. We have an inactivity leak that starts at epoch zero and ends after 100 epochs, after which finality is recovered and we are no longer in the leak. There are five validators. Working up from the lowest line, they are:

- Always online: correctly registering a timely target vote in every epoch. The inactivity score remains at zero.
- 90% online: the inactivity score remains bounded near zero. From the analysis above, it is expected that anything better than 80% online will bound the score near zero.
- 70% online: the inactivity score grows slowly over time.
- Generally online, but offline between epochs 50 and 75: the inactivity score is zero during the initial online period; grows linearly and fairly rapidly while offline during the leak; declines slowly when back online during the leak; and declines rapidly once the leak is over.
- Always offline: the inactivity score increases rapidly during the leak, and declines even more rapidly once the leak is over.

The inactivity scores of five different validator personas in an inactivity leak that starts at epoch zero and ends at epoch 100 (labelled "End" and shown with a dashed line). The dotted lines labelled "A" and "B" mark the start and end of the offline period for the fourth validator.
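
The three deterministic personas can be reproduced with a short simulation. This is a sketch under stated assumptions: the leak covers epochs 0 to 99, "offline between epochs 50 and 75" is taken to mean epochs 50 to 74 inclusive, and the one-epoch lag between a vote and the score update is ignored. The function name is hypothetical.

```python
from typing import Callable, List

INACTIVITY_SCORE_BIAS = 4
INACTIVITY_SCORE_RECOVERY_RATE = 16
LEAK_END = 100  # assumption: the leak covers epochs 0..99


def simulate_scores(is_online: Callable[[int], bool], epochs: int = 150) -> List[int]:
    """Return the inactivity score after each epoch's update."""
    scores = []
    score = 0
    for epoch in range(epochs):
        if is_online(epoch):
            score -= min(1, score)
        else:
            score += INACTIVITY_SCORE_BIAS
        if epoch >= LEAK_END:  # recovery applies only outside the leak
            score -= min(INACTIVITY_SCORE_RECOVERY_RATE, score)
        scores.append(score)
    return scores


always_online = simulate_scores(lambda epoch: True)
offline_50_to_75 = simulate_scores(lambda epoch: not (50 <= epoch < 75))
always_offline = simulate_scores(lambda epoch: False)

print(max(always_online))
print(offline_50_to_75[74], offline_50_to_75[99])
print(always_offline[99], always_offline.index(0, LEAK_END) - LEAK_END + 1)
```

In this model the always-offline validator peaks at a score of 400 and needs 34 further epochs to return to zero, while the fourth validator peaks at 100 and is back to zero 5 epochs after the leak ends.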

Inactivity penalties

The inactivity penalty is applied to all validators at every epoch based on their individual inactivity scores, irrespective of whether a leak is in progress or not. When there is no leak, the scores return to zero (rapidly for active validators, less rapidly for inactive ones), so most of the time this is a no-op.

The penalty for validator $i$ is calculated as

\begin{split} s_{i}B_{i} / (\tt{INACTIVITY\_SCORE\_BIAS} \times \tt{INACTIVITY\_PENALTY\_QUOTIENT\_BELLATRIX}) \\ = \frac{s_{i}B_{i}}{4 \times 16{,}777{,}216} \end{split}

where $s_i$ is the validator's inactivity score, and $B_i$ is the validator's effective balance.
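
As a worked example, the formula can be evaluated for a validator with an effective balance of 32 ETH. The sketch below works in Gwei and uses integer division, as the ejection simulation later in this section does; the function name is illustrative.

```python
INACTIVITY_SCORE_BIAS = 4
INACTIVITY_PENALTY_QUOTIENT_BELLATRIX = 2 ** 24
GWEI_PER_ETH = 10 ** 9


def inactivity_penalty(score: int, effective_balance: int) -> int:
    """Per-epoch inactivity penalty in Gwei for a given score and effective balance."""
    denominator = INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX
    return effective_balance * score // denominator


# One missed epoch (score 4) versus one hundred missed epochs (score 400).
print(inactivity_penalty(4, 32 * GWEI_PER_ETH))    # 1907 Gwei
print(inactivity_penalty(400, 32 * GWEI_PER_ETH))  # 190734 Gwei
```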

This penalty is applied at each epoch, so (for constant $B_i$ ) the total penalty is proportional to the area under the curve of the inactivity score, above. With the same five validator personas we can quantify the penalties in the following graph.

- Always online: no penalty due to the leak.
- 90% online: negligible penalty due to the leak.
- 70% online: the total penalty grows quadratically but slowly during the leak, and rapidly stops after the leak ends.
- Generally online, but offline between epochs 50 and 75: a growing penalty during the leak, that rapidly stops when the leak ends.
- Always offline: we can clearly see the quadratic nature of the penalty in the initial parabolic shape of the curve. After the end of the leak it takes around 35 epochs for the penalties to return to zero.

The balance retained by each of the five validator personas after the inactivity leak penalty has been applied. The scenario is identical to the chart above.
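
The "area under the curve" picture can be checked for the always-offline validator by summing the per-epoch penalties and comparing with the closed form $t(t+1)B/2\alpha$ from the Mathematics section. This sketch assumes a constant effective balance of 32 ETH and that the score is incremented before each epoch's penalty is computed; the function names are hypothetical.

```python
INACTIVITY_SCORE_BIAS = 4
INACTIVITY_PENALTY_QUOTIENT_BELLATRIX = 2 ** 24
GWEI_PER_ETH = 10 ** 9


def total_leak_penalty(epochs: int, effective_balance: int) -> int:
    """Sum of per-epoch inactivity penalties (Gwei) for a fully offline validator."""
    score = 0
    total = 0
    for _ in range(epochs):
        score += INACTIVITY_SCORE_BIAS  # assumption: score updated before the penalty
        total += effective_balance * score // (
            INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX)
    return total


def closed_form_penalty(epochs: int, effective_balance: int) -> float:
    """The quadratic-leak formula t(t+1)B / (2 * alpha)."""
    return epochs * (epochs + 1) * effective_balance / (2 * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX)


simulated = total_leak_penalty(100, 32 * GWEI_PER_ETH)
print(simulated / GWEI_PER_ETH)
print(closed_form_penalty(100, 32 * GWEI_PER_ETH) / GWEI_PER_ETH)
```

Under these assumptions a 100-epoch leak costs a fully offline validator a little under 0.01 ETH in inactivity penalties alone; the small difference between the two figures comes from integer rounding in each epoch.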

We can see that the new scoring system means that some validators will continue to be penalised due to the leak even after finalisation starts again. This is intentional. When the leak causes the beacon chain to finalise, at that point we have just two-thirds of the stake online. If we immediately stop the leak (as we used to), then the amount of stake online would remain close to two-thirds and the chain would be vulnerable to flipping in and out of finality as small numbers of validators come and go. We saw this behaviour on some of the testnets prior to launch. Continuing the leak after finalisation serves to increase the share of the total stake held by participating validators to more than two-thirds, providing a buffer that should mitigate such behaviour.

Ejection

It is not necessary for non-participating validators to be ejected from the active validator set in order for the inactivity leak to be effective at regaining finality. Reducing the proportion of the total stake held by those non-participating validators is sufficient.

Nonetheless, a validator will be exited when its effective balance drops to EJECTION_BALANCE. This is taken care of in the end of epoch registry updates. Note that, due to the way that effective balance is calculated, the ejection will happen when the actual balance drops below 16.75 ETH.
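
The 16.75 ETH figure follows from the hysteresis applied when effective balances are updated. The sketch below is modelled on the specification's end-of-epoch effective balance update; the two multiplier constants and the 1 ETH increment are not given in this section and are assumptions taken from the specification.

```python
ETH = 10 ** 9  # Gwei per ETH
EFFECTIVE_BALANCE_INCREMENT = 1 * ETH  # assumed from the specification
MAX_EFFECTIVE_BALANCE = 32 * ETH
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1     # assumed from the specification
HYSTERESIS_UPWARD_MULTIPLIER = 5       # assumed from the specification


def update_effective_balance(balance: int, effective_balance: int) -> int:
    """Return the new effective balance (Gwei) given the actual balance."""
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
    downward_threshold = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER
    upward_threshold = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER
    if (balance + downward_threshold < effective_balance
            or effective_balance + upward_threshold < balance):
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance


# At exactly 16.75 ETH the effective balance is still 17 ETH ...
print(update_effective_balance(16_750_000_000, 17 * ETH) // ETH)
# ... but one Gwei lower it drops to 16 ETH, the ejection balance.
print(update_effective_balance(16_750_000_000 - 1, 17 * ETH) // ETH)
```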

We can simulate how long it would take for a completely offline validator to be ejected due solely to the inactivity leak. It will be slightly sooner in reality due to the additional penalties for missing attestations.

For a validator starting the leak period with an actual balance of 32 ETH, the simulation shows that it would take 4686 epochs (almost 3 weeks) for it to be ejected. We can also take this as a rough upper-bound on how long it would take the beacon chain to recover finality, however many validators went offline².

Ejection simulation code

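GWEI = 10 ** 9  # Gwei per ETH; all balances below are in Gwei
EJECTION_BALANCE = 16 * GWEI
MAX_EFFECTIVE_BALANCE = 32 * GWEI
HYSTERESIS_QUOTIENT = 4
INACTIVITY_SCORE_BIAS = 4
INACTIVITY_PENALTY_QUOTIENT = 2 ** 24  # the value of INACTIVITY_PENALTY_QUOTIENT_BELLATRIX

# Simplified hysteresis for monotonically decreasing balance: the effective
# balance steps down once the balance falls more than 0.25 ETH below it.
def calc_effective_balance(balance):
    return min(MAX_EFFECTIVE_BALANCE, (balance + GWEI // HYSTERESIS_QUOTIENT) // GWEI * GWEI)

epoch = 0
score = 0
balance = 32 * GWEI
effective_balance = calc_effective_balance(balance)
while effective_balance > EJECTION_BALANCE:
    # Apply this epoch's inactivity penalty, then refresh the effective balance
    balance -= effective_balance * score // (INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT)
    effective_balance = calc_effective_balance(balance)
    # The validator is always offline, so its score rises every epoch
    score += INACTIVITY_SCORE_BIAS
    epoch += 1

print(balance / GWEI)             # final actual balance in ETH
print(effective_balance // GWEI)  # final effective balance in ETH
print(epoch)                      # number of epochs until ejection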