Part 2: Technical Overview

The Incentive Layer

Inactivity leak

When the beacon chain is not finalising it enters a special "inactivity leak" mode.
Attesters receive no rewards. Non-participating validators receive increasingly large penalties based on their track records.
This is designed to restore finality in the event of the permanent failure of large numbers of validators.

Introduction

If the beacon chain hasn't finalised a checkpoint for longer than MIN_EPOCHS_TO_INACTIVITY_PENALTY (4) epochs, then it enters "inactivity leak" mode.

The inactivity leak is a kind of emergency state in which rewards and penalties are modified as follows.

Attesters receive no attestation rewards while attestation penalties are unchanged.
Any validators deemed inactive have their inactivity scores raised, leading to an additional inactivity penalty that potentially grows quadratically with time. This is the inactivity leak, sometimes known as the quadratic leak.
Proposer and sync committee rewards are unchanged.

The idea for the inactivity leak was proposed in the original Casper FFG paper. The problem it addresses is that of how to recover finality (liveness, in some sense) in the event that over one-third of validators goes offline. Finality requires a majority vote from validators representing 2/3 of the total stake.

The mechanism works as follows. When loss of finality is detected the inactivity leak gradually reduces the stakes of validators who are not making attestations until, eventually, the participating validators control 2/3 of the remaining stake. They can then begin to finalise checkpoints once again.

This inactivity penalty mechanism is designed to protect the chain long-term in the face of catastrophic events (sometimes referred to as the ability to survive World War III). The result might be that the beacon chain could permanently split into two independent chains either side of a network partition, and this is assumed to be a reasonable outcome for any problem that can't be fixed in a few weeks. In this sense, the beacon chain formally prioritises availability over consistency. (You can't have both.)

In any case, it provides a powerful incentive for stakers to fix any issues they have and to get back online.

The reason why no validators receive attestation rewards during an inactivity leak is once again due to the possibility of discouragement attacks. An attacker might deliberately drive the beacon chain into an inactivity leak, perhaps by a combination of censorship and denial of service attack on other validators. This would cause the non-participants to suffer the leak, while the attacker continues to attest normally. We need to increase the cost to the attacker in this scenario, which we do by not rewarding attestations at all during an inactivity leak.

As with penalties, the amounts subtracted from validators' beacon chain accounts due to the inactivity leak are effectively burned, reducing the overall net issuance of the beacon chain.

Mathematics

Let's study the effect of the leak on a single validator's balance, assuming that during the period of the inactivity leak (non-finalisation) the validator is completely offline.

At each epoch, the offline validator will be penalised an amount proportional to $tB / \alpha$ , where $t$ is the number of epochs since the chain last finalised, $B$ is the validator's effective balance, and $\alpha$ is the INACTIVITY_PENALTY_QUOTIENT.

The effective balance $B$ will remain constant for a while, by design, during which time the total amount of the penalty after $t$ epochs would be $t(t+1)B / 2\alpha$ : the famous "quadratic leak". If $B$ were continuously variable, the penalty would satisfy $\frac{dB}{dt}=-\frac{tB}{\alpha}$ , which can be solved to give the exponential $B(t)=B_0e^{-t^2/2\alpha}$ . The actual behaviour is somewhere between these two since the effective balance decreases in a step-wise fashion.

In the continuous case, the INACTIVITY_PENALTY_QUOTIENT, $\alpha$ , is the square of the time it takes to reduce the balance of a non-participating validator to $1 / \sqrt{e}$ , or around 60.7% of its initial value. With the value of INACTIVITY_PENALTY_QUOTIENT at 3 * 2**24, this equates to around seven thousand epochs, or 31.5 days.

For Phase 0 of the beacon chain, the value of INACTIVITY_PENALTY_QUOTIENT was increased by a factor of four from $2^{24}$ to $2^{26}$ , so that validators would be penalised less severely if there were non-finalisation due to implementation problems in the early days. As it happens, there were no instances of non-finalisation during the whole eleven months of Phase 0 of the beacon chain.

The value was decreased by one quarter in the Altair upgrade from $2^{26}$ to $3 \times 2^{24}$ as a step towards eventually setting it to its final value. Decreasing the inactivity penalty quotient speeds up recovery of finalisation in the event of an inactivity leak.

Inactivity scores

During Phase 0, the inactivity penalty was an increasing global amount applied to all validators that did not participate in an epoch, regardless of their individual track records of participation. So a validator that was able to participate for a significant fraction of the time could still be quite severely penalised due to the growth of the inactivity penalty. Vitalik gives a simplified example: "if fully [off]line validators get leaked and lose 40% of their balance, someone who has been trying hard to stay online and succeeds at 90% of their duties would still lose 4% of their balance. Arguably this is unfair." We found during the Medalla testnet incident that keeping a validator online when all around you is chaos is not easy. We don't want to punish stakers who are honestly doing their best.

To improve this, the Altair upgrade introduced individual validator inactivity scores that are stored in the state. The scores are updated each epoch as follows.

Every epoch, irrespective of the inactivity leak,
- decrease the score by one when the validator makes a correct timely target vote, and
- increase the score by INACTIVITY_SCORE_BIAS (four) otherwise.
When not in an inactivity leak,
- decrease every validator's score by INACTIVITY_SCORE_RECOVERY_RATE (sixteen).

Graphically, the flow-chart looks like this.

How each validator's inactivity score is updated. The happy flow is right through the middle.

Note that there is a floor of zero on the score.

When not in an inactivity leak validators' inactivity scores are reduced by INACTIVITY_SCORE_RECOVERY_RATE + 1 per epoch when they make a timely target vote, and by INACTIVITY_SCORE_RECOVERY_RATE - INACTIVITY_SCORE_BIAS when they don't. So, even for non-performing validators, scores decrease outside a leak.

When in a leak, if $p$ is the participation rate between $0$ and $1$ , and $\lambda$ is INACTIVITY_SCORE_BIAS, then the expected score after $N$ epochs is $\max (0, N((1-p)\lambda - p))$ . For $\lambda = 4$ this is $\max (0, N(4 - 5p))$ . So a validator that is participating 80% of the time or more can maintain a score that is bounded near zero. With less than 80% average participation, its score will increase unboundedly.

This is nice because, if many validators are able to participate intermittently, it indicates that whatever event has befallen the chain is potentially recoverable, unlike a permanent network partition, or a super-majority network fork, for example. The inactivity leak is intended to bring finality to irrecoverable situations, so prolonging the time to finality if it's recoverable is likely a good thing.

The following graph illustrates some scenarios. We have an inactivity leak that starts at zero, and ends after 100 epochs, after which finality is recovered and we are no longer in the leak. There are five validators. Working up from the lowest line, they are:

Always online: correctly registering a timely target vote in every epoch. The inactivity score remains at zero.
90% online: the inactivity score remains bounded near zero. From the analysis above, it is expected that anything better than 80% online will bound the score near zero.
70% online: the inactivity score grows slowly over time.
Generally online, but offline between epochs 50 and 75: the inactivity score is zero during the initial online period; grows linearly and fairly rapidly while offline during the leak; declines slowly when back online during the leak; and declines rapidly once the leak is over.
Always offline: the inactivity score increases rapidly during the leak, and declines even more rapidly once the leak is over.

The inactivity scores of five different validator personas in an inactivity leak that starts at zero and ends at epoch 100 (labelled "End" and shown with a dashed line). The dotted lines labelled "A" and "B" mark the start and end of the offline period for the fourth validator.

Inactivity penalties

The inactivity penalty is applied to all validators at every epoch based on their individual inactivity scores, irrespective of whether a leak is in progress or not. When there is no leak, the scores return to zero (rapidly for active validators, less rapidly for inactive ones), so most of the time this is a no-op.

The penalty for validator $i$ is calculated as

\begin{split} s_{i}B_{i} / (\tt{INACTIVITY\_SCORE\_BIAS} \times \tt{INACTIVITY\_PENALTY\_QUOTIENT\_ALTAIR}) \\ = \frac{s_{i}B_{i}}{4 \times 50{,}331{,}648} \end{split}

where $s_i$ is the validator's inactivity score, and $B_i$ is the validator's effective balance.

This penalty is applied at each epoch, so (for constant $B_i$ ) the total penalty is proportional to the area under the curve of the inactivity score, above. With the same five validator persona's we can quantify the penalties in the following graph.

Always online: no penalty due to the leak.
90% online: negligible penalty due to the leak.
70% online: the total penalty grows quadratically but slowly during the leak, and rapidly stops after the leak ends.
Generally online, but offline between epochs 50 and 75: a growing penalty during the leak, that rapidly stops when the leak ends.
Always offline: we can clearly see the quadratic nature of the penalty in the initial parabolic shape of the curve. After the end of the leak it takes around 35 epochs for the penalties to return to zero.

The balance retained by each of the five validator personas after the inactivity leak penalty has been applied. The scenario is identical to the chart above.

We can see that the new scoring system means that some validators will continue to be penalised due to the leak even after finalisation starts again. This is intentional. When the leak causes the beacon chain to finalise, at that point we have just two-thirds of the stake online. If we immediately stop the leak (as we used to), then the amount of stake online would remain close to two-thirds and the chain would be vulnerable to flipping in and out of finality as small numbers of validators come and go. We saw this behaviour on some of the testnets prior to launch. Continuing the leak after finalisation serves to increase the balances of participating validators to greater than two-thirds, providing a buffer that should mitigate such behaviour.