Part 3: Annotated Specification

Fork Choice

Introduction

The beacon chain's fork choice is documented separately from the main state transition specification. Like the main specification, the fork choice spec is incremental, with later versions specifying only the changes since the previous version. When annotating the main spec I combined the incremental versions into a single up-to-date document. In the following, however, I will deal separately with the original Phase 0 fork choice and the incremental Bellatrix fork choice update as the latter mainly introduced one-off functionality specific to the Merge transition.

What's a fork choice?

As described in the introduction to consensus, a fork choice rule is the means by which a node decides, given the information available to it, which block is the "best" head of the chain. A good fork choice rule results in the network of nodes eventually converging on the same canonical chain: it is able to resolve forks consistently, even under a degree of faulty or adversarial behaviour.

Ethereum's proof of stake consensus introduces a Store object that contains all the data necessary for determining a best head. A node's Store is the "source of truth" for its fork choice rule. In classical consensus terms it is a node's local view: all the relevant information that a node has about the network state. The fork choice rule can be characterised as a function, $\text{GetHead}(\text{Store}) \rightarrow \text{HeadBlock}$.
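The function-shaped view can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Block` and `Store` classes are heavily simplified stand-ins, and the rule inside `get_head` is a deliberately trivial placeholder (highest slot wins), not LMD GHOST. The point is only that the head is computed from a node's local view and nothing else, so two nodes with different views may legitimately disagree.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Block:
    root: str         # hypothetical identifier standing in for a block root
    parent_root: str
    slot: int


@dataclass
class Store:
    # A node's local view: every block it has seen so far, keyed by root.
    blocks: dict = field(default_factory=dict)


def get_head(store: Store) -> Block:
    # Placeholder rule (NOT LMD GHOST): highest slot wins, ties broken by root.
    return max(store.blocks.values(), key=lambda b: (b.slot, b.root))


genesis = Block("G", "", 0)
a = Block("A", "G", 1)
b = Block("B", "G", 2)

# Node 1 has not yet seen block B; node 2 has.
node_1 = Store({blk.root: blk for blk in (genesis, a)})
node_2 = Store({blk.root: blk for blk in (genesis, a, b)})

head_1 = get_head(node_1)
head_2 = get_head(node_2)
```

Here `head_1` is block A and `head_2` is block B: the same function applied to different local views gives different heads until the views converge.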

During the Merge event, the beacon chain's fork choice was temporarily augmented to be able to consider blocks on the Eth1 chain, in order to agree which (of potentially multiple candidates) would become the terminal proof of work block.

Overview

Ethereum's fork choice comprises the LMD GHOST fork choice rule, constrained by the Casper FFG fork choice rule. The Casper FFG rule modifies LMD GHOST by allowing only blocks descended from the last finalised¹ checkpoint to be candidates for the chain head. All earlier branches are effectively pruned out of a node's local view of the network state.

Casper FFG's role is to finalise a checkpoint. History prior to the finalised checkpoint is a linear chain of blocks with all branches pruned away. LMD GHOST is used to select the best head block at any time. LMD GHOST is constrained by Casper FFG in that it operates on the block tree only after the finalised checkpoint.
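As a rough illustration of how the constraint works, the following Python sketch runs a greedy heaviest-subtree walk that begins at the finalised block, so branches that forked off earlier are never candidates. It is a toy under stated assumptions, not the specification's algorithm: the tree, the votes, and the function names are hypothetical, every validator carries one unit of weight rather than its stake, and the walk starts at the finalised checkpoint, mirroring the simplification in the text above.

```python
from collections import defaultdict


def is_descendant(parents, root, ancestor):
    """Walk parent links from `root`; True if we pass through `ancestor`."""
    while root is not None:
        if root == ancestor:
            return True
        root = parents.get(root)
    return False


def get_head(parents, latest_votes, finalized_root):
    """Greedy LMD GHOST-style walk that starts at the finalised block."""
    children = defaultdict(list)
    for child, parent in parents.items():
        children[parent].append(child)

    def weight(root):
        # One unit of weight per validator whose latest head vote is for
        # `root` or one of its descendants (an assumption of this sketch).
        return sum(1 for vote in latest_votes.values()
                   if is_descendant(parents, vote, root))

    head = finalized_root
    while children[head]:
        # Heaviest child wins; ties are broken by root (sketch assumption).
        head = max(children[head], key=lambda r: (weight(r), r))
    return head


# Hypothetical tree (child -> parent): X is older history, F is the
# finalised block, and Z is a branch that forked off before F.
parents = {"F": "X", "Z": "X", "A": "F", "B": "F", "C": "A"}
# Latest head vote per validator index.
latest_votes = {0: "C", 1: "B", 2: "C", 3: "Z", 4: "Z", 5: "Z", 6: "Z"}

constrained_head = get_head(parents, latest_votes, finalized_root="F")
unconstrained_head = get_head(parents, latest_votes, finalized_root="X")
```

Although Z carries the most votes in this made-up example, the walk from F never sees it and settles on C. Starting the same walk from X, as if nothing had been finalised, would select Z instead.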

This combination has come to be known as "Gasper", and appears to be relatively simple at first sight. However, the emergence of various edge cases and a relentless stream of potential attacks have led third party researchers to declare that "The Gasper protocol is complex". And that remark was made before many of the fixes that we'll be reviewing in the following sections had been implemented. Vitalik himself has written:

The "interface" between Casper FFG finalization and LMD GHOST fork choice is a source of significant complexity, leading to a number of attacks that have required fairly complicated patches to fix, with more weaknesses being regularly discovered.

Despite all this, we are happily running Ethereum on top of the Gasper protocol today. We continue to incrementally add defences against known attacks, and one day we may move on from Gasper entirely, perhaps to a single slot finality protocol, or to Casper CBC. Meanwhile, Gasper is proving to be "good enough" in practice.²

Scope and terminology

These fork choice specification documents don't cover the whole mechanism. They are largely concerned only with the LMD GHOST fork choice; the Casper FFG side of things (justification and finalisation) is dealt with in the main state-transition specification.

The terms attestation, vote, and message appear frequently. An attestation is a collection of three votes: a vote for a source checkpoint, a vote for a target checkpoint, and a vote for a head block. The source and target votes are used by Casper FFG, and the head vote is used by LMD GHOST. We will mostly be concerned with head votes in the following sections, except when stated otherwise. LMD GHOST head votes are also called messages, being the "M" in "LMD".
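The three-votes-in-one structure can be sketched as a small Python data type. This is a simplified illustration rather than the specification's container: the class name `AttestationVotes` and the helper functions are hypothetical, and fields not discussed here are omitted. The head vote field is named `beacon_block_root`, following the name used for it in the specification's `AttestationData`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    epoch: int
    root: str  # hypothetical short string standing in for a 32-byte root


@dataclass(frozen=True)
class AttestationVotes:        # hypothetical, simplified container
    source: Checkpoint         # Casper FFG source vote
    target: Checkpoint         # Casper FFG target vote
    beacon_block_root: str     # LMD GHOST head vote (the "message")


def ffg_votes(att: AttestationVotes):
    """The pair of votes that Casper FFG consumes."""
    return att.source, att.target


def lmd_message(att: AttestationVotes) -> str:
    """The single vote that LMD GHOST consumes."""
    return att.beacon_block_root


att = AttestationVotes(
    source=Checkpoint(epoch=9, root="S"),
    target=Checkpoint(epoch=10, root="T"),
    beacon_block_root="H",
)
```

Splitting the accessors this way mirrors the division of labour described above: the fork choice sections mostly care about `lmd_message(att)`, while justification and finalisation work from `ffg_votes(att)`.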

Where we discuss attestations, they can be a single attestation from one validator, or aggregate attestations containing the attestations of multiple validators that made the same set of votes. It will be clear from the context which of these applies.

Decoding dev-speak

Sometimes you'll hear protocol devs say slightly obscure things like, "we can deal with that in fork choice". For example, "we can handle censorship via the fork choice".

This framing makes sense when we understand that a node's fork choice rule is its expression of which chain it prefers to follow, or prefers not to follow. No honest node wants to follow a chain that contains invalid blocks (according to the state transition), so the fork choice of all honest nodes will never select a head block that has an invalid block in its ancestry.

Similarly, nodes could modify their fork choice rule so that branches with blocks that appear to censor transactions are never selected. If nodes with sufficient validators do this, then any such block will be orphaned, strongly discouraging censorship. This works both ways, of course. A government could declare that the fork choice must ignore any branches with blocks that do not censor transactions. If enough validators – over half – choose to comply, then the whole chain will become censoring.
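One way to picture "dealing with it in fork choice" is as a filter on candidate heads: a branch is eligible only if every block in its ancestry passes some acceptability test. The Python sketch below is purely illustrative. The function names, the toy tree, and the `rejected` set are assumptions, and the predicate could equally stand for "valid under the state transition" or for some socially agreed policy.

```python
def ancestry(parents, root):
    """Yield `root` and then each of its ancestors in turn."""
    while root is not None:
        yield root
        root = parents.get(root)


def acceptable_heads(parents, leaves, is_acceptable):
    """Keep only the leaves whose whole ancestry passes the predicate."""
    return [leaf for leaf in leaves
            if all(is_acceptable(r) for r in ancestry(parents, leaf))]


# Hypothetical tree (child -> parent) with two competing tips, B and D.
parents = {"A": "G", "B": "A", "C": "A", "D": "C"}
# Blocks this node refuses to build on (invalid, or rejected by policy).
rejected = {"C"}

heads = acceptable_heads(parents, ["B", "D"], lambda r: r not in rejected)
```

In this example only B survives, because D has the rejected block C in its ancestry. Whether such a filter orphans the rejected branch network-wide depends, as noted above, on how much of the validator set applies the same filter.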

The goal of the fork choice is for the network to converge onto a single history, so there is a strong incentive to try to agree with one's peers. However, it also provides a mechanism that can be used (perhaps as an outcome of social coordination) to be opinionated about what kind of blocks are eventually included in that history.

History

Proof of stake Ethereum has a long history that we shall review elsewhere. The following milestones are significant for the current Casper FFG plus LMD GHOST implementation.

Vitalik published the original mini-spec for the beacon chain's proof of stake consensus on July 31st 2018, shortly after we had abandoned prior designs for moving Ethereum to PoS. The initial design used IMD GHOST (Immediate Message Driven GHOST) in which attestations have a limited lifetime in the fork choice³. IMD GHOST was changed to LMD GHOST (Latest Message Driven GHOST) in November 2018 due to concerns about the stability property of IMD.

The initial fork choice spec was published to GitHub in April 2019, numbering a mere 96 lines. The current Phase 0 fork choice spec has 576 lines.

Various issues have caused the fork choice specification to balloon in complexity.

In August 2019, a "decoy flip-flop attack" on LMD GHOST was identified that could be used by an adversary to delay finalisation (for a limited period of time). The defence against this is to add a check that newly considered attestations are from either the current or previous epoch only. We'll cover this under validate_on_attestation().
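The shape of that check can be sketched in a few lines of Python. This is a simplified illustration, not the text of validate_on_attestation(): the function name `is_target_epoch_recent` is hypothetical, and the sketch assumes that the check is applied to the epoch of the attestation's target checkpoint. The value of 32 slots per epoch is the mainnet setting.

```python
SLOTS_PER_EPOCH = 32  # mainnet value
GENESIS_EPOCH = 0


def compute_epoch_at_slot(slot: int) -> int:
    return slot // SLOTS_PER_EPOCH


def is_target_epoch_recent(target_epoch: int, current_slot: int) -> bool:
    """True if the target epoch is the current or the previous epoch."""
    current_epoch = compute_epoch_at_slot(current_slot)
    # There is no epoch before genesis, so clamp at GENESIS_EPOCH.
    previous_epoch = max(current_epoch - 1, GENESIS_EPOCH)
    return target_epoch in (current_epoch, previous_epoch)


current_slot = 100  # falls in epoch 3
accepted = [e for e in range(6) if is_target_epoch_recent(e, current_slot)]
```

With the node's clock at slot 100, only attestations targeting epochs 2 and 3 would be newly considered; older ones, which the attack relies on releasing late, are ignored.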

In September 2019 a "bouncing attack" on Casper FFG was identified that can delay finalisation indefinitely. Up to the Capella spec release we had a fix for this that only allowed the fork choice's justified checkpoint to be updated during the early part of an epoch. The fix was removed in the Capella upgrade since it adds significant complexity to the fork choice, and in any case can be worked around by splitting honest validators' views. The bouncing attack is very difficult to set up and an adversary with the power to do this could probably attack the chain in more interesting ways. The bouncing attack and its original fix remain documented in the Bellatrix edition.

Around November 2019 it became apparent that, in order to properly apply Casper FFG to LMD GHOST, it is necessary to filter "unviable branches" from the fork choice. This is discussed in detail in the section, Why prune unviable branches?

In July 2021, an edge case was identified in which (if 1/3 of validators were prepared to be slashed) the invariant that the store's justified checkpoint must be a descendant of the finalised checkpoint could become violated. A fix to the on_tick() handler was implemented to maintain the invariant.

In November 2021, some overly complicated logic was identified in the on_block() handler that could lead to the Store retaining inconsistent finalised and justified checkpoints, which would in turn cause filter_block_tree() to fail. Over one third of validators would have had to be slashed to trigger the fault, but the resulting fix turned out to be a nice simplification in any case.

Proposer boost was also added in November 2021. This is a defence against potential balancing attacks on LMD GHOST that could prevent Casper FFG from finalising. We'll cover this in detail in the proposer boost section.

A new type of balancing attack was published in January 2022 that relies on the attacker's validators making equivocating attestations (multiple different attestations at the same slot). To counter this, a defence against equivocating indices was added in March 2022. We'll discuss this when we get to the on_attester_slashing() handler. This defence was bolstered in the Capella spec update by excluding all slashed validators from having an influence in the fork choice.
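The effect of the equivocation defence can be sketched as a filter applied when tallying head votes. The Python below is a toy under stated assumptions: the function name and data are hypothetical, each validator counts for one unit rather than its stake, and the set of equivocators is simply handed in rather than being built up from attester slashings as the on_attester_slashing() handler does.

```python
def tally_head_votes(latest_votes, equivocating_indices):
    """Count latest head votes, skipping known equivocators entirely."""
    tally = {}
    for validator_index, head_root in latest_votes.items():
        if validator_index in equivocating_indices:
            continue  # an equivocator's vote carries no weight
        tally[head_root] = tally.get(head_root, 0) + 1
    return tally


# Hypothetical latest head votes, keyed by validator index.
latest_votes = {0: "A", 1: "B", 2: "B", 3: "A"}

before = tally_head_votes(latest_votes, equivocating_indices=set())
after = tally_head_votes(latest_votes, equivocating_indices={1, 2})
```

Before the equivocators are known, the two branches are tied at two votes each, which is the kind of balance the attack tries to maintain. Once validators 1 and 2 are excluded, branch B has no weight at all and the tie disappears.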

Several issues involving "unrealised justification" were discovered during the first half of 2022, arising from the November 2019 fixes to filter viable blocks. The first was an unrealised justification reorg attack that allowed the proposer of the first block of an epoch to easily fork out up to nine blocks from the end of the previous epoch. A variant of that attack was also found to be able to cause validators to make slashable attestations. The second was a justification withholding attack that an adversary could use to reorg arbitrary numbers of blocks at the start of an epoch. These issues were addressed in the Capella spec update with the "pull up tips" and unrealised justification logic that it introduced.

A reader might infer from this catalogue of issues that the fork choice is fiendishly difficult to reason about, and the reader would not be wrong. Some long-overdue formal verification work on the fork choice rule has recently been completed. It seeks to prove certain desirable properties, such as that an honest validator following the rules can never make slashable attestations.

We will study each of the issues above in more detail as we work through the fork choice specification in the following two sections:

Phase 0 fork choice is the main fork choice specification.
Bellatrix fork choice covers the changes to the fork choice around the Merge.

Note that the Capella upgrade included a substantial rewrite of the fork choice specification. The rewrite removed the bouncing attack fix and introduced the "pull up tips" defence against a new attack, among other things. The following sections are based on the updated Capella version, but the previous annotated fork choice remains available. All of these changes were quietly rolled out prior to Capella, buried within various client software updates, while the updated spec was held back until the Capella upgrade itself⁴. A public disclosure of the issues was made a few weeks after the Capella upgrade.

¹ I'm simplifying here. LMD GHOST can only consider descendants of the last justified checkpoint at any one time. But the last justified checkpoint can change. LMD GHOST will never consider branches from before the last finalised checkpoint. More on this later.
² Appendix C.1 of the Goldfish, "No More Attacks on Proof-of-Stake Ethereum?" paper is a useful overview of known weaknesses of Gasper consensus.
³ If I've understood correctly. Traces of IMD GHOST are difficult to find these days, and that's probably for the better.
⁴ When issues are found with the fork choice it is common for them to be "silently" fixed in client releases before being made public. Paradoxically, changes to the fork choice rule are not consensus breaking and do not usually require simultaneous activation in clients. Hard fork upgrades such as Capella ensure that all node operators have, of necessity, upgraded to the latest software versions, at which point it is safe to publish details of the problems and fixes. This is one good reason for keeping your software up to date, even between mandatory upgrades.