Part 3: Annotated Specification

Fork Choice

Phase 0 Fork Choice

This section covers the Phase 0 Fork Choice document. It is based on the Bellatrix (v1.2.0) spec release. A major rewrite of the fork choice rule is due to be released with the Capella upgrade; I will document it in a future version of this annotated specification.

For an alternative take, I recommend Vitalik's annotated fork choice document. I deliberately didn't consult that while preparing this, so that we gain the value of two independent expositions.

Block-quoted content below (with a sidebar) has been copied over verbatim from the specs repo, as well as all the function code.

The head block root associated with a store is defined as get_head(store). At genesis, let store = get_forkchoice_store(genesis_state, genesis_block) and update store by running:

  • on_tick(store, time) whenever time > store.time where time is the current Unix time
  • on_block(store, block) whenever a block block: SignedBeaconBlock is received
  • on_attestation(store, attestation) whenever an attestation attestation is received
  • on_attester_slashing(store, attester_slashing) whenever an attester slashing attester_slashing is received

Any of the above handlers that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid. Invalid calls to handlers must not modify store.

Updates to the Store arise only through the four handler functions: on_tick(), on_block(), on_attestation(), and on_attester_slashing(). These are the four senses through which the fork choice gains its knowledge of the world.

Notes:

  1. Leap seconds: Slots will last SECONDS_PER_SLOT + 1 or SECONDS_PER_SLOT - 1 seconds around leap seconds. This is automatically handled by UNIX time.

Leap seconds will no longer occur after 2035. We can remove this note after that.

  2. Honest clocks: Honest nodes are assumed to have clocks synchronized within SECONDS_PER_SLOT seconds of each other.

In practice, the synchrony assumptions are stronger than this. Any node whose clock is more than SECONDS_PER_SLOT / INTERVALS_PER_SLOT (four seconds) adrift will suffer degraded performance and can be considered Byzantine (faulty), at least for the LMD GHOST fork choice.

  3. Eth1 data: The large ETH1_FOLLOW_DISTANCE specified in the honest validator document should ensure that state.latest_eth1_data of the canonical beacon chain remains consistent with the canonical Ethereum proof-of-work chain. If not, emergency manual intervention will be required.

Post-Merge, consistency between the execution and consensus layers is no longer an issue, although we retain the ETH1_FOLLOW_DISTANCE for now.

  4. Manual forks: Manual forks may arbitrarily change the fork choice rule but are expected to be enacted at epoch transitions, with the fork details reflected in state.fork.

Manual forks are sometimes called hard forks or upgrades, and are planned in advance and coordinated. They are different from the inadvertent forks that the fork choice rule is designed to resolve.

  5. Implementation: The implementation found in this specification is constructed for ease of understanding rather than for optimization in computation, space, or any other resource. A number of optimized alternatives can be found here.

After reading the spec you may be puzzled by the "ease of understanding" claim. However, it is certainly true that several of the algorithms are far from efficient, and a great deal of optimisation is needed for practical implementations.

Constant

Name                 Value
INTERVALS_PER_SLOT   uint64(3)

Only blocks that arrive during the first 1 / INTERVALS_PER_SLOT of a slot's duration are eligible to have the proposer score boost added. This moment is the point in the slot at which validators are expected to publish attestations declaring their view of the head of the chain.

In the Ethereum consensus specification INTERVALS_PER_SLOT neatly divides SECONDS_PER_SLOT, and all time quantities are strictly uint64 numbers of seconds. However, other chains that run the same basic protocol as Ethereum might not have this property. For example, the Gnosis Beacon Chain has five-second slots. We changed Teku's internal clock from seconds to milliseconds to support this, which is technically off-spec, but nothing broke.
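
For concreteness, here is (in essence) how the timeliness test appears in the on_block() handler later in the spec: a block is a candidate for the proposer boost only if it arrives before the attestation deadline, one-third of the way through its slot.

    # The proposer boost portion of on_block(), in essence. With SECONDS_PER_SLOT = 12
    # and INTERVALS_PER_SLOT = 3 the cut-off is 4 seconds into the slot.
    time_into_slot = (store.time - store.genesis_time) % SECONDS_PER_SLOT
    is_before_attesting_interval = time_into_slot < SECONDS_PER_SLOT // INTERVALS_PER_SLOT
    if get_current_slot(store) == block.slot and is_before_attesting_interval:
        store.proposer_boost_root = hash_tree_root(block)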

Preset

Name                             Value        Unit    Duration
SAFE_SLOTS_TO_UPDATE_JUSTIFIED   2**3 (= 8)   slots   96 seconds

This is used in should_update_justified_checkpoint() to constrain when we may update the Store's justified checkpoint. The Store will only consent to switch to a conflicting justified checkpoint (one not descended from the current justified checkpoint) during the first SAFE_SLOTS_TO_UPDATE_JUSTIFIED slots of an epoch. The intention is to address a bouncing attack that was identified fairly early in the beacon chain's design and that could delay finality.

The recommended value of SAFE_SLOTS_TO_UPDATE_JUSTIFIED is less than SLOTS_PER_EPOCH / 3 slots. There seems to be no surviving rationale for why it was set to 8 rather than 10. I assume it's to do with our obsession with powers of two.

Note that this mechanism was removed in a more recent update to the fork choice due to it having limited effectiveness and simplicity being preferred.
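
The constraint itself is a simple guard at the top of should_update_justified_checkpoint(), which appears later in the spec; in essence it looks like this.

    # From should_update_justified_checkpoint(), in essence: a conflicting justified
    # checkpoint may be adopted only during the first SAFE_SLOTS_TO_UPDATE_JUSTIFIED
    # slots of an epoch; otherwise the update is deferred.
    if compute_slots_since_epoch_start(get_current_slot(store)) < SAFE_SLOTS_TO_UPDATE_JUSTIFIED:
        return True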

Configuration

Name                   Value
PROPOSER_SCORE_BOOST   uint64(40)
  • The proposer score boost is worth PROPOSER_SCORE_BOOST percentage of the committee's weight, i.e., for slot with committee weight committee_weight the boost weight is equal to (committee_weight * PROPOSER_SCORE_BOOST) // 100.

Proposer boost is a modification to the fork choice rule that defends against a so-called balancing attack. When a timely block proposal is received, proposer boost temporarily adds a huge weight to that block's branch in the fork choice calculation, namely PROPOSER_SCORE_BOOST percent of the total effective balances of all the validators assigned to attest in that slot.

The value of PROPOSER_SCORE_BOOST has changed over time as the balancing attack has been analysed more thoroughly.

The basic trade-off in choosing a value for PROPOSER_SCORE_BOOST is between allowing an adversary to perform "ex-ante" or "ex-post" reorgs. Setting PROPOSER_SCORE_BOOST too high makes it easier for an adversarial proposer to perform ex-post reorgs - it gives the proposer disproportionate power compared with the votes of validators. Setting PROPOSER_SCORE_BOOST too low makes it easier for an adversary to perform ex-ante reorgs. Caspar Schwarz-Schilling covers these trade-offs nicely in his Liscon talk, The game of reorgs in PoS Ethereum.1
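
To make the size of the boost concrete, here is a worked example with illustrative numbers (they are not taken from the spec or from mainnet).

    # Illustrative numbers only: 500,000 active validators, each with the maximum
    # 32 ETH effective balance (the spec works in Gwei; ETH is used here for brevity).
    total_active_balance = 500_000 * 32              # 16,000,000 ETH
    committee_weight = total_active_balance // 32    # SLOTS_PER_EPOCH = 32: 500,000 ETH per slot
    boost = committee_weight * 40 // 100             # PROPOSER_SCORE_BOOST = 40: 200,000 ETH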

Helpers

LatestMessage

@dataclass(eq=True, frozen=True)
class LatestMessage(object):
    epoch: Epoch
    root: Root

This is just a convenience class for tracking the most recent head vote from each validator - the "LM" (latest message) in LMD GHOST. Epoch is a uint64 type, and Root is a Bytes32 type. The Store holds a mapping of validator indices to their latest messages.
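
For illustration, the update logic (found in update_latest_messages(), later in the spec) is essentially the following: a validator's entry is replaced only when a head vote from a more recent epoch arrives.

    # Essence of update_latest_messages(), which on_attestation() applies for each
    # attesting validator i that has not equivocated: keep only the most recent
    # (by target epoch) head vote.
    target = attestation.data.target
    if i not in store.latest_messages or target.epoch > store.latest_messages[i].epoch:
        store.latest_messages[i] = LatestMessage(epoch=target.epoch, root=attestation.data.beacon_block_root)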

Store

@dataclass
class Store(object):
    time: uint64
    genesis_time: uint64
    justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint
    best_justified_checkpoint: Checkpoint
    proposer_boost_root: Root
    equivocating_indices: Set[ValidatorIndex]
    blocks: Dict[Root, BeaconBlock] = field(default_factory=dict)
    block_states: Dict[Root, BeaconState] = field(default_factory=dict)
    checkpoint_states: Dict[Checkpoint, BeaconState] = field(default_factory=dict)
    latest_messages: Dict[ValidatorIndex, LatestMessage] = field(default_factory=dict)

A node's Store records all the fork choice related information that it has about the outside world. It is the node's view of the network, in more classical terms. The Store is updated only by the four handler functions.

The basic fields are as follows.

  • time: The wall-clock time (Unix time) of the last call to the on_tick() handler. In theory this is updated continuously; in practice it is updated only two or three times per slot.
  • justified_checkpoint: Our node's view of the currently justified checkpoint.
  • finalized_checkpoint: Our node's view of the currently finalised checkpoint.
  • blocks: All the blocks that we know about that are descended from the finalized_checkpoint. The fork choice spec does not describe how to prune the Store, so we would end up with all blocks since genesis if we were to follow it precisely. However, only blocks descended from the last finalised checkpoint are ever considered in the fork choice, and the finalised checkpoint only increases in height. So it is safe for client implementations to remove from the Store all blocks (and their associated states) belonging to branches not descending from the last finalised checkpoint.
  • block_states: For every block in the Store, we also keep its corresponding (post-)state. These states are mostly used for information about justification and finalisation.
  • checkpoint_states: If there are empty slots immediately before a checkpoint then the checkpoint state will not correspond to a block state, so we store checkpoint states as well, indexed by Checkpoint rather than block root. The state at the last justified checkpoint is used for validator balances, and for validating attestations in the on_attester_slashing() handler.
  • latest_messages: The set of latest head votes from validators. When the on_attestation() handler processes a new head vote for a validator, it gets added to this set and the old vote is discarded.

The following fields were added at various times as new attacks and defences were found.

  • best_justified_checkpoint was added to the Store to defend against the FFG bouncing attack.
  • proposer_boost_root was added when proposer boost was implemented as a defence against the LMD balancing attack. It is set to the root of the current block for the duration of a slot, as long as that block arrived within the first third of a slot.
  • The equivocating_indices set was added to defend against the equivocation balancing attack. It contains the indices of any validators reported as having committed an attester slashing violation. These validators must be removed from consideration in the fork choice rule until the last justified checkpoint state catches up with the fact that the validators have been slashed.

For non-Pythonistas, Set and Dict are Python generic types. A Set is an unordered collection of objects; a Dict provides key–value look-up.
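
As noted in the description of the blocks field, the spec itself never prunes the Store. A client might do so along the following lines; this is only a naive sketch, not spec code (real implementations do this far more efficiently, and also prune checkpoint_states and latest_messages).

    # Naive pruning sketch (not part of the spec): keep only blocks that descend from
    # the finalised checkpoint, along with their associated states.
    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    descendants = set(
        root for root in store.blocks
        if get_ancestor(store, root, finalized_slot) == store.finalized_checkpoint.root
    )
    for root in list(store.blocks):
        if root not in descendants:
            del store.blocks[root]
            store.block_states.pop(root, None)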

get_forkchoice_store

The provided anchor-state will be regarded as a trusted state, to not roll back beyond. This should be the genesis state for a full client.

Note: With regards to fork choice, block headers are interchangeable with blocks. The spec is likely to move to headers for reduced overhead in test vectors and better encapsulation. Full implementations store blocks as part of their database and will often use full blocks when dealing with production fork choice.

def get_forkchoice_store(anchor_state: BeaconState, anchor_block: BeaconBlock) -> Store:
    assert anchor_block.state_root == hash_tree_root(anchor_state)
    anchor_root = hash_tree_root(anchor_block)
    anchor_epoch = get_current_epoch(anchor_state)
    justified_checkpoint = Checkpoint(epoch=anchor_epoch, root=anchor_root)
    finalized_checkpoint = Checkpoint(epoch=anchor_epoch, root=anchor_root)
    proposer_boost_root = Root()
    return Store(
        time=uint64(anchor_state.genesis_time + SECONDS_PER_SLOT * anchor_state.slot),
        genesis_time=anchor_state.genesis_time,
        justified_checkpoint=justified_checkpoint,
        finalized_checkpoint=finalized_checkpoint,
        best_justified_checkpoint=justified_checkpoint,
        proposer_boost_root=proposer_boost_root,
        equivocating_indices=set(),
        blocks={anchor_root: copy(anchor_block)},
        block_states={anchor_root: copy(anchor_state)},
        checkpoint_states={justified_checkpoint: copy(anchor_state)},
    )

get_forkchoice_store() initialises the fork choice Store object from an anchor state and its corresponding block (header). As noted, the anchor state could be the genesis state. Equally, when using a checkpoint sync, the anchor state will be the finalised checkpoint state provided by the node operator, which will be treated as if it is a genesis state. In either case, the latest_messages store will be empty to begin with.

get_slots_since_genesis

def get_slots_since_genesis(store: Store) -> int:
    return (store.time - store.genesis_time) // SECONDS_PER_SLOT

Self explanatory. This is one of only two places that store.time is used, the other being in the proposer boost logic in the on_block() handler.

Used by get_current_slot()

get_current_slot

def get_current_slot(store: Store) -> Slot:
    return Slot(GENESIS_SLOT + get_slots_since_genesis(store))

Self explanatory. GENESIS_SLOT is usually zero.

Used by should_update_justified_checkpoint(), validate_target_epoch_against_current_time(), validate_on_attestation(), on_tick(), on_block()
Uses get_slots_since_genesis()

compute_slots_since_epoch_start

def compute_slots_since_epoch_start(slot: Slot) -> int:
    return slot - compute_start_slot_at_epoch(compute_epoch_at_slot(slot))

Self explanatory. Used only for the bouncing attack defence.

Used by should_update_justified_checkpoint(), on_tick()
Uses compute_start_slot_at_epoch()

get_ancestor

def get_ancestor(store: Store, root: Root, slot: Slot) -> Root:
    block = store.blocks[root]
    if block.slot > slot:
        return get_ancestor(store, block.parent_root, slot)
    elif block.slot == slot:
        return root
    else:
        # root is older than queried slot, thus a skip slot. Return most recent root prior to slot
        return root

get_ancestor() recursively walks backwards through the chain. It starts with a given block with root hash root, and finds its ancestor block at slot slot, returning the ancestor's root hash. If the desired slot is empty (on this branch) then it returns the most recent root prior to slot on this branch. (When the block at slot slot has root root, then that block root is returned, so it should probably be called get_ancestor_or_self() or something.)

This function is sometimes used just to confirm that the block with root root is descended from a particular block at slot slot, and sometimes used actually to retrieve that ancestor block's root.
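
Both uses look something like the following (illustrative fragments, not spec code).

    # Retrieving an ancestor: the root of block_root's ancestor at the given slot
    # (or the most recent root before that slot, if the slot is empty on this branch).
    ancestor_root = get_ancestor(store, block_root, slot)

    # Confirming descent: block_root descends from the finalised checkpoint if and only
    # if its ancestor at the finalised checkpoint's slot is the finalised root. This is
    # essentially the check made by the on_block() handler.
    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    assert get_ancestor(store, block_root, finalized_slot) == store.finalized_checkpoint.root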

Used by get_latest_attesting_balance(), should_update_justified_checkpoint(), validate_on_attestation(), on_tick(), on_block()

get_latest_attesting_balance

This function is arguably misnamed, and in the latest spec versions has been renamed to get_weight().

def get_latest_attesting_balance(store: Store, root: Root) -> Gwei:
    state = store.checkpoint_states[store.justified_checkpoint]
    active_indices = get_active_validator_indices(state, get_current_epoch(state))
    attestation_score = Gwei(sum(
        state.validators[i].effective_balance for i in active_indices
        if (i in store.latest_messages
            and i not in store.equivocating_indices
            and get_ancestor(store, store.latest_messages[i].root, store.blocks[root].slot) == root)
    ))
    if store.proposer_boost_root == Root():
        # Return only attestation score if ``proposer_boost_root`` is not set
        return attestation_score

    # Calculate proposer score if ``proposer_boost_root`` is set
    proposer_score = Gwei(0)
    # Boost is applied if ``root`` is an ancestor of ``proposer_boost_root``
    if get_ancestor(store, store.proposer_boost_root, store.blocks[root].slot) == root:
        num_validators = len(get_active_validator_indices(state, get_current_epoch(state)))
        avg_balance = get_total_active_balance(state) // num_validators
        committee_size = num_validators // SLOTS_PER_EPOCH
        committee_weight = committee_size * avg_balance
        proposer_score = (committee_weight * PROPOSER_SCORE_BOOST) // 100
    return attestation_score + proposer_score

Here we find the essence of the GHOST2 protocol: the weight of a block is the sum of the votes for that block, plus the votes for all of its descendant blocks. We include votes for descendants when calculating a block's weight because a vote for a block is an implicit vote for all of that block's ancestors as well - if a particular block gets included on chain, all its ancestors must also be included. To put it another way, we treat validators as voting for entire branches rather than just their leaves.

Ignoring the proposer boost part for the time being, the main calculation being performed is as follows.

    state = store.checkpoint_states[store.justified_checkpoint]
    active_indices = get_active_validator_indices(state, get_current_epoch(state))
    attestation_score = Gwei(sum(
        state.validators[i].effective_balance for i in active_indices
        if (i in store.latest_messages
            and i not in store.equivocating_indices
            and get_ancestor(store, store.latest_messages[i].root, store.blocks[root].slot) == root)
    ))

Given a block root, root, this adds up all the votes for blocks that are descended from that block. More precisely, it calculates the sum of the effective balances of all validators whose latest head vote was for a descendant of root or for root itself. It's the fact that we base our weight calculations only on each validator's latest vote that makes this "LMD" (latest message driven) GHOST.

Diagram of a block tree with weights and latest attesting balances shown for each block

$B_N$ is the sum of the effective balances of the validators whose most recent head vote was for block $N$, and $W_N$ is the weight of the branch starting at block $N$.

Some obvious relationships apply between the weights, $W_x$, of blocks, and $B_x$, the latest attesting balances of blocks.

  • For a leaf block $N$ (a block with no children), $W_N = B_N$.
  • The weight of a block is its own latest attesting balance plus the sum of the weights of its direct children. So, in the diagram, $W_1 = B_1 + W_2 + W_3$.

These relationships can be used to avoid repeating lots of work by memoising the results.
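
For example, an implementation might compute every block's weight in a single pass over the (pruned) block tree, working from the highest slots downwards so that each child's weight is final before it is added to its parent. This is only a sketch of the idea, not spec code; balances here is a hypothetical mapping from block root to $B_N$, the block's own latest attesting balance.

    # Bottom-up weight calculation sketch: W_N = B_N + sum of the weights of N's children.
    weights: Dict[Root, Gwei] = {root: balances[root] for root in store.blocks}
    # Visit blocks from highest slot to lowest, so children are processed before parents.
    for root in sorted(store.blocks, key=lambda r: store.blocks[r].slot, reverse=True):
        parent_root = store.blocks[root].parent_root
        if parent_root in weights:
            weights[parent_root] += weights[root]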

Proposer boost

In September 2020, shortly before mainnet genesis, a theoretical "balancing attack" on the LMD GHOST consensus mechanism was published, with an accompanying Ethresear.ch post.

The balancing attack allows a very small number of validators controlled by an adversary to perpetually maintain a forked network, with half of all validators following one fork and half the other. This would delay finalisation indefinitely, which is a kind of liveness failure. Since the attack relies on some unrealistic assumptions about the power an adversary has over the network – namely, fine-grained control over who can see what and when – we felt that the potential attack was not a significant threat to the launch of the beacon chain. Later refinements to the attack appear to have made it more practical to execute, however.

A modification to the fork choice to mitigate the balancing attack was first suggested by Vitalik. This became known as proposer boost, and a version of it was adopted into the consensus layer specification in late 2021 with the various client teams releasing versions with mainnet support for proposer boost in April and May 2022.

Changes to the fork choice can be made outside major protocol upgrades; it is not strictly necessary for all client implementations to make the change simultaneously, as they must for hard-fork upgrades. Given this, mainnet client releases supporting proposer boost were made at various times in April and May 2022, and users were not forced to upgrade on a fixed schedule. Unfortunately, having a mix of nodes on the network, around half applying proposer boost and half not, led to a seven block reorganisation of the beacon chain on May 25, 2022. As a result, subsequent updates to the fork choice have tended to be more tightly coordinated between client teams.

Proposer boost details

Proposer boost modifies our nice, intuitive calculation of a branch's weight, based only on latest votes, by adding additional weight to a block that was received on time in the current slot. In this way, it introduces a kind of synchrony weighting. Vitalik calls this "an explicit 'synchronization bottleneck' gadget". In short, it treats a timely block as being a vote with a massive weight that is temporarily added to the branch that it is extending.

The simple intuition behind proposer boost is summarised by Barnabé Monnot as, "a block that is timely shouldn’t expect to be re-orged". In respect of the balancing attack, proposer boost is designed to overwhelm the votes from validators controlled by the adversary and instead allow the proposer of the timely block to choose the fork that will win. Quoting Francesco D'Amato, "the general strategy is to empower honest proposers to impose their view of the fork-choice, but without giving them too much power and making committees irrelevant".

Note that the proposer boost calculation in this spec version is over-complicated, possibly due to concerns about integer overflows. The calculation has been simplified to the following in more recent spec versions.

    if store.proposer_boost_root == Root():
        # Return only attestation score if ``proposer_boost_root`` is not set
        return attestation_score
    # Calculate proposer score if ``proposer_boost_root`` is set
    proposer_score = Gwei(0)
    # Boost is applied if ``root`` is an ancestor of ``proposer_boost_root``
    if get_ancestor(store, store.proposer_boost_root, store.blocks[root].slot) == root:
        committee_weight = get_total_active_balance(state) // SLOTS_PER_EPOCH
        proposer_score = (committee_weight * PROPOSER_SCORE_BOOST) // 100
    return attestation_score + proposer_score

The default setting for store.proposer_boost_root is Root(). That is, the "empty" or "null" default SSZ root value, with all bytes set to zero. Whenever a block is received during the first 1 / INTERVALS_PER_SLOT portion of a slot – that is, when the block is timely – store.proposer_boost_root is set to the hash tree root of that block by the on_block() handler. At the end of each slot it is reset to Root() by the on_tick() handler. Thus, proposer boost has an effect on the fork choice calculation from the point at which a timely block is received until the end of that slot, where "timely" on Ethereum's beacon chain means "within the first four seconds".

Proposer boost causes entire branches to be favoured when the block at their tip is timely. When proposer boost is in effect, and the timely block in the current slot (whose root is store.proposer_boost_root) is descended from the block we are calculating the weight for, then that block's weight is also increased, since the calculation includes the weights of all its descendants. In this way, proposer boost weighting propagates to the boosted block's ancestors in the same way as vote weights do.

The weight that proposer boost adds to the block's branch is PROPOSER_SCORE_BOOST percent of the total effective balance of all the validators due to attest at that slot. Strictly, it is an approximation to that balance, derived by dividing the total effective balance of all validators by the number of slots per epoch.

The value of PROPOSER_SCORE_BOOST has changed over time before settling at its current 40%. See the description there for the history, and links to how the current value was calculated.

Proposer boost and late blocks

A side-effect of proposer boost is that it enables clients to reliably re-org out (orphan) blocks that were published late. Instead of building on a late block, the proposer can choose to build on the late block's parent.

A block proposer is supposed to publish its block at the start of the slot, so that it has time to be received and attested to by the whole committee within the first four seconds. However, post-merge, it can be profitable to delay block proposals by several seconds in order to collect more transaction income and better extractable value opportunities. Although blocks published five or six seconds into a slot will not gain many votes, they are still likely to remain canonical under the basic consensus spec. As long as the next block proposer receives the late block by the end of the slot, it will usually build on it as the best available head.3 This is undesirable as it punishes the vast majority of honest validators, that (correctly) voted for an empty slot, by depriving them of their reward for correct head votes, and possibly even penalising them for incorrect target votes at the start of an epoch.

Without proposer boost, it is a losing strategy for the next proposer not to build on a block that it received late. Although the late block may have few votes, it has more votes than your block initially, so validators will still attest to the late block as the head of the chain, keeping it canonical and orphaning the alternative block that you built on its parent.

With proposer boost, as long as the late block has fewer votes than the proposer boost percentage, the honest proposer can be confident that its alternative block will win the fork choice for long enough that the next proposer will build on that rather than on the late block it skipped.
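
A sketch of the proposer's reasoning follows, with hypothetical names (late_block_root and committee_weight are assumptions here, not spec values): if the late block's accumulated vote weight is comfortably below the boost weight, the alternative block built on its parent will win the fork choice during the next slot.

    # Illustrative only; not spec code and not the Lighthouse implementation.
    late_block_weight = get_latest_attesting_balance(store, late_block_root)
    boost_weight = committee_weight * PROPOSER_SCORE_BOOST // 100
    if late_block_weight < boost_weight:
        # Orphan the late block: build on its parent instead.
        parent_root = store.blocks[late_block_root].parent_root
    else:
        # Too risky: build on the late block as usual.
        parent_root = late_block_root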

Diagram showing a proposer choosing whether to build on a late block or its parent

Block $B$ was published late, well after the 4 second attestation cut-off time. However, it still managed to acquire a few attestations (say, 10% of the committee) due to dishonest or misconfigured validators. Should the next proposer build $C_1$ on top of the late block, or $C_2$ on top of its parent?

Diagram showing that without proposer score boosting a proposer should build on the late block

Without proposer boost, it only makes sense to build $C_1$ on top of the late block $B$. Since $B$ has some weight, albeit small, the top branch will win the fork choice (if the network is behaving synchronously at the time). Block $C_2$ would be orphaned.

Diagram showing that with proposer score boosting a proposer may build on the late block's parent

With proposer boost, the proposer of $C$ can safely publish either $C_1$ or $C_2$. Due to the proposer score boost of 40%, it is safe to publish block $C_2$ that orphans $B$, since the lower branch will have greater weight during the slot.

An implementation of this strategy in the Lighthouse client seems to have been effective in reducing the number of late blocks on the network. Publishing of late blocks is strongly disincentivised when they are likely to be orphaned. It may be adopted as standard behaviour in the consensus specs at some point, but remains optional for the time being. Several safeguards are present in order to avoid liveness failures.

Note that proposer boost does not in general allow validators to re-org out timely blocks (that is, to perform an ex-post reorg). A timely block ought to gain enough votes from the committees that it will always remain canonical.

Alternatives to proposer boost

Proposer boost is not a perfect solution to balancing attacks or ex-ante reorgs. It makes ex-post reorgs easier to accomplish; it does not scale with participation, meaning that if only 40% of validators are online, then proposers can reorg at will; it can fail when an attacker controls several consecutive slots over which to store up votes.

Some changes to, or replacements for, LMD GHOST have been suggested that do not require proposer score boosting.

View-merge4 is a mechanism in which attesters freeze their fork choice some time $\Delta$ before the end of a slot. The next proposer does not freeze its fork choice, however. The assumed maximum network delay is $\Delta$, so the proposer will see all votes in time, and it will circulate a summary of them to all validators, contained within its block. This allows the whole network to synchronise on a common view. Balancing attacks rely on giving two halves of the network different views, and would be prevented by view-merge.

The Goldfish protocol, described in the paper No More Attacks on Proof-of-Stake Ethereum?, builds on view-merge (called "message buffering" there) and adds vote expiry so that head block votes expire almost immediately (hence the name - rightly or wrongly, goldfish are famed for their short memories). The resulting protocol is provably reorg resilient and supports fast confirmations.

Both view-merge and Goldfish come with nice proofs of their properties under synchronous conditions, which improve on Gasper under the same conditions. However, they may not fare so well under more realistic asynchronous conditions. The original view-merge article says of latency greater than 2 seconds, "This is bad". One of the authors of the Goldfish paper has said that Goldfish "is extremely brittle to asynchrony, allowing for catastrophic failures such as arbitrarily long reorgs"5, and elsewhere, "even a single slot of asynchrony can lead to a catastrophic failure, jeopardizing the safety of any previously confirmed block". At least with proposer boost, we know that it only degrades to normal Gasper under conditions of high latency.

Francesco D'Amato argues in Reorg resilience and security in post-SSF LMD-GHOST that the real origin of the reorg issues with LMD GHOST is our current committee-based voting: "The crux of the issue is that honest majority of the committee of a slot does not equal a majority of the eligible fork-choice weight", since an adversary is able to influence the fork choice with votes from other slots. The ultimate cure for this would be single slot finality (SSF), in which all validators vote at every slot. SSF is a long way from being practical today, but a candidate for its fork choice is RLMD-GHOST (Recent Latest Message Driven GHOST), which expires votes after a configurable time period.

Used by get_head()
Uses get_active_validator_indices(), get_ancestor(), get_total_active_balance()
See also on_tick(), on_block(), PROPOSER_SCORE_BOOST

filter_block_tree

def filter_block_tree(store: Store, block_root: Root, blocks: Dict[Root, BeaconBlock]) -> bool:
    block = store.blocks[block_root]
    children = [
        root for root in store.blocks.keys()
        if store.blocks[root].parent_root == block_root
    ]

    # If any children branches contain expected finalized/justified checkpoints,
    # add to filtered block-tree and signal viability to parent.
    if any(children):
        filter_block_tree_result = [filter_block_tree(store, child, blocks) for child in children]
        if any(filter_block_tree_result):
            blocks[block_root] = block
            return True
        return False

    # If leaf block, check finalized/justified checkpoints as matching latest.
    head_state = store.block_states[block_root]

    correct_justified = (
        store.justified_checkpoint.epoch == GENESIS_EPOCH
        or head_state.current_justified_checkpoint == store.justified_checkpoint
    )
    correct_finalized = (
        store.finalized_checkpoint.epoch == GENESIS_EPOCH
        or head_state.finalized_checkpoint == store.finalized_checkpoint
    )
    # If expected finalized/justified, add to viable block-tree and signal viability to parent.
    if correct_justified and correct_finalized:
        blocks[block_root] = block
        return True

    # Otherwise, branch not viable
    return False

The filter_block_tree() function finds the children of the given block and recursively visits them, so it explores the whole tree rooted at block_root. Since blockchains are singly linked, the only way to find the children is to search through every block in the Store for those that have parent block_root. This is one reason for keeping the Store well pruned.
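
To avoid repeatedly scanning the whole Store, an implementation might instead build a reverse index from parent root to child roots once per fork choice run, along the lines of the following sketch (not spec code).

    # Sketch (not part of the spec): a parent -> children index makes child lookups O(1)
    # instead of a scan over every block in the Store.
    children_index: Dict[Root, List[Root]] = {}
    for root, block in store.blocks.items():
        children_index.setdefault(block.parent_root, []).append(root)
    # The children of block_root are then simply:
    children = children_index.get(block_root, [])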

Child blocks that are on branches terminating in a viable leaf block are inserted into the blocks dictionary structure. Note that blocks is mutated during execution and functions as a return value from filter_block_tree(), alongside the actual boolean return value.

"Viable" blocks are ones that have a post-state that agrees with my Store about the justified and finalised checkpoints.

Why prune unviable branches?

This function ensures that the Casper FFG fork choice rule, "follow the chain containing the justified checkpoint of the greatest height", is applied to the block tree before the LMD GHOST fork choice is evaluated.

Early versions of the spec considered the tip of any branch descended from the Store's justified checkpoint as a potential head block. However, a scenario was identified in which this could result in a deadlock, in which finality would not be able to advance without validators getting themselves slashed - a kind of liveness failure6.

The filter_block_tree() function was added as a fix for this issue. Given a Store and a block root, filter_block_tree() returns a list of all the blocks that we know about in the tree descending from the given block, having pruned out any branches that terminate in a leaf block that is not viable. As above, a "viable" block is one with a post-state in which the justified and finalised checkpoints match the justified and finalised checkpoints in my Store.

To illustrate the problem, consider the situation shown in the following diagrams, based on the original description of the issue. The context is that there is an adversary controlling 18% of validators that takes advantage of (or causes) a temporary network partition. We will illustrate the issue mostly in terms of checkpoints, and omit the intermediate blocks that carry the attestations - mentally insert these as necessary.

We begin with a justified checkpoint $A$ that all nodes agree on.

Due to the network partition, only 49% of validators, plus the adversary's 18%, see checkpoint $B$. They all make Casper FFG votes $[A \rightarrow B]$, thereby justifying $B$. A further checkpoint $C_1$ is produced on this branch, and the 49% that are honest validators dutifully make the Casper FFG vote $[B \rightarrow C_1]$, but the adversary does not, meaning that $C_1$ is not justified. Validators on this branch see $h_1$ as the head block, and have a highest justified checkpoint of $B$.

A diagram illustrating the first step in a liveness attack on the unfiltered chain, making the first branch

The large blocks represent checkpoints. After checkpoint $A$ there is a network partition: 49% of validators plus the adversary see checkpoints $B$ and $C_1$. Casper votes are shown by the dashed arrows. The adversary votes for $B$, but not for $C_1$.

The remaining 33% of validators do not see checkpoint $B$, but see $C_2$ instead and make Casper FFG votes $[A \rightarrow C_2]$ for it. But this is not enough votes to justify $C_2$. Checkpoint $D_2$ is produced on top of $C_2$, and a further block $h_2$. On this branch, $h_2$ is the head of the chain according to LMD GHOST, and $A$ remains the highest justified checkpoint.

A diagram illustrating the second step in a liveness attack on the unfiltered chain, making the second branch

Meanwhile, the remaining 33% of validators do not see the branch starting at $B$, but start a new branch containing $C_2$ and its descendants. They do not have enough collective weight to justify any of the checkpoints.

Now for the cunning part. The adversary switches its LMD GHOST vote (and implicitly its Casper FFG vote, although that does not matter for this exercise) from the first branch to the second branch, and lets the validators in the first branch see the blocks and votes on the second branch.

Block $h_2$ now has votes from the majority of validators – 33% plus the adversary's 18% – so all honest validators should make it their head block.

However, the justified checkpoint on the $h_2$ branch remains at $A$. This means that the 49% of validators who made the Casper FFG vote $[B \rightarrow C_1]$ cannot switch their chain head from $h_1$ to $h_2$ without committing a Casper FFG surround vote, and thereby getting slashed. Switching branch would cause their highest justified checkpoint to go backwards. Since they have previously voted $[B \rightarrow C_1]$, they cannot now vote $[A \rightarrow X]$ where $X$ has a height greater than $C_1$, which they must do if they were to switch to the $h_2$ branch.

A diagram illustrating the third step in a liveness attack on the unfiltered chain, changing the chain head

The adversary switches to the second branch, giving $h_2$ the majority LMD GHOST vote. This deadlocks finalisation: the 49% who made Casper FFG vote $[B \rightarrow C_1]$ cannot switch to $h_2$ without being slashed.

In conclusion, the chain can no longer finalise (by creating higher justified checkpoints) without a substantial proportion of validators (at least 16%) being willing to get themselves slashed.

It should never be possible for the chain to get into a situation in which honest validators, following the rules of the protocol, end up in danger of being slashed. The situation here arises due to a conflict between the Casper FFG fork choice (follow the chain containing the justified checkpoint of the greatest height) and the LMD GHOST fork choice (which, in this instance, ignores that rule). It is a symptom of the clunky way in which the two have been bolted together.

The chosen fix for all this is to filter the block tree before applying the LMD GHOST fork choice, so as to remove all "unviable" branches from consideration. That is, all branches whose head block's state does not agree with me about the current state of justification and finalisation.

A diagram showing that filter block tree prunes out the conflicting branch for validators following the first branch

When validators that followed branch 1 apply filter_block_tree(), branch 2 is pruned out (as indicated by the dashed lines). This is because their Store has $B$ as the best justified checkpoint, while branch 2's leaf block has a state with $A$ as the justified checkpoint. For these validators $h_2$ is no longer a candidate head block.

With this fix, the chain will recover the ability to finalise when the validators on the second branch eventually become aware of the first branch. On seeing $h_1$ and its ancestors, they will update their Stores' justified checkpoints to $B$ and mark the $h_2$ branch unviable.

Used by get_filtered_block_tree(), filter_block_tree()
Uses filter_block_tree()

get_filtered_block_tree

def get_filtered_block_tree(store: Store) -> Dict[Root, BeaconBlock]:
    """
    Retrieve a filtered block tree from ``store``, only returning branches
    whose leaf state's justified/finalized info agrees with that in ``store``.
    """
    base = store.justified_checkpoint.root
    blocks: Dict[Root, BeaconBlock] = {}
    filter_block_tree(store, base, blocks)
    return blocks

A convenience wrapper that passes the Store's justified checkpoint to filter_block_tree(). On returning, the blocks dictionary structure will contain all viable branches rooted at that checkpoint, and nothing that does not descend from that checkpoint. For the meaning of "viable", see filter_block_tree().

Used by get_head()
Uses filter_block_tree()

get_head

def get_head(store: Store) -> Root:
    # Get filtered block tree that only includes viable branches
    blocks = get_filtered_block_tree(store)
    # Execute the LMD-GHOST fork choice
    head = store.justified_checkpoint.root
    while True:
        children = [
            root for root in blocks.keys()
            if blocks[root].parent_root == head
        ]
        if len(children) == 0:
            return head
        # Sort by latest attesting balance with ties broken lexicographically
        # Ties broken by favoring block with lexicographically higher root
        head = max(children, key=lambda root: (get_latest_attesting_balance(store, root), root))

get_head() encapsulates the fork choice rule: given a Store it returns a head block.

The fork choice rule is objective in that, given the same Store, it will always return the same head block. But the overall process is subjective in that each node on the network will tend to have a different view, that is, a different Store, due to delays in receiving attestations or blocks, or having seen different sets of attestations or blocks because of network asynchrony or an attack.

Looking first at the while True loop, this implements LMD GHOST in its purest form. Starting from a given block (which would be the genesis block in unmodified LMD GHOST), we find the weights of the children of that block. We choose the child block with the largest weight and repeat the process until we end up at a leaf block (the tip of a branch). That is, we Greedily take the Heaviest Observed Sub-Tree, GHOST. Any tie between two child blocks with the same weight is broken by comparing their block hashes, so we end up at a unique leaf block - the head that we return.

Diagram of a block tree showing the weight of each block