One Page Annotated Spec

Note: This page is automatically generated from the chapters in Part 3. You may find that some internal links are broken.

Introduction

The beacon chain specification is the guts of the machine. Like the guts of a computer, all the components are showing and the wires are hanging out: everything is on display. In the course of the next sections I will be dissecting the entire core beacon chain specification line by line. My aim is not only to explain how things work, but also to give some historical context: some of the reasoning behind how we ended up where we are today.

Early versions of the specs were written with much more narrative and explanation than today's. Over time, they were coded up in Python for better precision and the benefits of being executable. However, in that process, most of the explanation and intuition was removed. Vitalik has created his own annotated specification that covers many of the key insights. It's hard to compete with Vitalik, but my intention here is to go one level deeper in thoroughness and detail. And perhaps to give an independent perspective.

As and when other parts of the book get written I will add links to the specific chapters on each topic (for example on Simple Serialize, consensus, networking).

Note that the online annotated specification is available in two forms:

  • divided into chapters in Part 3 of the main book, and
  • as a standalone single page that's useful for searching.

The contents of each are identical.

Version information

This edition of Upgrading Ethereum is based on the Capella version of the beacon chain specification, and corresponds to Release v1.3.0, made on the 18th of April, 2023.

There is no single specification document that covers Capella. Rather, we have the Phase 0 specification, the Altair specification changes, the Bellatrix specification changes, and the Capella specification changes. Each builds on top of the previous version in a kind of text-based diff. In addition, these documents are not stable between upgrades. For example, the Phase 0 specs were updated as part of the Capella release. This can all be rather confusing and difficult to track.

To make the whole thing easier to follow in this chapter, I have consolidated all of the specifications to date, (mostly) omitting parts that have been superseded. In general, I have tried to reflect the existing structure of the documents to make them easier to read side-by-side with the original specs. However, I have incorporated the separate BLS document into the flow of this one.

See also

In addition to the spec documents referenced above, a few other current and historical documents exist.

Hsiao-Wei Wang gave a Lightning Talk on the consensus Pyspec at Devcon VI that briefly describes its structure and how it can be executed.

Types, Constants, Presets, and Configuration

Preamble

For some, a chapter on constants, presets and parameters will seem drier than the Namib Desert, but I've long found these to be a rich and fertile way in to the ideas and mechanisms we'll be unpacking in detail in later chapters. Far from being a desert, this part of the spec bustles with life.

The foundation is laid with a set of custom data types. The beacon chain specification is executable in Python; the data types defined at the top of the spec represent the fundamental quantities that will reappear frequently.

Then – with constants, presets, and parameters – we will examine the numbers that define and constrain the behaviour of the chain. Each of these quantities tells a story. Each parameter encapsulates an insight, or a mechanism, or a compromise. Why is it here? How has it changed over time? Where does its value come from?

Custom Types

The specification defines the following Python custom types, "for type hinting and readability": the data types defined here appear frequently throughout the spec; they are the building blocks for everything else.

Each type has a name, an "SSZ equivalent", and a description. SSZ is the encoding method used to pass data between clients, among other things. Here it can be thought of as just a primitive data type.

Throughout the spec, (almost) all integers are unsigned 64-bit numbers, uint64, but this hasn't always been the case.

Regarding "unsigned", there was much discussion around whether Eth2 should use signed or unsigned integers, and eventually unsigned was chosen. As a result, it is critical to preserve the order of operations in some places to avoid inadvertently causing underflows since negative numbers are forbidden.

And regarding "64-bit", early versions of the spec used other bit lengths than 64 (a "premature optimisation"), but arithmetic integers are now standardised at 64 bits throughout the spec, the only exception being ParticipationFlags, introduced in the Altair upgrade, which has type uint8, and is really a byte type.

Name SSZ equivalent Description
Slot uint64 a slot number
Epoch uint64 an epoch number
CommitteeIndex uint64 a committee index at a slot
ValidatorIndex uint64 a validator registry index
Gwei uint64 an amount in Gwei
Root Bytes32 a Merkle root
Hash32 Bytes32 a 256-bit hash
Version Bytes4 a fork version number
DomainType Bytes4 a domain type
ForkDigest Bytes4 a digest of the current fork data
Domain Bytes32 a signature domain
BLSPubkey Bytes48 a BLS12-381 public key
BLSSignature Bytes96 a BLS12-381 signature
ParticipationFlags uint8 a succinct representation of 8 boolean participation flags
Transaction ByteList[MAX_BYTES_PER_TRANSACTION] either a typed transaction envelope or a legacy transaction
ExecutionAddress Bytes20 Address of account on the execution layer
WithdrawalIndex uint64 an index of a Withdrawal

Slot

Time is divided into fixed length slots. Within each slot, exactly one validator is randomly selected to propose a beacon chain block. The progress of slots is the fundamental heartbeat of the beacon chain.

Epoch

Sequences of slots are combined into fixed-length epochs.

Epoch boundaries are the points at which the chain can be justified and finalised (by the Casper FFG mechanism). They are also the points at which validator balances are updated, validator committees get shuffled, and validator exits, entries, and slashings are processed. That is, the main state-transition work is performed per epoch, not per slot.

Epochs have always felt like a slightly uncomfortable overlay on top of the slot-by-slot progress of the beacon chain, but necessitated by Casper FFG finality. There have been proposals to move away from epochs, and there are possible future developments that could allow us to do away with epochs entirely. But, for the time being, they remain.

Fun fact: Epochs were originally called Cycles.

CommitteeIndex

Validators are organised into committees that collectively vote (make attestations) on blocks. Each committee is active at exactly one slot per epoch, but several committees are active at each slot. The CommitteeIndex type is an index into the list of committees active at a slot.

The beacon chain's committee-based design is a large part of what makes it practical to implement while maintaining security. If all validators were active all the time, there would be an overwhelming number of messages to deal with. The random shuffling of validators into committees makes them very hard for an attacker to subvert without a supermajority of stake.

ValidatorIndex

Each validator making a successful deposit is consecutively assigned a unique validator index number that is permanent, remaining even after the validator exits. It is permanent because the validator's balance is associated with its index, so the data needs to be preserved when the validator exits, at least until the balance is withdrawn at an unknown future time.

Gwei

All Ether amounts on the consensus layer are specified in units of Gwei (10^9 Wei, 10^-9 Ether). This is basically a hack to avoid having to use integers wider than 64 bits to store validator balances and to perform calculations on them, since 2^64 Wei is only about 18 Ether. Even so, in some places care needs to be taken to avoid arithmetic overflow when dealing with Ether calculations.
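
The arithmetic behind this choice is easy to check. A minimal sketch in plain Python (nothing here is from the spec itself):

```python
# 2^64 Wei is under 19 Ether, far too small for a 32 Ether balance,
# whereas 2^64 Gwei is about 18 billion Ether.
WEI_PER_ETHER = 10**18
GWEI_PER_ETHER = 10**9

print(2**64 / WEI_PER_ETHER)     # ~18.45 Ether
print(2**64 / GWEI_PER_ETHER)    # ~1.8e10 Ether
print(32 * GWEI_PER_ETHER)       # a 32 Ether balance fits comfortably in a uint64
```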

Root

Merkle roots are ubiquitous in the Eth2 protocol. They are a very succinct and tamper-proof way of representing a lot of data, an example of a cryptographic accumulator. Blocks are summarised by their Merkle roots; state is summarised by its Merkle root; the list of Eth1 deposits is summarised by its Merkle root; the digital signature of a message is calculated from the Merkle root of the data structure contained within the message.

Hash32

Merkle roots are constructed with cryptographic hash functions. In the spec, a Hash32 type is used to represent Eth1 block roots (which are also Merkle roots).

I don't know why only the Eth1 block hash has been awarded the Hash32 type: other hashes in the spec remain Bytes32. In early versions of the spec Hash32 was used for all cryptographic hash quantities, but this was changed to Bytes32.

Anyway, it's worth taking a moment in appreciation of the humble cryptographic hash function. The hash function is arguably the single most important algorithmic innovation underpinning blockchain technology, and in fact most of our online lives. Easily taken for granted, but utterly critical in enabling our modern world.

Version

Unlike Ethereum 1, the beacon chain has an in-protocol concept of a version number. It is expected that the protocol will be updated/upgraded from time to time, a process commonly known as a "hard-fork". For example, the upgrade from Phase 0 to Altair took place on the 27th of October 2021, and was assigned its own fork version. Similarly, the upgrade from Altair to Bellatrix was assigned a different fork version.

Version is used when computing the ForkDigest.

DomainType

DomainType is just a cryptographic nicety: messages intended for different purposes are tagged with different domains before being hashed and possibly signed. It's a kind of name-spacing to avoid clashes; probably unnecessary, but considered a best-practice. Eleven domain types are defined in Capella.

ForkDigest

ForkDigest is the unique chain identifier, generated by combining information gathered at genesis with the current chain Version identifier.

The ForkDigest serves two purposes.

  1. Within the consensus protocol to prevent, for example, attestations from validators on one fork (that maybe haven't upgraded yet) being counted on a different fork.
  2. Within the networking protocol to help to distinguish between useful peers that are on the same chain, and useless peers that are on a different chain. This usage is described in the Ethereum 2.0 networking specification, where ForkDigest appears frequently.

Specifically, ForkDigest is the first four bytes of the hash tree root of the ForkData object containing the current chain Version and the genesis_validators_root which was created at beacon chain initialisation. It is computed in compute_fork_digest().
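
As a rough sketch of the computation (not the pyspec itself): for the two-field ForkData container the SSZ hash tree root reduces to a single SHA-256 over its two 32-byte leaves, so the whole thing can be illustrated with the standard library alone. The input values below are made up for illustration.

```python
from hashlib import sha256

def compute_fork_digest(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    # ForkData has two fields: the Bytes4 version (right-padded to a 32-byte leaf)
    # and the Bytes32 genesis validators root. Its hash tree root is the SHA-256
    # of the two leaves concatenated.
    fork_data_root = sha256(current_version.ljust(32, b'\x00') + genesis_validators_root).digest()
    return fork_data_root[:4]  # the ForkDigest is the first four bytes

digest = compute_fork_digest(b'\x03\x00\x00\x00', b'\xaa' * 32)  # illustrative inputs
```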

Domain

Domain is used when verifying protocol messages signed by validators. To be valid, a message must have been combined with both the correct domain and the correct fork version. It is calculated as the concatenation of the four-byte DomainType and the first 28 bytes of the fork data root.
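
A companion sketch of that concatenation, under the same assumptions as the ForkDigest example above (illustrative inputs, SHA-256 standing in for the SSZ hash tree root of ForkData):

```python
from hashlib import sha256

def compute_domain(domain_type: bytes, fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    fork_data_root = sha256(fork_version.ljust(32, b'\x00') + genesis_validators_root).digest()
    # Domain = four-byte domain type || first 28 bytes of the fork data root.
    return domain_type + fork_data_root[:28]

domain = compute_domain(b'\x01\x00\x00\x00',  # DOMAIN_BEACON_ATTESTER
                        b'\x03\x00\x00\x00',  # an illustrative fork version
                        b'\x00' * 32)         # an illustrative genesis validators root
assert len(domain) == 32
```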

BLSPubkey

BLS (Boneh-Lynn-Shacham) is the digital signature scheme used by Eth2. It has some very nice properties, in particular the ability to aggregate signatures. This means that many validators can sign the same message (for example, that they support block X), and these signatures can all be efficiently aggregated into a single signature for verification. The ability to do this efficiently makes Eth2 practical as a protocol. Several other protocols have adopted or will adopt BLS, such as Zcash, Chia, Dfinity and Algorand. We are using the BLS signature scheme based on the BLS12-381 (Barreto-Lynn-Scott) elliptic curve.

The BLSPubkey type holds a validator's public key, or the aggregation of several validators' public keys. This is used to verify messages that are claimed to have come from that validator or group of validators.

In Ethereum 2.0, BLS public keys are elliptic curve points from the BLS12-381 G1 group, thus are 48 bytes long when compressed.

See the section on BLS signatures in part 2 for a more in-depth look at these things.

BLSSignature

As above, we are using BLS signatures over the BLS12-381 elliptic curve in order to sign messages between participants. As with all digital signature schemes, this guarantees both the identity of the sender and the integrity of the contents of any message.

In Ethereum 2.0, BLS signatures are elliptic curve points from the BLS12-381 G2 group, thus are 96 bytes long when compressed.

ParticipationFlags

The ParticipationFlags type was introduced in the Altair upgrade as part of the accounting reforms.

Prior to Altair, all attestations seen in blocks were stored in state for two epochs. At the end of an epoch, finality calculations, and reward and penalty calculations for each active validator, would be done by processing all the attestations for the previous epoch as a batch. This created a spike in processing at epoch boundaries, and led to a noticeable increase in late blocks and attestations during the first slots of epochs. With Altair, participation flags are now used to continuously track validators' attestations, reducing the processing load at the end of epochs.

Three of the eight bits are currently used; five are reserved for future use.

As an aside, it might have been more intuitive if ParticipationFlags were a Bytes1 type, rather than introducing a weird uint8 into the spec. After all, it is not used as an arithmetic integer. However, Bytes1 is a composite type in SSZ, really an alias for Vector[uint8, 1], whereas uint8 is a basic type. When computing the hash tree root of a List type, multiple basic types can be packed into a single leaf, while composite types take a leaf each. This would result in 32 times as many hashing operations for a list of Bytes1. For similar reasons the type of ParticipationFlags was changed from bitlist to uint8.
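
A sketch of how the flags are set and read, modelled on the spec's add_flag() and has_flag() helpers (plain Python ints stand in for the uint8 type here):

```python
TIMELY_SOURCE_FLAG_INDEX = 0
TIMELY_TARGET_FLAG_INDEX = 1
TIMELY_HEAD_FLAG_INDEX = 2

def add_flag(flags: int, flag_index: int) -> int:
    return flags | 2**flag_index

def has_flag(flags: int, flag_index: int) -> bool:
    flag = 2**flag_index
    return flags & flag == flag

flags = 0
flags = add_flag(flags, TIMELY_SOURCE_FLAG_INDEX)
flags = add_flag(flags, TIMELY_TARGET_FLAG_INDEX)
assert has_flag(flags, TIMELY_TARGET_FLAG_INDEX)
assert not has_flag(flags, TIMELY_HEAD_FLAG_INDEX)
assert flags == 0b011  # three boolean flags packed into one small integer
```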

Transaction

The Transaction type was introduced in the Bellatrix pre-Merge upgrade to allow for Ethereum transactions to be included in beacon blocks. It appears in ExecutionPayload objects.

Transactions are completely opaque to the beacon chain and are exclusively handled in the execution layer. A note reflecting this is included in the Bellatrix specification:

Note: The Transaction type is a stub which is not final.

The maximum size of a transaction is MAX_BYTES_PER_TRANSACTION, which looks huge, but since the underlying type is an SSZ ByteList (which is a List), a Transaction object will only occupy as much space as necessary.

ExecutionAddress

The ExecutionAddress type was introduced in the Bellatrix pre-Merge upgrade to represent the fee recipient on the execution chain for beacon blocks that contain transactions. It is a normal, 20-byte, Ethereum address, and is used in the ExecutionPayload class.

WithdrawalIndex

The WithdrawalIndex keeps track of the total number of withdrawal transactions made from the consensus layer to the execution layer. All nodes store this number in their state, so a block containing withdrawal transactions that have unexpected withdrawal indices is invalid.

At the maximum rate of 16 withdrawals per slot, a uint64 will take 438 billion years to overflow. This ought to be enough.
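
The arithmetic behind that claim, as a quick plain-Python check:

```python
# At most 16 withdrawals per 12-second slot: how long until a uint64 index overflows?
withdrawals_per_year = 16 * (365.25 * 24 * 3600) / 12
years_to_overflow = 2**64 / withdrawals_per_year
print(years_to_overflow)   # ~4.4e11, i.e. around 438 billion years
```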

References

Constants

The distinction between "constants", "presets", and "configuration values" is not always clear, and things have moved back and forth between the sections at times. In essence, "constants" are things that are expected never to change for the beacon chain, no matter what fork or test network it is running.

Miscellaneous

Name Value
GENESIS_SLOT Slot(0)
GENESIS_EPOCH Epoch(0)
FAR_FUTURE_EPOCH Epoch(2**64 - 1)
DEPOSIT_CONTRACT_TREE_DEPTH uint64(2**5) (= 32)
JUSTIFICATION_BITS_LENGTH uint64(4)
PARTICIPATION_FLAG_WEIGHTS [TIMELY_SOURCE_WEIGHT, TIMELY_TARGET_WEIGHT, TIMELY_HEAD_WEIGHT]
ENDIANNESS 'little'

GENESIS_SLOT

The very first slot number for the beacon chain is zero.

Perhaps this seems uncontroversial, but it actually featured heavily in the Great Signedness Wars mentioned previously. The issue was that calculations on unsigned integers might have negative intermediate values, which would cause problems. A proposed work-around for this was to start the chain at a non-zero slot number. It was initially set to 2^19, then 2^63, then 2^32, and finally back to zero. In my humble opinion, this madness only confirms that we should have been using signed integers all along.

GENESIS_EPOCH

As above. When the chain starts, it starts at epoch zero.

FAR_FUTURE_EPOCH

A candidate for the dullest constant. It's used as a default initialiser for validators' activation and exit times before they are properly set. No epoch number will ever be bigger than this one.

DEPOSIT_CONTRACT_TREE_DEPTH

DEPOSIT_CONTRACT_TREE_DEPTH specifies the size of the (sparse) Merkle tree used by the Eth1 deposit contract to store deposits made. With a value of 32, this allows for 2^32 = 4.3 billion deposits. Given that the minimum deposit is 1 Ether, that number is clearly enough.

Since deposit receipts contain Merkle proofs, their size depends on the value of this constant.

JUSTIFICATION_BITS_LENGTH

As an optimisation to Casper FFG – the process by which finality is conferred on epochs – the beacon chain uses a "k-finality" rule. We will describe this more fully when we look at processing justification and finalisation. For now, this constant is just the number of bits we need to store in state to implement k-finality. With k = 2, we track the justification status of the last four epochs.

PARTICIPATION_FLAG_WEIGHTS

This array is just a convenient way to access the various weights given to different validator achievements when calculating rewards. The three weights are defined under incentivization weights, and each weight corresponds to a flag stored in state and defined under participation flag indices.

ENDIANNESS

Endianness refers to the order of bytes in the binary representation of a number: most-significant byte first is big-endian; least-significant byte first is little-endian. For the most part, these details are hidden by compilers, and we don't need to worry about endianness. But endianness matters when converting between integers and bytes, which is relevant to shuffling and proposer selection, the RANDAO, and when serialising with SSZ.
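
The spec wraps this convention in its uint_to_bytes() helper; in plain Python it looks like this (a sketch, not spec code):

```python
x = 0x0102
print(x.to_bytes(8, 'little'))   # b'\x02\x01\x00\x00\x00\x00\x00\x00' - least-significant byte first
print(x.to_bytes(8, 'big'))      # b'\x00\x00\x00\x00\x00\x00\x01\x02' - most-significant byte first
print(int.from_bytes(b'\x02\x01' + b'\x00' * 6, 'little'))  # 258
```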

The spec began life as big-endian, but the Nimbus team from Status successfully lobbied for it to be changed to little-endian in order to better match processor hardware implementations, and the endianness of WASM. SSZ was changed first, and then the rest of the spec followed.

Participation flag indices

Name Value
TIMELY_SOURCE_FLAG_INDEX 0
TIMELY_TARGET_FLAG_INDEX 1
TIMELY_HEAD_FLAG_INDEX 2

Validators making attestations that get included on-chain are rewarded for three things:

  • getting attestations included with the correct source checkpoint within 5 slots (integer_squareroot(SLOTS_PER_EPOCH));
  • getting attestations included with the correct target checkpoint within 32 slots (SLOTS_PER_EPOCH); and,
  • getting attestations included with the correct head within 1 slot (MIN_ATTESTATION_INCLUSION_DELAY), basically immediately.

These flags are temporarily recorded in the BeaconState when attestations are processed, then used at the ends of epochs to update finality and to calculate validator rewards for making attestations.

The mechanism for rewarding timely inclusion of attestations (thus penalising late attestations) differs between Altair and Phase 0. In Phase 0, attestations included within 32 slots would receive the full reward for the votes they got correct (source, target, head), plus a declining reward based on the delay in inclusion: 1/2 for a two slot delay, 1/3 for a three slot delay, and so on. With Altair, for each vote, we now have a cliff before which the validator receives the full reward and after which a penalty. The cliffs differ in duration, which is intended to more accurately target incentives at behaviours that genuinely help the chain (there is little value in rewarding a correct head vote made 30 slots late, for example). See get_attestation_participation_flag_indices() for how this is implemented in code.
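
A simplified sketch of those three timeliness windows (the function and argument names here are mine; the real logic in get_attestation_participation_flag_indices() also checks the correctness of the votes against the state):

```python
SLOTS_PER_EPOCH = 32
MIN_ATTESTATION_INCLUSION_DELAY = 1

def integer_squareroot(n: int) -> int:
    # Newton's method, as in the spec's helper of the same name.
    x, y = n, (n + 1) // 2
    while y < x:
        x, y = y, (y + n // y) // 2
    return x

def timely_flag_indices(inclusion_delay: int, correct_source: bool,
                        correct_target: bool, correct_head: bool) -> list:
    flags = []
    if correct_source and inclusion_delay <= integer_squareroot(SLOTS_PER_EPOCH):
        flags.append(0)   # TIMELY_SOURCE_FLAG_INDEX: within 5 slots
    if correct_target and inclusion_delay <= SLOTS_PER_EPOCH:
        flags.append(1)   # TIMELY_TARGET_FLAG_INDEX: within 32 slots
    if correct_head and inclusion_delay == MIN_ATTESTATION_INCLUSION_DELAY:
        flags.append(2)   # TIMELY_HEAD_FLAG_INDEX: the very next slot
    return flags

print(timely_flag_indices(3, True, True, True))   # [0, 1] - too late for the head flag
```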

Incentivization weights

Name Value
TIMELY_SOURCE_WEIGHT uint64(14)
TIMELY_TARGET_WEIGHT uint64(26)
TIMELY_HEAD_WEIGHT uint64(14)
SYNC_REWARD_WEIGHT uint64(2)
PROPOSER_WEIGHT uint64(8)
WEIGHT_DENOMINATOR uint64(64)

These weights are used to calculate the reward earned by a validator for performing its duties. There are five duties in total. Three relate to making attestations: attesting to the source epoch, attesting to the target epoch, and attesting to the head block. There are also rewards for proposing blocks, and for participating in sync committees. Note that the sum of the five weights is equal to WEIGHT_DENOMINATOR.

On a long-term average, a validator can expect to earn a total amount of get_base_reward() per epoch, with these weights being the relative portions for each of the duties comprising that total. Proposing blocks and participating in sync committees do not happen in every epoch, but are randomly assigned, so over small periods of time validator earnings may differ from get_base_reward().

The apportioning of rewards was overhauled in the Altair upgrade to better reflect the importance of each activity within the protocol. The total reward amount remains the same, but sync committee rewards were added, and the relative weights were adjusted. Previously, the weights corresponded to 16 for correct source, 16 for correct target, 16 for correct head, 14 for inclusion (equivalent to correct source), and 2 for block proposals. The factor of four increase in the proposer reward addressed a long-standing spec bug.
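
The proportions are simple to read off the table above. A trivial check in plain Python:

```python
WEIGHTS = {
    'timely source': 14,
    'timely target': 26,
    'timely head': 14,
    'sync committee': 2,
    'proposer': 8,
}
WEIGHT_DENOMINATOR = 64

assert sum(WEIGHTS.values()) == WEIGHT_DENOMINATOR
for duty, weight in WEIGHTS.items():
    print(f'{duty}: {weight}/{WEIGHT_DENOMINATOR} = {weight / WEIGHT_DENOMINATOR:.1%} of the total reward')
```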

[Figure] The proportion of a validator's total reward derived from each of the micro-rewards.

Withdrawal Prefixes

Name Value
BLS_WITHDRAWAL_PREFIX Bytes1('0x00')
ETH1_ADDRESS_WITHDRAWAL_PREFIX Bytes1('0x01')

Withdrawal prefixes relate to the withdrawal credentials provided when deposits are made for validators.

Two ways to specify the withdrawal credentials are currently available, versioned with these prefixes, with others such as 0x02 and 0x03 under discussion.

When processing deposits onto the consensus layer, the withdrawal_credentials field of the deposit is not checked in any way. It's up to the depositor to ensure that they are using the correct prefix and contents to be able to receive their rewards and retrieve their stake after exiting the consensus layer. This also means that we can potentially introduce new types of withdrawal credentials at any time, enabling them later with a hard fork, just as we did with 0x01 credentials ahead of the Capella upgrade that began using them.

BLS_WITHDRAWAL_PREFIX

The beacon chain launched with only BLS-style withdrawal credentials available, so all early stakers used this.

It was not at all clear in the early days what accounts on Ethereum 2.0 would look like, and what addressing scheme they might use. The 0x00 credential created a placeholder or commitment to a future withdrawal credential change.

With this type of credential, in addition to a BLS signing key, stakers have a second BLS "withdrawal" key. Since the Capella upgrade, stakers have been able to use their withdrawal key to sign a message instructing the consensus layer to change its withdrawal credential from type 0x00 to type 0x01.

The credential registered in the deposit data is the 32 byte SHA256 hash of the validator's withdrawal public key, with the first byte set to 0x00 (BLS_WITHDRAWAL_PREFIX).

ETH1_ADDRESS_WITHDRAWAL_PREFIX

Eth1 withdrawal credentials are much simpler, and were adopted once it became clear that Ethereum 2.0 would not be using a BLS-based address scheme for accounts at any time soon. The Capella upgrade enables automatic partial and full withdrawals of validators' balances from the beacon chain to normal Ethereum accounts and wallets.

An Eth1 withdrawal credential looks like the byte 0x01 (ETH1_ADDRESS_WITHDRAWAL_PREFIX), followed by eleven 0x00 bytes, followed by the 20-byte Ethereum address of the destination account.
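
The two layouts described above can be illustrated with a couple of helpers (the function names are mine, not the spec's):

```python
from hashlib import sha256

def bls_withdrawal_credentials(withdrawal_pubkey: bytes) -> bytes:
    # 0x00-type: the SHA-256 hash of the 48-byte BLS withdrawal public key,
    # with the first byte overwritten by BLS_WITHDRAWAL_PREFIX.
    assert len(withdrawal_pubkey) == 48
    return b'\x00' + sha256(withdrawal_pubkey).digest()[1:]

def eth1_withdrawal_credentials(eth1_address: bytes) -> bytes:
    # 0x01-type: ETH1_ADDRESS_WITHDRAWAL_PREFIX, eleven zero bytes, then the
    # 20-byte execution layer address.
    assert len(eth1_address) == 20
    return b'\x01' + b'\x00' * 11 + eth1_address

creds = eth1_withdrawal_credentials(bytes.fromhex('00' * 20))  # illustrative address
assert len(creds) == 32
```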

In addition to enabling withdrawal transactions for validators having 0x01 Eth1 credentials, the Capella upgrade gives validators with old 0x00 BLS style credentials the opportunity to make a one-time change from BLS to Eth1 withdrawal credentials.

Domain types

Name Value
DOMAIN_BEACON_PROPOSER DomainType('0x00000000')
DOMAIN_BEACON_ATTESTER DomainType('0x01000000')
DOMAIN_RANDAO DomainType('0x02000000')
DOMAIN_DEPOSIT DomainType('0x03000000')
DOMAIN_VOLUNTARY_EXIT DomainType('0x04000000')
DOMAIN_SELECTION_PROOF DomainType('0x05000000')
DOMAIN_AGGREGATE_AND_PROOF DomainType('0x06000000')
DOMAIN_SYNC_COMMITTEE DomainType('0x07000000')
DOMAIN_SYNC_COMMITTEE_SELECTION_PROOF DomainType('0x08000000')
DOMAIN_CONTRIBUTION_AND_PROOF DomainType('0x09000000')
DOMAIN_BLS_TO_EXECUTION_CHANGE DomainType('0x0A000000')

These domain types are used in three ways: for seeds, for signatures, and for selecting aggregators.

As seeds

When random numbers are required in-protocol, one way they are generated is by hashing the RANDAO mix with other quantities, one of them being a domain type (see get_seed()). The original motivation was to avoid occasional collisions between Phase 0 committees and Phase 1 persistent committees, back when they were a thing. So, when computing the beacon block proposer, DOMAIN_BEACON_PROPOSER is hashed into the seed, when computing attestation committees, DOMAIN_BEACON_ATTESTER is hashed in, and when computing sync committees, DOMAIN_SYNC_COMMITTEE is hashed in.

See the Randomness chapter for more information.

As signatures

In addition, as a cryptographic nicety, each of the protocol's signature types is augmented with the appropriate domain before being signed:

  • Signed block proposals incorporate DOMAIN_BEACON_PROPOSER
  • Signed attestations incorporate DOMAIN_BEACON_ATTESTER
  • RANDAO reveals are BLS signatures, and use DOMAIN_RANDAO
  • Deposit data messages incorporate DOMAIN_DEPOSIT
  • Validator voluntary exit messages incorporate DOMAIN_VOLUNTARY_EXIT
  • Sync committee signatures incorporate DOMAIN_SYNC_COMMITTEE
  • BLS withdrawal credential change messages incorporate DOMAIN_BLS_TO_EXECUTION_CHANGE

For most of these, the fork version is also incorporated before signing. This allows validators to participate, if they wish, in two independent forks of the beacon chain without fear of being slashed.

However, the user-signed messages for deposits (DOMAIN_DEPOSIT) and for BLS withdrawal credential changes (DOMAIN_BLS_TO_EXECUTION_CHANGE) do not incorporate the fork version when signed. This makes them valid across all forks, which is a usability enhancement.

Voluntary exit messages (DOMAIN_VOLUNTARY_EXIT) are a bit of an anomaly in that they are user signed but also incorporate the fork version, meaning that they expire after two upgrades (voluntary exit messages signed in Phase 0 or Altair are no longer valid in Capella). There is some discussion about making voluntary exits non-expiring in future.

See the BLS signatures chapter for more information.

Aggregator selection

The remaining four types, suffixed _PROOF, are not used directly in the beacon chain specification. They were introduced to implement attestation subnet validation for denial of service resistance. The technique was extended to sync committees with the Altair upgrade.

Briefly, at each slot, validators are selected to aggregate attestations from their committees. The selection is done based on the validator's signature over the slot number, mixing in DOMAIN_SELECTION_PROOF. The validator then signs the whole aggregated attestation, including the previous signature as proof that it was selected to be an aggregator, using DOMAIN_AGGREGATE_AND_PROOF. And similarly for sync committees. In this way, everything is verifiable and attributable, making it hard to flood the network with fake messages.

These four are not part of the consensus-critical state-transition, but are nonetheless important to the healthy functioning of the chain.

This mechanism is described in the Phase 0 honest validator spec for attestation aggregation, and in the Altair honest validator spec for sync committee aggregation.

See the Aggregator Selection chapter for more information.

Crypto

Name Value
G2_POINT_AT_INFINITY BLSSignature(b'\xc0' + b'\x00' * 95)

This is the compressed serialisation of the "point at infinity", the identity point, of the G2 group of the BLS12-381 curve that we are using for signatures. Note that it is in big-endian format (unlike all other constants in the spec).

It was introduced as a convenience when verifying aggregate signatures that contain no public keys in eth_fast_aggregate_verify(). The underlying FastAggregateVerify function from the BLS signature standard would reject these.

G2_POINT_AT_INFINITY is described in the separate BLS Extensions document, but included here for convenience.

Preset

The "presets" are consistent collections of configuration variables that are bundled together. The specs repo currently defines two sets of presets, mainnet and minimal. The mainnet configuration is running in production on the beacon chain; minimal is often used for testing. Other configurations are possible. For example, Teku uses a swift configuration for acceptance testing.

All the values discussed below are from the mainnet configuration.

You'll notice that most of these values are powers of two. There's no huge significance to this. Computer scientists think it's neat, and it generally ensures that things divide evenly into other things. There is a view that this practice helps to minimise bike-shedding (endless arguments over trivial matters).

Some of the configuration parameters below are quite technical and perhaps obscure. I'll take the opportunity here to introduce some concepts, and give more detailed explanations when they appear in later chapters.

Misc

Name Value
MAX_COMMITTEES_PER_SLOT uint64(2**6) (= 64)
TARGET_COMMITTEE_SIZE uint64(2**7) (= 128)
MAX_VALIDATORS_PER_COMMITTEE uint64(2**11) (= 2,048)
SHUFFLE_ROUND_COUNT uint64(90)

MAX_COMMITTEES_PER_SLOT

Validators are organised into committees to do their work. At any one time, each validator is a member of exactly one beacon chain committee, and is called on to make an attestation exactly once per epoch. An attestation is a vote for, or a statement of, the validator's view of the chain at that point in time.

On the beacon chain, up to 64 committees are active in a slot and effectively act as a single committee as far as the fork-choice rule is concerned. They all vote on the proposed block for the slot, and their votes/attestations are pooled. In a similar way, all committees active during an epoch (that is, the whole active validator set) act effectively as a single committee as far as justification and finalisation are concerned.

The number 64 was intended to map to one committee per shard once data shards were deployed in the now abandoned Phase 1 of the Ethereum 2.0 roadmap. The plan was for each committee to also vote on one shard crosslink, for a total of 64 shards. We are no longer going down that path, but the committees remain at each slot.

All the above is discussed further in the section on Committees.

Note that sync committees are a different thing: there is only one sync committee active at any time.

TARGET_COMMITTEE_SIZE

To achieve a desirable level of security, committees need to be larger than a certain size. This makes it infeasible for an attacker to randomly end up with a super-majority in a committee even if they control a significant number of validators. The target here is a kind of lower-bound on committee size. If there are not enough validators for all committees to have at least 128 members, then, as a first measure, the number of committees per slot is reduced to maintain this minimum. Only if there are fewer than SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE = 4096 validators in total will the committee size be reduced below TARGET_COMMITTEE_SIZE. With so few validators, the system would be insecure in any case.

For further discussion and an explanation of how the value of TARGET_COMMITTEE_SIZE was set, see the section on committees.
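
A sketch of how the committee count adapts to the size of the validator set, following the logic of get_committee_count_per_slot() in the spec (active_validator_count stands in for the length of the active validator set):

```python
MAX_COMMITTEES_PER_SLOT = 64
TARGET_COMMITTEE_SIZE = 128
SLOTS_PER_EPOCH = 32

def committee_count_per_slot(active_validator_count: int) -> int:
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))

print(committee_count_per_slot(500_000))  # 64: a full complement of committees
print(committee_count_per_slot(100_000))  # 24: fewer committees, each still >= 128 members
print(committee_count_per_slot(2_000))    # 1: a single, under-sized committee per slot
```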

MAX_VALIDATORS_PER_COMMITTEE

This is just used for sizing some data structures, and is not particularly interesting. Reaching this limit would imply over 4 million active validators, staked with a total of 128 million Ether, which exceeds the total supply today.

SHUFFLE_ROUND_COUNT

The beacon chain implements a rather interesting way of shuffling validators in order to select committees, called the "swap-or-not shuffle". This shuffle proceeds in rounds, and the degree of shuffling is determined by the number of rounds, SHUFFLE_ROUND_COUNT. The time taken to shuffle is linear in the number of rounds, so for light-weight, non-mainnet configurations, the number of rounds can be reduced.

The value 90 was introduced in Vitalik's initial commit without explanation. The original paper describing the shuffling technique seems to suggest that a cryptographically safe number of rounds is 6 log N. With 90 rounds, then, we should be good for shuffling 3.3 million validators, which is close to the maximum number possible (given the Ether supply).
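
Checking the arithmetic (assuming a natural logarithm in the 6 log N rule of thumb):

```python
from math import exp, log

SHUFFLE_ROUND_COUNT = 90
print(exp(SHUFFLE_ROUND_COUNT / 6))   # ~3.3 million - the largest list size covered
print(6 * log(3_300_000))             # ~90 rounds needed to shuffle 3.3 million validators
```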

Hysteresis parameters

Name Value
HYSTERESIS_QUOTIENT uint64(4)
HYSTERESIS_DOWNWARD_MULTIPLIER uint64(1)
HYSTERESIS_UPWARD_MULTIPLIER uint64(5)

The parameters prefixed HYSTERESIS_ control the way that effective balance is changed (see EFFECTIVE_BALANCE_INCREMENT). As described there, the effective balance of a validator follows changes to the actual balance in a step-wise way, with hysteresis applied. This ensures that the effective balance does not change often.

The original hysteresis design had an unintended effect that might have encouraged stakers to over-deposit or make multiple deposits in order to maintain a balance above 32 Ether at all times. If a validator's balance were to drop below 32 Ether soon after depositing, however briefly, the effective balance would have immediately dropped to 31 Ether and taken a long time to recover. This would have resulted in a 3% reduction in rewards for a period.

This problem was addressed by making the hysteresis configurable via these parameters. Specifically, these settings mean:

  1. if a validator's balance falls 0.25 Ether below its effective balance, then its effective balance is reduced by 1 Ether
  2. if a validator's balance rises 1.25 Ether above its effective balance, then its effective balance is increased by 1 Ether

These calculations are done in process_effective_balance_updates() during end of epoch processing.
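
A sketch of that update, following the logic of process_effective_balance_updates() (amounts in Gwei):

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9        # 1 Ether
MAX_EFFECTIVE_BALANCE = 32 * 10**9
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1
HYSTERESIS_UPWARD_MULTIPLIER = 5

def updated_effective_balance(balance: int, effective_balance: int) -> int:
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT   # 0.25 Ether
    downward_threshold = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER  # 0.25 Ether
    upward_threshold = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER      # 1.25 Ether
    if (balance + downward_threshold < effective_balance
            or effective_balance + upward_threshold < balance):
        # Round the actual balance down to a whole increment, capped at the maximum.
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance
```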

Gwei values

Name Value
MIN_DEPOSIT_AMOUNT Gwei(2**0 * 10**9) (= 1,000,000,000)
MAX_EFFECTIVE_BALANCE Gwei(2**5 * 10**9) (= 32,000,000,000)
EFFECTIVE_BALANCE_INCREMENT Gwei(2**0 * 10**9) (= 1,000,000,000)

MIN_DEPOSIT_AMOUNT

MIN_DEPOSIT_AMOUNT is not actually used anywhere within the beacon chain specification document. Rather, it is enforced in the deposit contract that was deployed to the Ethereum 1 chain. Any amount less than this value sent to the deposit contract is reverted.

Allowing stakers to make deposits smaller than a full stake is useful for topping-up a validator's balance if its effective balance has dropped below 32 Ether in order to maintain full productivity. However, this actually led to a vulnerability for some staking pools, involving the front-running of deposits. In some circumstances, a front-running attacker could change a genuine depositor's withdrawal credentials to their own.

MAX_EFFECTIVE_BALANCE

There is a concept of "effective balance" for validators: whatever a validator's total balance, its voting power is weighted by its effective balance, even if its actual balance is higher. Effective balance is also the amount on which all rewards, penalties, and slashings are calculated - it's used a lot in the protocol.

The MAX_EFFECTIVE_BALANCE is the highest effective balance that a validator can have: 32 Ether. Any balance above this is ignored. Note that this means that staking rewards don't compound in the usual case (unless a validator's effective balance somehow falls below 32 Ether, in which case rewards kind of compound).

There is a discussion in the Design Rationale of why 32 Ether was chosen as the staking amount. In short, we want enough validators to keep the chain both alive and secure under attack, but not so many that the message overhead on the network becomes too high.

EFFECTIVE_BALANCE_INCREMENT

Throughout the protocol, a quantity called "effective balance" is used instead of the validators' actual balances. Effective balance tracks the actual balance, with two differences: (1) effective balance is capped at MAX_EFFECTIVE_BALANCE no matter how high the actual balance of a validator is, and (2) effective balance is much more granular - it changes only in steps of EFFECTIVE_BALANCE_INCREMENT rather than Gwei.

This discretisation of effective balance is intended to reduce the amount of hashing required when making state updates. The goal is to avoid having to re-calculate the hash tree root of validator records too often. Validators' actual balances, which change frequently, are stored as a contiguous list in BeaconState, outside validators' records. Effective balances are stored inside validators' individual records, which are more costly to update (more hashing required). So we try to update effective balances relatively infrequently.

Effective balance is changed according to a process with hysteresis to avoid situations where it might change frequently. See HYSTERESIS_QUOTIENT.

You can read more about effective balance in the Design Rationale and in this article.

Time parameters

Name Value Unit Duration
MIN_ATTESTATION_INCLUSION_DELAY uint64(2**0) (= 1) slots 12 seconds
SLOTS_PER_EPOCH uint64(2**5) (= 32) slots 6.4 minutes
MIN_SEED_LOOKAHEAD uint64(2**0) (= 1) epochs 6.4 minutes
MAX_SEED_LOOKAHEAD uint64(2**2) (= 4) epochs 25.6 minutes
MIN_EPOCHS_TO_INACTIVITY_PENALTY uint64(2**2) (= 4) epochs 25.6 minutes
EPOCHS_PER_ETH1_VOTING_PERIOD uint64(2**6) (= 64) epochs ~6.8 hours
SLOTS_PER_HISTORICAL_ROOT uint64(2**13) (= 8,192) slots ~27 hours

MIN_ATTESTATION_INCLUSION_DELAY

A design goal of Ethereum 2.0 is not to heavily disadvantage validators that are running on lower-spec systems, or, conversely, to reduce any advantage gained by running on high-spec systems.

One aspect of performance is network bandwidth. When a validator becomes the block proposer, it needs to gather attestations from the rest of its committee. On a low-bandwidth link, this takes longer, and could result in the proposer not being able to include as many past attestations as other better-connected validators might, thus receiving lower rewards.

MIN_ATTESTATION_INCLUSION_DELAY was an attempt to "level the playing field" by setting a minimum number of slots before an attestation can be included in a beacon block. It was originally set at 4, with a 6-second slot time, allowing 24 seconds for attestations to propagate around the network.

It was later set to one – attestations are included as early as possible – and MIN_ATTESTATION_INCLUSION_DELAY exists today as a relic of the earlier design. The current slot time of 12 seconds is assumed to allow sufficient time for attestations to propagate and be aggregated sufficiently within one slot.

SLOTS_PER_EPOCH

We currently have 12-second slots and 32-slot epochs. In earlier designs, slots were 6 seconds and there were 64 slots per epoch. So the time between epoch boundaries was unchanged when slots were lengthened.

The choice of 32 slots per epoch is a trade-off between time to finality (we need two epochs to finalise, so we prefer to keep them as short as we can) and being as certain as possible that at least one honest proposer per epoch will make a block to update the RANDAO (for which we prefer longer epochs).

In addition, epoch boundaries are where the heaviest part of the beacon chain state-transition calculation occurs, so that's another reason for not having them too close together.

Since every validator attests once per epoch, there is an interplay between the number of slots per epoch, the number of committees per slot, committee sizes, and the total number of validators.

MIN_SEED_LOOKAHEAD

A random seed is used to select all the committees and proposers for an epoch. During each epoch, the beacon chain accumulates randomness from proposers via the RANDAO and stores it. The seed for the current epoch is based on the RANDAO output from the epoch MIN_SEED_LOOKAHEAD + 1 ago. With MIN_SEED_LOOKAHEAD set to one, the effect is that we can know the seed for the current epoch and the next epoch, but not beyond, since the next-but-one epoch depends on randomness from the current epoch that hasn't been accumulated yet.

This mechanism is designed to allow sufficient time for members of newly formed committees to find each other on the peer-to-peer network, while ensuring that committee membership is not known too far in advance, which limits the opportunity for coordinated collusion between validators.

MAX_SEED_LOOKAHEAD

The above notwithstanding, if an attacker has a large proportion of the stake, or is, for example, able to DoS block proposers for a while, then it might be possible for the attacker to predict the output of the RANDAO further ahead than MIN_SEED_LOOKAHEAD would normally allow. This might enable the attacker to manipulate committee memberships to their advantage by performing well-timed exits and activations of their validators.

To prevent this, we assume a maximum feasible lookahead that an attacker might achieve (MAX_SEED_LOOKAHEAD) and delay all activations and exits by this amount, which allows new randomness to come in via block proposals from honest validators. With MAX_SEED_LOOKAHEAD set to 4, if only 10% of validators are online and honest, then the chance that an attacker can succeed in forecasting the seed beyond (MAX_SEED_LOOKAHEAD - MIN_SEED_LOOKAHEAD) = 3 epochs is 0.9^(3 * 32), which is about 1 in 25,000.
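
Checking the arithmetic:

```python
# With 10% of validators honest and online, the chance that an attacker controls
# every proposer slot for 3 whole epochs (32 slots each):
p = 0.9 ** (3 * 32)
print(p)        # ~4.0e-5
print(1 / p)    # ~24,700, i.e. roughly 1 in 25,000
```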

MIN_EPOCHS_TO_INACTIVITY_PENALTY

The inactivity penalty is discussed below. This parameter sets the length of time until it kicks in. If the last finalised epoch is longer ago than MIN_EPOCHS_TO_INACTIVITY_PENALTY, then the beacon chain starts operating in "leak" mode. In this mode, participating validators no longer get rewarded, and validators that are not participating get penalised.

EPOCHS_PER_ETH1_VOTING_PERIOD

In order to safely onboard new validators, the beacon chain needs to take a view on what the Eth1 chain looks like. This is done by collecting votes from beacon block proposers - they are expected to consult an available Eth1 client in order to construct their vote.

EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH is the total number of votes for Eth1 blocks that are collected. As soon as half of this number of votes are for the same Eth1 block, that block is adopted by the beacon chain and deposit processing can continue. This processing is done in process_eth1_data().

Rules for how validators select the right block to vote for are set out in the validator guide. ETH1_FOLLOW_DISTANCE is the (approximate) minimum depth of block that can be considered.

This parameter was increased from 32 to 64 epochs for the beacon chain mainnet. This increase is intended to allow devs more time to respond if there is any trouble on the Eth1 chain, in addition to the eight hours' grace provided by ETH1_FOLLOW_DISTANCE.

For a detailed analysis of these parameters, see this article.

SLOTS_PER_HISTORICAL_ROOT

There have been several redesigns of the way the beacon chain stores its past history. The current design is a double batched accumulator. The block root and state root for every slot are stored in the state for SLOTS_PER_HISTORICAL_ROOT slots. When those lists are full, each list is Merkleized separately, and their roots are added to the ever-growing state.historical_summaries list within an HistoricalSummary container.

State list lengths

The following parameters set the sizes of some lists in the beacon chain state. Some lists have natural sizes, others such as the validator registry need an explicit maximum size to guide SSZ serialisation.

Name Value Unit Duration
EPOCHS_PER_HISTORICAL_VECTOR uint64(2**16) (= 65,536) epochs ~0.8 years
EPOCHS_PER_SLASHINGS_VECTOR uint64(2**13) (= 8,192) epochs ~36 days
HISTORICAL_ROOTS_LIMIT uint64(2**24) (= 16,777,216) historical roots ~52,262 years
VALIDATOR_REGISTRY_LIMIT uint64(2**40) (= 1,099,511,627,776) validators -

EPOCHS_PER_HISTORICAL_VECTOR

This is the number of epochs of previous RANDAO mixes that are stored (one per epoch). Having access to past randao mixes allows historical shufflings to be recalculated. Since Validator records keep track of the activation and exit epochs of all past validators, we can reconstitute past committees as far back as we have the RANDAO values. This information can be used for slashing long-past attestations, for example. It is not clear how the value of this parameter was decided.

EPOCHS_PER_SLASHINGS_VECTOR

In the epoch in which a misbehaving validator is slashed, its effective balance is added to an accumulator in the state. In this way, the state.slashings list tracks the total effective balance of all validators slashed during the last EPOCHS_PER_SLASHINGS_VECTOR epochs.

At a time EPOCHS_PER_SLASHINGS_VECTOR // 2 after being slashed, a further penalty is applied to the slashed validator, based on the total amount of value slashed during the 4096 epochs before and the 4096 epochs after it was originally slashed.

The idea of this is to disproportionately punish coordinated attacks, in which many validators break the slashing conditions around the same time, while only lightly penalising validators that get slashed by making a mistake. Early designs for Eth2 would always slash a validator's entire deposit.

See also PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX.

HISTORICAL_ROOTS_LIMIT

Every SLOTS_PER_HISTORICAL_ROOT slots, the list of block roots and the list of state roots in the beacon state are Merkleized and added to the state.historical_roots list. Although state.historical_roots is in principle unbounded, all SSZ lists must have maximum sizes specified. The size HISTORICAL_ROOTS_LIMIT will be fine for the next few millennia, after which it will be somebody else's problem. The list grows at less than 10 KB per year.

Storing past roots like this allows Merkle proofs to be constructed about anything in the beacon chain's history if required.

VALIDATOR_REGISTRY_LIMIT

Every time the Eth1 deposit contract processes a deposit from a new validator (as identified by its public key), a new entry is appended to the state.validators list.

In the current design, validators are never removed from this list, even after exiting from being a validator. This is largely because there is nowhere yet to send a validator's remaining deposit and staking rewards, so they continue to need to be tracked in the beacon chain.

The maximum length of this list is VALIDATOR_REGISTRY_LIMIT, which is one trillion, so we ought to be OK for a while, especially given that the minimum deposit amount is 1 Ether.

Rewards and penalties

Name Value
BASE_REWARD_FACTOR uint64(2**6) (= 64)
WHISTLEBLOWER_REWARD_QUOTIENT uint64(2**9) (= 512)
PROPOSER_REWARD_QUOTIENT uint64(2**3) (= 8)
INACTIVITY_PENALTY_QUOTIENT uint64(2**26) (= 67,108,864)
MIN_SLASHING_PENALTY_QUOTIENT uint64(2**7) (= 128)
PROPORTIONAL_SLASHING_MULTIPLIER uint64(1)
INACTIVITY_PENALTY_QUOTIENT_ALTAIR uint64(3 * 2**24) (= 50,331,648)
MIN_SLASHING_PENALTY_QUOTIENT_ALTAIR uint64(2**6) (= 64)
PROPORTIONAL_SLASHING_MULTIPLIER_ALTAIR uint64(2)
INACTIVITY_PENALTY_QUOTIENT_BELLATRIX uint64(2**24) (= 16,777,216)
MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX uint64(2**5) (= 32)
PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX uint64(3)

Note that there are similar constants with different values here.

  • The original beacon chain Phase 0 constants have no suffix.
  • Constants updated in the Altair upgrade have the suffix _ALTAIR.
  • Constants updated in the Bellatrix upgrade have the suffix _BELLATRIX.

This is explained in the specs repo as follows:

Variables are not replaced but extended with forks. This is to support syncing from one state to another over a fork boundary, without hot-swapping a config. Instead, for forks that introduce changes in a variable, the variable name is suffixed with the fork name.

BASE_REWARD_FACTOR

This is the big knob to turn to change the issuance rate of Eth2. Almost all validator rewards are calculated in terms of a "base reward per increment" which is formulated as,

  EFFECTIVE_BALANCE_INCREMENT * BASE_REWARD_FACTOR // integer_squareroot(get_total_active_balance(state))

Thus, the total validator rewards per epoch (the Eth2 issuance rate) could be tuned by increasing or decreasing BASE_REWARD_FACTOR.

The exception is proposer rewards for including slashing reports in blocks. However, these are more than offset by the amount of stake burnt, so do not increase the overall issuance rate.
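
As a rough illustration of the scale of this number (not spec code; the validator count is an assumption, and math.isqrt stands in for integer_squareroot()):

```python
from math import isqrt

EFFECTIVE_BALANCE_INCREMENT = 10**9    # 1 Ether, in Gwei
BASE_REWARD_FACTOR = 64

# Assume ~500,000 active validators, each with a 32 Ether effective balance.
total_active_balance = 500_000 * 32 * 10**9
base_reward_per_increment = (
    EFFECTIVE_BALANCE_INCREMENT * BASE_REWARD_FACTOR // isqrt(total_active_balance)
)
print(base_reward_per_increment)        # ~505 Gwei per increment per epoch
print(base_reward_per_increment * 32)   # ~16,000 Gwei expected per epoch for a 32 ETH validator
```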

WHISTLEBLOWER_REWARD_QUOTIENT

One reward that is not tied to the base reward is the whistleblower reward. This is an amount awarded to the proposer of a block containing one or more proofs that a proposer or attester has violated a slashing condition. The whistleblower reward is set at 1/512 of the effective balance of the slashed validator.

The whistleblower reward comes from new issuance of Ether on the beacon chain, but is more than offset by the Ether burned due to slashing penalties.

PROPOSER_REWARD_QUOTIENT

PROPOSER_REWARD_QUOTIENT was removed in the Altair upgrade in favour of PROPOSER_WEIGHT. It was used to apportion rewards between attesters and proposers when including attestations in blocks.

INACTIVITY_PENALTY_QUOTIENT_BELLATRIX

This value supersedes INACTIVITY_PENALTY_QUOTIENT and INACTIVITY_PENALTY_QUOTIENT_ALTAIR.

If the beacon chain hasn't finalised a checkpoint for longer than MIN_EPOCHS_TO_INACTIVITY_PENALTY epochs, then it enters "leak" mode. In this mode, any validator that does not vote (or votes for an incorrect target) is penalised an amount each epoch of (effective_balance * inactivity_score) // ( INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX ).

Since the Altair upgrade, inactivity_score has become a per-validator quantity, whereas previously validators were penalised by a globally calculated amount when they missed a duty during a leak. See inactivity penalties for more on the rationale for this and how this score is calculated per validator.

During a leak, no validators receive rewards, and they continue to accrue the normal penalties when they fail to fulfil duties. In addition, for epochs in which validators do not make a correct, timely target vote, they receive a leak penalty.

To examine the effect of the leak on a single validator's balance, assume that during a period of inactivity leak (non-finalisation) the validator is completely offline. At each epoch, the offline validator will be penalised an extra amount nB/α, where n is the number of epochs since the leak started, B is the validator's effective balance, and α is the prevailing inactivity penalty quotient (currently INACTIVITY_PENALTY_QUOTIENT_BELLATRIX).

The effective balance B will remain constant for a while, by design, during which time the total amount of the penalty after n epochs would be n(n+1)B/2α. This is sometimes called the "quadratic leak" since it grows as n^2 to first order. If B were continuously variable, the penalty would satisfy dB/dt = -Bt/α, which can be solved to give B(t) = B_0 e^(-t^2/2α). The actual behaviour is somewhere between these two (piecewise quadratic) since the effective balance is neither constant nor continuously variable but decreases in a step-wise fashion.

In the continuous approximation, the inactivity penalty quotient, α, is the square of the time it takes to reduce the balance of a non-participating validator to 1/√e, or around 60.7%, of its initial value. With the value of INACTIVITY_PENALTY_QUOTIENT_BELLATRIX at 2**24, this equates to 4096 epochs, or 18.2 days.
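
A toy calculation of the constant-balance approximation described above (this is not the exact spec calculation, which works through the per-validator inactivity_score):

```python
INACTIVITY_PENALTY_QUOTIENT_BELLATRIX = 2**24
effective_balance = 32 * 10**9     # Gwei, held constant for the approximation

total_penalty = 0
for n in range(1, 4097):           # 4096 epochs is about 18.2 days
    total_penalty += n * effective_balance // INACTIVITY_PENALTY_QUOTIENT_BELLATRIX

print(total_penalty / 10**9)       # ~16 ETH leaked under this approximation
# The closed form n(n+1)B/2a gives the same answer:
print(4096 * 4097 * effective_balance // (2 * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX) / 10**9)
```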

The idea for the inactivity leak (aka the quadratic leak) was proposed in the original Casper FFG paper. The problem it addresses is that, if a large fraction of the validator set were to go offline at the same time, it would not be possible to continue finalising checkpoints, since a majority vote from validators representing 2/3 of the total stake is required for finalisation.

In order to recover, the inactivity leak gradually reduces the stakes of validators who are not making attestations until, eventually, the participating validators control 2/3 of the remaining stake. They can then begin to finalise checkpoints once again.

This inactivity penalty mechanism is designed to protect the chain long-term in the face of catastrophic events (sometimes referred to as the ability to survive World War III). The result might be that the beacon chain could permanently split into two independent chains either side of a network partition, and this is assumed to be a reasonable outcome for any problem that can't be fixed in a few weeks. In this sense, the beacon chain formally prioritises availability over consistency. (You can't have both.)

The value of INACTIVITY_PENALTY_QUOTIENT was increased by a factor of four from 2**24 to 2**26 for the beacon chain launch, with the intention of penalising validators less severely in case of non-finalisation due to implementation problems in the early days. As it happens, there were no instances of non-finalisation during the eleven months of Phase 0 of the beacon chain.

The value was decreased by one quarter in the Altair upgrade from 2**26 (INACTIVITY_PENALTY_QUOTIENT) to 3 * 2**24 (INACTIVITY_PENALTY_QUOTIENT_ALTAIR), and to its final value of 2**24 (INACTIVITY_PENALTY_QUOTIENT_BELLATRIX) in the Bellatrix upgrade. Decreasing the inactivity penalty quotient speeds up recovery of finalisation in the event of an inactivity leak.

MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX

When a validator is first convicted of a slashable offence, an initial penalty is applied. This is calculated as, validator.effective_balance // MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX.

Thus, the initial slashing penalty is between 0.5 ETH and 1 ETH depending on the validator's effective balance (which is between 16 and 32 Ether; note that effective balance is denominated in Gwei).

A further slashing penalty is applied later based on the total amount of balance slashed during a period of EPOCHS_PER_SLASHINGS_VECTOR.

The value of MIN_SLASHING_PENALTY_QUOTIENT was increased by a factor of four from 2**5 to 2**7 for the beacon chain launch, anticipating that unfamiliarity with the rules of Ethereum 2.0 staking was likely to result in some unwary users getting slashed. In the event, a total of 157 validators were slashed during Phase 0, all as a result of user error or misconfiguration as far as can be determined.

The value of this parameter was halved in the Altair upgrade from 2**7 (MIN_SLASHING_PENALTY_QUOTIENT) to 2**6 (MIN_SLASHING_PENALTY_QUOTIENT_ALTAIR), and set to its final value of 2**5 (MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX) in the Bellatrix upgrade.

PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX

When a validator has been slashed, a further penalty is later applied to the validator based on how many other validators were slashed during a window of size EPOCHS_PER_SLASHINGS_VECTOR epochs centred on that slashing event (approximately 18 days before and after).

The proportion of the validator's remaining effective balance that will be subtracted is calculated as, PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX multiplied by the sum of the effective balances of the slashed validators in the window, divided by the total effective balance of all validators. The idea of this mechanism is to punish accidents lightly (in which only a small number of validators were slashed) and attacks heavily (where many validators coordinated to double vote).

To finalise conflicting checkpoints, at least a third of the balance must have voted for both. That's why the "natural" setting of PROPORTIONAL_SLASHING_MULTIPLIER is three: in the event of an attack that finalises conflicting checkpoints, the attackers lose their entire stake. This provides "the maximal minimum accountable safety margin".
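
A sketch of that calculation, loosely following process_slashings() in the spec (amounts in Gwei; the rounding down to whole increments is part of the spec logic):

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9
PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX = 3

def correlated_penalty(effective_balance: int, total_slashed_in_window: int,
                       total_balance: int) -> int:
    adjusted_total = min(
        total_slashed_in_window * PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX,
        total_balance,
    )
    penalty_numerator = effective_balance // EFFECTIVE_BALANCE_INCREMENT * adjusted_total
    return penalty_numerator // total_balance * EFFECTIVE_BALANCE_INCREMENT

total = 16_000_000 * 10**9                                  # ~16M ETH staked (illustrative)
print(correlated_penalty(32 * 10**9, total // 10, total))   # 9 ETH when 10% of the stake is slashed
print(correlated_penalty(32 * 10**9, total // 3, total))    # 31 ETH (integer rounding) - essentially the whole stake
```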

However, for the initial stage of the beacon chain, Phase 0, PROPORTIONAL_SLASHING_MULTIPLIER was set to one. It was increased to two at the Altair upgrade, and to its final value of three at the Bellatrix upgrade. The lower values provided some insurance against client bugs that might have caused mass slashings in the early days.

Max operations per block

Name Value
MAX_PROPOSER_SLASHINGS 2**4 (= 16)
MAX_ATTESTER_SLASHINGS 2**1 (= 2)
MAX_ATTESTATIONS 2**7 (= 128)
MAX_DEPOSITS 2**4 (= 16)
MAX_VOLUNTARY_EXITS 2**4 (= 16)
MAX_BLS_TO_EXECUTION_CHANGES 2**4 (= 16)

These parameters are used to size lists in the beacon block bodies for the purposes of SSZ serialisation, as well as to constrain the maximum size of beacon blocks so that they can propagate efficiently and avoid DoS attacks.

Some comments on the chosen values:

  • I have suggested elsewhere reducing MAX_DEPOSITS from sixteen to one to ensure that more validators must process deposits, which encourages them to run Eth1 clients.
  • At first sight, there looks to be a disparity between the number of proposer slashings and the number of attester slashings that may be included in a block. But note that an attester slashing (a) can be much larger than a proposer slashing, and (b) can result in many more validators getting slashed than a proposer slashing.
  • MAX_ATTESTATIONS is double the value of MAX_COMMITTEES_PER_SLOT. This allows there to be an empty slot (with no block proposal), yet still include all the attestations for the empty slot in the next slot. Since, ideally, each committee produces a single aggregate attestation, a block can hold two slots' worth of aggregate attestations.

Sync committee

Name Value Unit Duration
SYNC_COMMITTEE_SIZE uint64(2**9) (= 512) Validators
EPOCHS_PER_SYNC_COMMITTEE_PERIOD uint64(2**8) (= 256) epochs ~27 hours

Sync committees were introduced by the Altair upgrade to allow light clients to quickly and trustlessly determine the head of the beacon chain.

Why did we need a new committee type? Couldn't this be built on top of existing committees, say committees 0 to 3 at a slot? After all, voting for the head of the chain is already one of their duties. The reason is that it is important for reducing the load on light clients that sync committees do not change very often. Standard committees change every slot; we need something much longer lived here.

Only a single sync committee is active at any one time, and contains a randomly selected subset of size SYNC_COMMITTEE_SIZE of the total validator set.

A sync committee does its duties (and receives rewards for doing so) for only EPOCHS_PER_SYNC_COMMITTEE_PERIOD epochs until the next committee takes over.

With 500,000 validators, the expected time between being selected for sync committee duty is around 37 months. The probability of being in the current sync committee would be $\frac{512}{500{,}000}$ per validator.
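
For what it's worth, here's a back-of-the-envelope check of those figures, under the same assumption of 500,000 validators and the preset values in the table above.

validator_count = 500_000
p_selected = 512 / validator_count         # probability of being in any given sync committee
period_hours = 256 * 32 * 12 / 3600        # EPOCHS_PER_SYNC_COMMITTEE_PERIOD in hours (~27.3)
expected_wait_months = (1 / p_selected) * period_hours / (24 * 365.25 / 12)
print(f"{p_selected:.4%} per period; expected wait ~{expected_wait_months:.0f} months")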

SYNC_COMMITTEE_SIZE is a trade-off between security (ensuring that enough honest validators are always present) and efficiency for light clients (ensuring that they do not have to handle too much computation). The value 512 is conservative in terms of safety. It would be catastrophic for trustless bridges to other protocols, for example, if a sync committee voted in an invalid block.

EPOCHS_PER_SYNC_COMMITTEE_PERIOD is around a day, and again is a trade-off between security (short enough that it's hard for an attacker to find and corrupt committee members) and efficiency (reducing the data load on light clients).

Execution

Name Value
MAX_BYTES_PER_TRANSACTION uint64(2**30) (= 1,073,741,824)
MAX_TRANSACTIONS_PER_PAYLOAD uint64(2**20) (= 1,048,576)
BYTES_PER_LOGS_BLOOM uint64(2**8) (= 256)
MAX_EXTRA_DATA_BYTES 2**5 (= 32)
MAX_WITHDRAWALS_PER_PAYLOAD uint64(2**4) (= 16)

The first four of these constants were introduced at the Bellatrix pre-Merge upgrade and are used only to size some fields within the ExecutionPayload class.

The execution payload (formerly known as an Eth1 block) contains a list of up to MAX_TRANSACTIONS_PER_PAYLOAD normal Ethereum transactions. Each of these has size up to MAX_BYTES_PER_TRANSACTION. These constants are needed only because SSZ list types require a maximum size to be specified. They are set ludicrously large, but that's not a problem in practice.

BYTES_PER_LOGS_BLOOM and MAX_EXTRA_DATA_BYTES are a direct carry-over from Eth1 blocks as specified in the Yellow Paper, being the size of a block's Bloom filter and the size of a block's extra data field respectively. The execution payload's extra data is analogous to a beacon block's graffiti - the block builder can set it to any value they choose.

MAX_WITHDRAWALS_PER_PAYLOAD was introduced at the Capella upgrade. It is the maximum number of withdrawals that the consensus client will ask the execution client to include in an execution payload. As a consequence, the rate of withdrawals is limited to MAX_WITHDRAWALS_PER_PAYLOAD per slot.

Withdrawals processing

Name Value
MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP 2**14 (= 16,384)

This preset constant was introduced in the Capella upgrade to bound the amount of work each node would need to do when processing withdrawals.

The number of withdrawal transactions per block is bounded at MAX_WITHDRAWALS_PER_PAYLOAD. But not all validators will be eligible for a withdrawal transaction, meaning that nodes might have to search a long way through the validator set to find enough withdrawals to include. Searching the validator set can be an expensive operation, therefore we bound the search, considering only MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP validators per block. If we find fewer withdrawable validators than MAX_WITHDRAWALS_PER_PAYLOAD then we simply make fewer withdrawal transactions.

The primary reason that a validator might not be withdrawable is that it still has an old 0x00 BLS withdrawal credential. At least in the early days of Capella this could have led to long stretches of the validator set that were not withdrawable, before many validators had updated their credentials.

Another reason might be having an effective balance lower than MAX_EFFECTIVE_BALANCE. In the event of an inactivity leak this could also lead to long stretches of validators being non-withdrawable.
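
The following sketch shows the shape of the bounded sweep. The real logic lives in the spec's get_expected_withdrawals() function; this version elides details such as the distinction between full and partial withdrawals.

MAX_WITHDRAWALS_PER_PAYLOAD = 16
MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP = 2**14

def sweep(validators, start_index, is_withdrawable):
    withdrawals = []
    index = start_index
    for _ in range(min(len(validators), MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP)):
        if is_withdrawable(validators[index]):
            withdrawals.append(index)
            if len(withdrawals) == MAX_WITHDRAWALS_PER_PAYLOAD:
                break
        index = (index + 1) % len(validators)
    return withdrawals  # may well contain fewer than MAX_WITHDRAWALS_PER_PAYLOAD entries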

Configuration

Genesis Settings

Beacon chain genesis is long behind us. Nevertheless, the ability to spin up testnets is useful in all sorts of scenarios, so the spec retains genesis functionality, now called initialisation.

The following parameters refer to the actual mainnet beacon chain genesis, and I'll explain them in that context. When starting up new testnets, these will of course be changed. For example, see the configuration file for the Prater testnet.

Name Value
MIN_GENESIS_ACTIVE_VALIDATOR_COUNT uint64(2**14) (= 16,384)
MIN_GENESIS_TIME uint64(1606824000) (Dec 1, 2020, 12pm UTC)
GENESIS_FORK_VERSION Version('0x00000000')
GENESIS_DELAY uint64(604800) (7 days)
MIN_GENESIS_ACTIVE_VALIDATOR_COUNT

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT is the minimum number of full validator stakes that must have been deposited before the beacon chain can start producing blocks. The number is chosen to ensure a degree of security. It allows for four 128-member committees per slot, rather than the 64 committees per slot eventually desired to support the fully operational data shards that were on the roadmap at that time (but no longer). Fewer validators means higher rewards per validator, which was designed to attract early participants and get things bootstrapped.
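
A quick check of the committee arithmetic, assuming SLOTS_PER_EPOCH = 32 and TARGET_COMMITTEE_SIZE = 128 as defined in the presets:

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT = 2**14
assert MIN_GENESIS_ACTIVE_VALIDATOR_COUNT // 32 // 128 == 4  # four committees of 128 per slot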

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT used to be much higher (65,536 = 2 million Ether staked), but was reduced when MIN_GENESIS_TIME, below, was added.

In the actual event of beacon chain genesis, there were 21,063 participating validators, comfortably exceeding the minimum necessary count.

MIN_GENESIS_TIME

MIN_GENESIS_TIME is the earliest date that the beacon chain can start.

Having a MIN_GENESIS_TIME allows us to start the chain with fewer validators than was previously thought necessary. The previous plan was to start the chain as soon as there were MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators staked. But there were concerns that with a lowish initial validator count, a single entity could form the majority of them and then act to prevent other validators from entering (a "gatekeeper attack"). A minimum genesis time allows time for all those who wish to make deposits to do so before they could be excluded by a gatekeeper attack.

The beacon chain actually started at 12:00:23 UTC on the 1st of December 2020. The extra 23 seconds comes from the timestamp of the first Eth1 block to meet the genesis criteria, block 11320899. I like to think of this as a little remnant of proof of work forever embedded in the beacon chain's history.
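
Given the spec's genesis rule (genesis_time is the triggering Eth1 block's timestamp plus GENESIS_DELAY), the timestamp of block 11320899 can be recovered by simple subtraction. The genesis_time value itself appears later, in the BeaconState section.

GENESIS_DELAY = 604800
genesis_time = 1606824023                     # 12:00:23 UTC, 1 December 2020
eth1_trigger_timestamp = genesis_time - GENESIS_DELAY
assert eth1_trigger_timestamp == 1606219223   # 12:00:23 UTC, 24 November 2020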

GENESIS_FORK_VERSION

Unlike Ethereum 1.0, the beacon chain gives in-protocol versions to its forks. See the Version custom type for more explanation.

GENESIS_FORK_VERSION is the fork version the beacon chain starts with at its "genesis" event: the point at which the chain first starts producing blocks. Nowadays, this value is used only when computing the cryptographic domain for deposit messages and BLS credential change messages, which are valid across all forks.

Fork versions and timings for the Altair, Bellatrix, and Capella upgrades are defined in their respective specifications as follows.

Name Value
ALTAIR_FORK_VERSION Version('0x01000000')
ALTAIR_FORK_EPOCH Epoch(74240) (Oct 27, 2021, 10:56:23am UTC)
BELLATRIX_FORK_VERSION Version('0x02000000')
BELLATRIX_FORK_EPOCH Epoch(144896) (Sept 6, 2022, 11:34:47am UTC)
CAPELLA_FORK_VERSION Version('0x03000000')
CAPELLA_FORK_EPOCH Epoch(194048) (April 12, 2023, 10:27:35pm UTC)
GENESIS_DELAY

The GENESIS_DELAY is a grace period to allow nodes and node operators time to prepare for the genesis event. The genesis event cannot occur before MIN_GENESIS_TIME. If MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators are not registered sufficiently in advance of MIN_GENESIS_TIME, then Genesis will occur GENESIS_DELAY seconds after enough validators have been registered.

Seven days' notice was regarded as sufficient to allow client dev teams time to make a release once the genesis parameters were known, and for node operators to upgrade to that release. And, of course, to organise some parties. It was increased from an initial two days due to lessons learned on some of the pre-genesis testnets.

Time parameters

Name Value Unit Duration
SECONDS_PER_SLOT uint64(12) seconds 12 seconds
SECONDS_PER_ETH1_BLOCK uint64(14) seconds 14 seconds
MIN_VALIDATOR_WITHDRAWABILITY_DELAY uint64(2**8) (= 256) epochs ~27 hours
SHARD_COMMITTEE_PERIOD uint64(2**8) (= 256) epochs ~27 hours
ETH1_FOLLOW_DISTANCE uint64(2**11) (= 2,048) Eth1 blocks ~8 hours
SECONDS_PER_SLOT

This was originally six seconds, but is now twelve, and has been other values in between.

Network delays are the main limiting factor in shortening the slot length. Three communication activities need to be accomplished within a slot, and it is supposed that four seconds is enough for the vast majority of nodes to have participated in each:

  1. Blocks are proposed at the start of a slot and should have propagated to most of the network within the first four seconds.
  2. At four seconds into a slot, committee members create and broadcast attestations, including attesting to this slot's block. During the next four seconds, these attestations are collected by aggregators in each committee.
  3. At eight seconds into the slot, the aggregators broadcast their aggregate attestations which then have four seconds to reach the validator who is proposing the next block.

There is a general intention to shorten the slot time in future, perhaps to 8 seconds, if it proves possible to do this in practice. Or perhaps to lengthen it to 16 seconds.

Post-Merge, the time taken by the execution client to validate the execution payload contents (that is, the normal Ethereum transactions) is now on the critical path for validators during step 1, the first four seconds. In order for the validator to attest correctly, the beacon block must first be broadcast, propagated and received, then validated by the consensus client, and also validated by the execution client, all within that initial four-second window. In borderline cases, the extra time taken by execution validation can push the whole process beyond the four-second point at which attestations must be made. This can lead to voting incorrectly for an empty slot. See Adrian Sutton's article Understanding Attestation Misses for further explanation.

SECONDS_PER_ETH1_BLOCK

The assumed block interval on the Eth1 chain, used in conjunction with ETH1_FOLLOW_DISTANCE when considering blocks on the Eth1 chain, either at genesis, or when voting on the deposit contract state.

The average Eth1 block time since January 2020 has actually been nearer 13 seconds, but never mind. The net effect is that we will be going a little deeper back in the Eth1 chain than ETH1_FOLLOW_DISTANCE would suggest, which ought to be safer.

MIN_VALIDATOR_WITHDRAWABILITY_DELAY

A validator can stop participating once it has made it through the exit queue. However, its stake remains locked for the duration of MIN_VALIDATOR_WITHDRAWABILITY_DELAY. This is to allow some time for any slashable behaviour to be detected and reported so that the validator can still be penalised (in which case the validator's withdrawable time is pushed EPOCHS_PER_SLASHINGS_VECTOR into the future).

Once the MIN_VALIDATOR_WITHDRAWABILITY_DELAY period has passed, the validator becomes eligible for a full withdrawal of its stake and rewards on the next withdrawals sweep, as long as it has ETH1_ADDRESS_WITHDRAWAL_PREFIX (0x01) withdrawal credentials set. In any case, being in a "withdrawable" state means that a validator has now fully exited from the protocol.

SHARD_COMMITTEE_PERIOD

This really anticipates the implementation of data shards, which is no longer planned, at least in its originally envisaged form. The idea is that it's bad for the stability of longer-lived committees if validators can appear and disappear very rapidly. Therefore, a validator cannot initiate a voluntary exit until SHARD_COMMITTEE_PERIOD epochs after it has been activated. However, it could still be ejected by slashing before this time.

ETH1_FOLLOW_DISTANCE

This is used to calculate the minimum depth of block on the Ethereum 1 chain that can be considered by the Eth2 chain: it applies to the Genesis process and the processing of deposits by validators. The Eth1 chain depth is estimated by multiplying this value by the target average Eth1 block time, SECONDS_PER_ETH1_BLOCK.

The value of ETH1_FOLLOW_DISTANCE is not based on the expected depth of any reorgs of the Eth1 chain, which are rarely if ever more than 2-3 blocks deep. It is about providing time to respond to an incident on the Eth1 chain such as a consensus failure between clients.

This parameter was increased from 1024 to 2048 blocks for the beacon chain mainnet, to allow devs more time to respond if there were any trouble on the Eth1 chain.

The whole follow distance concept has been made redundant by the Merge and may be removed in a future upgrade, so that validators can make deposits and become active more-or-less instantly.

Validator Cycle

Name Value
EJECTION_BALANCE Gwei(2**4 * 10**9) (= 16,000,000,000)
MIN_PER_EPOCH_CHURN_LIMIT uint64(2**2) (= 4)
CHURN_LIMIT_QUOTIENT uint64(2**16) (= 65,536)
EJECTION_BALANCE

If a validator's effective balance falls to 16 Ether or below then it is exited from the system. This is most likely to happen as a result of the "inactivity leak", which gradually reduces the balances of inactive validators in order to maintain the liveness of the beacon chain.

This mechanism is intended to protect stakers who no longer have access to their keys. If a validator has been offline for long enough to lose half of its balance, it is unlikely to be coming back. To save the staker from losing everything, we choose to eject the validator before its balance reaches zero.

Note that the dependence on effective balance means that the validator is queued for ejection as soon as its actual balance falls below 16.75 Ether.

MIN_PER_EPOCH_CHURN_LIMIT

Validators are allowed to exit the system and cease validating, and new validators may apply to join at any time. For interesting reasons, a design decision was made to apply a rate-limit to entries (activations) and exits. Basically, it is important in proof of stake protocols that the validator set not change too quickly.

In the normal case, a validator is able to exit fairly swiftly: it just needs to wait MAX_SEED_LOOKAHEAD (currently four) epochs. However, if a large number of validators wishes to exit at the same time, a queue forms with a limited number of exits allowed per epoch. The minimum number of exits per epoch (the minimum "churn") is MIN_PER_EPOCH_CHURN_LIMIT, so that validators can always eventually exit. The actual allowed churn per epoch is calculated in conjunction with CHURN_LIMIT_QUOTIENT.

The same applies to new validator activations, once a validator has been marked as eligible for activation.

The rate at which validators can exit is strongly related to the concept of weak subjectivity, and the weak subjectivity period.

CHURN_LIMIT_QUOTIENT

This is used in conjunction with MIN_PER_EPOCH_CHURN_LIMIT to calculate the actual number of validator exits and activations allowed per epoch. The number of exits allowed is max(MIN_PER_EPOCH_CHURN_LIMIT, n // CHURN_LIMIT_QUOTIENT), where n is the number of active validators. The same applies to activations.
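
In code, the calculation mirrors the spec's get_validator_churn_limit() function:

MIN_PER_EPOCH_CHURN_LIMIT = 4
CHURN_LIMIT_QUOTIENT = 2**16

def churn_limit(active_validator_count: int) -> int:
    return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validator_count // CHURN_LIMIT_QUOTIENT)

assert churn_limit(100_000) == 4  # small validator sets fall back to the minimum churn
assert churn_limit(500_000) == 7  # 500,000 // 65,536 = 7 exits (or activations) per epoch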

Inactivity penalties

Name Value Description
INACTIVITY_SCORE_BIAS uint64(2**2) (= 4) score points per inactive epoch
INACTIVITY_SCORE_RECOVERY_RATE uint64(2**4) (= 16) score points per leak-free epoch
INACTIVITY_SCORE_BIAS

If the beacon chain hasn't finalised an epoch for longer than MIN_EPOCHS_TO_INACTIVITY_PENALTY epochs, then it enters "leak" mode. In this mode, any validator that does not vote (or votes for an incorrect target) is penalised an amount each epoch of (effective_balance * inactivity_score) // (INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX). See INACTIVITY_PENALTY_QUOTIENT_BELLATRIX for discussion of the inactivity leak itself.

The per-validator inactivity-score was introduced in the Altair upgrade. During Phase 0, inactivity penalties were an increasing global amount applied to all validators that did not participate in an epoch, regardless of their individual track records of participation. So a validator that was able to participate for a significant fraction of the time nevertheless could be quite severely penalised due to the growth of the per-epoch inactivity penalty. Vitalik gives a simplified example: "if fully [off]line validators get leaked and lose 40% of their balance, someone who has been trying hard to stay online and succeeds at 90% of their duties would still lose 4% of their balance. Arguably this is unfair."

In addition, if many validators are able to participate intermittently, it indicates that whatever event has befallen the chain is potentially recoverable (unlike a permanent network partition, or a super-majority network fork, for example). The inactivity leak is intended to bring finality to irrecoverable situations, so prolonging the time to finality if it's not irrecoverable is likely a good thing.

Each validator has an individual inactivity score in the beacon state which is updated by process_inactivity_updates() as follows.

  • Every epoch, irrespective of the inactivity leak,
    • decrease the score by one when the validator makes a correct timely target vote, and
    • increase the score by INACTIVITY_SCORE_BIAS otherwise.
  • When not in an inactivity leak
    • decrease every validator's score by INACTIVITY_SCORE_RECOVERY_RATE.

There is a floor of zero on the score. So, outside a leak, validators' scores will rapidly return to zero and stay there, since INACTIVITY_SCORE_RECOVERY_RATE is greater than INACTIVITY_SCORE_BIAS.

When in a leak, if $p$ is the participation rate between $0$ and $1$, and $\lambda$ is INACTIVITY_SCORE_BIAS, then the expected score after $N$ epochs is $\max(0, N((1-p)\lambda - p))$. For $\lambda = 4$ this is $\max(0, N(4 - 5p))$. So a validator that is participating 80% of the time or more can maintain a score that is bounded near zero. With less than 80% average participation, its score will increase unboundedly.
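
The following sketch reproduces that back-of-the-envelope model. It is a statistical approximation, not spec logic; p here is an assumed average participation rate.

INACTIVITY_SCORE_BIAS = 4

def expected_score_in_leak(p: float, epochs: int) -> float:
    return max(0.0, epochs * ((1 - p) * INACTIVITY_SCORE_BIAS - p))

assert expected_score_in_leak(0.9, 1000) == 0.0     # 90% participation: the score stays pinned at zero
assert expected_score_in_leak(0.5, 1000) == 1500.0  # 50% participation: the score grows steadily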

INACTIVITY_SCORE_RECOVERY_RATE

When not in an inactivity leak, validators' inactivity scores are reduced by INACTIVITY_SCORE_RECOVERY_RATE + 1 per epoch when they make a timely target vote, and by INACTIVITY_SCORE_RECOVERY_RATE - INACTIVITY_SCORE_BIAS when they don't. So, even for non-performing validators, scores decrease three times faster than they increase.

The new scoring system means that some validators will continue to be penalised due to the leak, even after finalisation starts again. This is intentional. When the leak causes the beacon chain to finalise, at that point we have just 2/3 of the stake online. If we immediately stop the leak (as we used to), then the amount of stake online would remain close to 2/3 and the chain would be vulnerable to flipping in and out of finality as small numbers of validators come and go. We saw this behaviour on some of the testnets prior to launch. Continuing the leak after finalisation serves to increase the balances of participating validators to greater than 2/3, providing a margin that should help to prevent such behaviour.

See the section on the Inactivity Leak for some more analysis of the inactivity score and some graphs of its effect.

Transition settings

Name Value
TERMINAL_TOTAL_DIFFICULTY 58750000000000000000000
TERMINAL_BLOCK_HASH Hash32()
TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH FAR_FUTURE_EPOCH

These values are not used in the main beacon chain specification, but are used in the Bellatrix fork choice and validator guide to determine the point of handover from proof of work to proof of stake for the execution chain.

All previous upgrades to the Ethereum proof of work chain took place at a pre-defined block height. That approach was deemed to be insecure for the Merge due to the irreversible dynamics of the switch to proof of stake. The rationale is given in the Security Considerations section of EIP-3675.

Using a pre-defined block number for the hardfork is unsafe in this context due to the PoS fork choice taking priority during the transition.

An attacker may use a minority of hash power to build a malicious chain fork that would satisfy the block height requirement. Then the first PoS block may be maliciously proposed on top of the PoW block from this adversarial fork, becoming the head and subverting the security of the transition.

To protect the network from this attack scenario, difficulty accumulated by the chain (total difficulty) is used to trigger the upgrade.

Thus, the Bellatrix upgrade defined a terminal total difficulty (TTD) at which the Merge would take place. Each block on the Ethereum proof of work chain has a "difficulty" associated with it, which corresponds to the expected number of hashes it would take to mine it. The total difficulty is the monotonically increasing accumulated difficulty of all the blocks so far.

The first block to exceed TERMINAL_TOTAL_DIFFICULTY was Ethereum block number 15537393. That block became the last canonical block to be produced under proof of work. The next execution payload was included in the beacon chain at slot 4700013, which was produced at 06:42:59 UTC on September the 15th, 2022.

TERMINAL_BLOCK_HASH and TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH are present in case a need arose to manually select a particular proof of work fork to follow in case of trouble. TERMINAL_BLOCK_HASH would have been set in clients, by a manual override or a client update, to point to a specific proof of work block chosen by agreement to be the terminal block. In the event this functionality was not needed.

Containers

Preamble

We are about to see our first Python code in the executable spec. For specification purposes, these Container data structures are just Python data classes that are derived from the base SSZ Container class.

SSZ is the serialisation and Merkleization format used everywhere in Ethereum 2.0. It is not self-describing, so you need to know ahead of time what you are unpacking when deserialising. SSZ deals with basic types and composite types. Classes like the below are handled as SSZ containers, a composite type defined as an "ordered heterogeneous collection of values".

Client implementations in different languages will obviously use their own paradigms to represent these data structures.

Two notes directly from the spec:

  • The definitions are ordered topologically to facilitate execution of the spec.
  • Fields missing in container instantiations default to their zero value.

Misc dependencies

Fork

class Fork(Container):
    previous_version: Version
    current_version: Version
    epoch: Epoch  # Epoch of latest fork

Fork data is stored in the BeaconState to indicate the current and previous fork versions. The fork version gets incorporated into the cryptographic domain in order to invalidate messages from validators on other forks. The previous fork version and the epoch of the change are stored so that pre-fork messages can still be validated (at least until the next fork). This ensures continuity of attestations across fork boundaries.

Note that this is all about planned protocol forks (upgrades), and nothing to do with the fork-choice rule, or inadvertent forks due to errors in the state transition.

ForkData

class ForkData(Container):
    current_version: Version
    genesis_validators_root: Root

ForkData is used only in compute_fork_data_root(). This is used when distinguishing between chains for the purpose of peer-to-peer gossip, and for domain separation. By including both the current fork version and the genesis validators root, we can cleanly distinguish between, say, mainnet and a testnet. Even if they have the same fork history, the genesis validators roots will differ.
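
Here's a small demonstration in the style of the executable spec. The genesis validators roots below are made-up values purely for illustration; the point is that the same fork version on two different chains yields two different fork data roots.

from eth2spec.capella import mainnet
from eth2spec.capella.mainnet import *

capella_version = Version('0x03000000')
chain_a_root = Root('0x' + '11' * 32)
chain_b_root = Root('0x' + '22' * 32)

assert compute_fork_data_root(capella_version, chain_a_root) != compute_fork_data_root(capella_version, chain_b_root)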

Version is the datatype for a fork version number.

Checkpoint

class Checkpoint(Container):
    epoch: Epoch
    root: Root

Checkpoints are the points of justification and finalisation used by the Casper FFG protocol. Validators use them to create AttestationData votes, and the status of recent checkpoints is recorded in BeaconState.

As per the Casper paper, checkpoints contain a height, and a block root. In this implementation of Casper FFG, checkpoints occur whenever the slot number is a multiple of SLOTS_PER_EPOCH, thus they correspond to epoch numbers. In particular, checkpoint $N$ is the first slot of epoch $N$. The genesis block is checkpoint 0, and starts off both justified and finalised.

Thus, the root element here is the block root of the first block in the epoch. (This might be the block root of an earlier block if some slots have been skipped, that is, if there are no blocks for those slots.)

It is very common to talk about justifying and finalising epochs. This is not strictly correct: checkpoints are justified and finalised.

Once a checkpoint has been finalised, the slot it points to and all prior slots will never be reverted.

Validator

class Validator(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32  # Commitment to pubkey for withdrawals
    effective_balance: Gwei  # Balance at stake
    slashed: boolean
    # Status epochs
    activation_eligibility_epoch: Epoch  # When criteria for activation were met
    activation_epoch: Epoch
    exit_epoch: Epoch
    withdrawable_epoch: Epoch  # When validator can withdraw funds

This is the data structure that stores most of the information about an individual validator, with only validators' balances and inactivity scores stored elsewhere.

Validators' actual balances are stored separately in the BeaconState structure, and only the slowly changing "effective balance" is stored here. This is because actual balances are liable to change quite frequently (at least every epoch, and sometimes more frequently): the Merkleization process used to calculate state roots means that only the parts that change need to be recalculated; the roots of unchanged parts can be cached. Separating out the validator balances potentially means that only 1/15th (8/121) as much data needs to be rehashed every epoch compared to storing them here, which is an important optimisation.
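
The 8/121 figure comes from the SSZ serialised sizes: a Validator record occupies 121 bytes, while a Gwei balance is just 8 bytes.

validator_record_bytes = 48 + 32 + 8 + 1 + 4 * 8  # pubkey, credentials, effective balance, slashed flag, four epochs
assert validator_record_bytes == 121
assert 8 / validator_record_bytes < 1 / 15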

For similar reasons, validators' inactivity scores are stored outside validator records as well, as they are also updated every epoch.

A validator's record is created when its deposit is first processed. Sending multiple deposits does not create multiple validator records: deposits with the same public key are aggregated in one record. Validator records never expire; they are stored permanently, even after the validator has exited the system. Thus, there is a 1:1 mapping between a validator's index in the list and the identity of the validator (validator records are only ever appended to the list).

Also stored in Validator:

  • pubkey serves as both the unique identity of the validator and the means of cryptographically verifying messages purporting to have been signed by it. The public key is stored raw, unlike in Eth1, where it is hashed to form the account address. This allows public keys to be aggregated for verifying aggregated attestations.
  • Depending on its prefix, withdrawal_credentials might specify an Eth1 account to which withdrawal transactions will be made, or it might be an old-style BLS commitment that needs to be updated before withdrawals can occur for that validator. The withdrawal credentials included in a validator's deposit data are not checked in any way by the consensus layer.
  • effective_balance is a topic of its own that we've touched upon already, and will discuss more fully when we look at effective balances updates.
  • slashed indicates that a validator has been slashed, that is, punished for violating the slashing conditions. A validator can be slashed only once.
  • The remaining values are the epochs in which the validator changed, or is due to change state.

A detailed explanation of the stages in a validator's lifecycle is here, and we'll be covering it in detail as we work through the beacon chain logic. But, in simplified form, progress is as follows:

  1. A 32 ETH deposit has been made on the Ethereum 1 chain. No validator record exists yet.
  2. The deposit is processed by the beacon chain at some slot. A validator record is created with all epoch fields set to FAR_FUTURE_EPOCH.
  3. At the end of the current epoch, the activation_eligibility_epoch is set to the next epoch.
  4. After the epoch activation_eligibility_epoch has been finalised, the validator is added to the activation queue by setting its activation_epoch appropriately, taking into account the per-epoch churn limit and MAX_SEED_LOOKAHEAD.
  5. On reaching activation_epoch the validator becomes active, and should carry out its duties.
  6. At any time after SHARD_COMMITTEE_PERIOD epochs have passed, a validator may request a voluntary exit. exit_epoch is set according to the validator's position in the exit queue and MAX_SEED_LOOKAHEAD, and withdrawable_epoch is set MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after that.
  7. From exit_epoch onward the validator is no longer active. There is no mechanism for exited validators to rejoin: exiting is permanent.
  8. After withdrawable_epoch, the validator's full stake can be withdrawn.

The above does not account for slashing or forced exits due to low balance.

AttestationData

class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint

The beacon chain relies on a combination of two different consensus mechanisms: LMD GHOST keeps the chain moving, and Casper FFG brings finalisation. These are documented in the Gasper paper. Attestations from (committees of) validators are used to provide votes simultaneously for each of these consensus mechanisms.

This container is the fundamental unit of attestation data. It provides the following elements.

  • slot: each active validator should be making exactly one attestation per epoch. Validators have an assigned slot for their attestation, and it is recorded here for validation purposes.
  • index: there can be several committees active in a single slot. This is the number of the committee that the validator belongs to in that slot. It can be used to reconstruct the committee to check that the attesting validator is a member. Ideally, all (or the majority at least) of the attestations in a slot from a single committee will be identical, and can therefore be aggregated into a single aggregate attestation.
  • beacon_block_root is the validator's vote on the head block for that slot after locally running the LMD GHOST fork-choice rule. It may be the root of a block from a previous slot if the validator believes that the current slot is empty.
  • source is the validator's opinion of the best currently justified checkpoint for the Casper FFG finalisation process.
  • target is the validator's opinion of the block at the start of the current epoch, also for Casper FFG finalisation.

This AttestationData structure gets wrapped up into several other similar but distinct structures:

  • Attestation is the form in which attestations normally make their way around the network. It is signed and aggregatable, and the list of validators making this attestation is compressed into a bitlist.
  • IndexedAttestation is used primarily for attester slashing. It is signed and aggregated, with the list of attesting validators being an uncompressed list of indices.
  • PendingAttestation. In Phase 0, after having their validity checked during block processing, PendingAttestations were stored in the beacon state pending processing at the end of the epoch. This was reworked in the Altair upgrade and PendingAttestations are no longer used.

IndexedAttestation

class IndexedAttestation(Container):
    attesting_indices: List[ValidatorIndex, MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is one of the forms in which aggregated attestations – combined identical attestations from multiple validators in the same committee – are handled.

Attestations and IndexedAttestations contain essentially the same information. The difference being that the list of attesting validators is stored uncompressed in IndexedAttestations. That is, each attesting validator is referenced by its global validator index, and non-attesting validators are not included. To be valid, the validator indices must be unique and sorted, and the signature must be an aggregate signature from exactly the listed set of validators.

IndexedAttestations are primarily used when reporting attester slashing. An Attestation can be converted to an IndexedAttestation using get_indexed_attestation().

PendingAttestation

class PendingAttestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    inclusion_delay: Slot
    proposer_index: ValidatorIndex

PendingAttestations were removed in the Altair upgrade and now appear only in the process for upgrading the state during the fork. The following is provided for historical reference.

Prior to Altair, Attestations received in blocks were verified then temporarily stored in beacon state in the form of PendingAttestations, pending further processing at the end of the epoch.

A PendingAttestation is an Attestation minus the signature, plus a couple of fields related to reward calculation:

  • inclusion_delay is the number of slots between the attestation having been made and it being included in a beacon block by the block proposer. Validators are rewarded for getting their attestations included in blocks, but the reward used to decline in inverse proportion to the inclusion delay. This incentivised swift attesting and communicating by validators.
  • proposer_index is the block proposer that included the attestation. The block proposer gets a micro reward for every validator's attestation it includes, not just for the aggregate attestation as a whole. This incentivises efficient finding and packing of aggregations, since the number of aggregate attestations per block is capped.

Taken together, these rewards are designed to incentivise the whole network to collaborate to do efficient attestation aggregation (proposers want to include only well-aggregated attestations; validators want to get their attestations included, so will ensure that they get well aggregated).

This whole mechanism was replaced in the Altair upgrade by ParticipationFlags.

Eth1Data

class Eth1Data(Container):
    deposit_root: Root
    deposit_count: uint64
    block_hash: Hash32

Proposers include their view of the Ethereum 1 chain in blocks, and this is how they do it. The beacon chain stores these votes in the beacon state until there is a simple majority consensus, then the winner is committed to the beacon state. This is to allow the processing of Eth1 deposits, and creates a simple "honest-majority" one-way bridge from Eth1 to Eth2. The 1/2 majority assumption for this (rather than 2/3 for committees) is considered safe as the number of proposers voting over each period is large: EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH = 64 * 32 = 2048. A sketch of the majority condition follows the field descriptions below.

  • deposit_root is the result of the get_deposit_root() method of the Eth1 deposit contract after executing the Eth1 block being voted on - it's the root of the (incremental) Merkle tree of deposits.
  • deposit_count is the number of deposits in the deposit contract at that point, the result of the get_deposit_count method on the contract. This will be equal to or greater than (if there are pending unprocessed deposits) the value of state.eth1_deposit_index.
  • block_hash is the hash of the Eth1 block being voted for. This doesn't have any current use within the Eth2 protocol, but is "too potentially useful to not throw in there", to quote Danny Ryan.
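
As mentioned above, the majority condition itself is simple. The following mirrors the check in the spec's process_eth1_data() function: a vote is adopted once it has been cast in more than half of the slots of the voting period.

SLOTS_PER_VOTING_PERIOD = 64 * 32  # EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH

def eth1_vote_wins(matching_votes: int) -> bool:
    return matching_votes * 2 > SLOTS_PER_VOTING_PERIOD

assert eth1_vote_wins(1025) and not eth1_vote_wins(1024)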

HistoricalBatch

class HistoricalBatch(Container):
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]

The HistoricalBatch container has been superseded by HistoricalSummary in the Capella upgrade. It remains in the spec since the historical_roots list remains in the BeaconState, albeit now frozen forever.

HistoricalBatch is no longer used anywhere in the state transition. However, applications validating pre-Capella data against the historical_roots list will need to use it.

See process_historical_summaries_update() for more discussion of this change.

DepositMessage

class DepositMessage(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei

The basic information necessary to either add a validator to the registry, or to top up an existing validator's stake.

pubkey is the unique public key of the validator. If it is already present in the registry (the list of validators in beacon state) then amount is added to its balance. Otherwise, a new Validator entry is appended to the list and credited with amount.

See the Validator container for more on withdrawal_credentials.

There are two protections that DepositMessages get at different points.

  1. DepositData is included in beacon blocks as a Deposit, which adds a Merkle proof that the data has been registered with the Eth1 deposit contract.
  2. When the containing beacon block is processed, the BLS signature in the DepositData is verified against the reconstructed DepositMessage before any new validator is added to the registry. This protects the authenticity of the deposit, since the Eth1 deposit contract is unable to check BLS signatures itself.

DepositData

class DepositData(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei
    signature: BLSSignature  # Signing over DepositMessage

A signed DepositMessage. The comment says that the signing is done over DepositMessage. What actually happens is that a DepositMessage is constructed from the first three fields; the root of that is combined with DOMAIN_DEPOSIT in a SigningData object; finally the root of this is signed and included in DepositData.
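
The process looks something like the following sketch, using the executable spec's helpers. The key and credentials here are dummy values for illustration only.

from eth2spec.capella import mainnet
from eth2spec.capella.mainnet import *

deposit_message = DepositMessage(
    pubkey=BLSPubkey(b'\x11' * 48),
    withdrawal_credentials=Bytes32(b'\x22' * 32),
    amount=Gwei(32 * 10**9),
)

# Deposits use a fork-agnostic domain based on the genesis fork version.
domain = compute_domain(DOMAIN_DEPOSIT)

# This is the root that the depositor's BLS key actually signs; the resulting
# signature is what goes into DepositData.signature.
signing_root = compute_signing_root(deposit_message, domain)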

BeaconBlockHeader

class BeaconBlockHeader(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body_root: Root

A standalone version of a beacon block header: BeaconBlocks contain their own header. It is identical to BeaconBlock, except that body is replaced by body_root. It is BeaconBlock-lite.

BeaconBlockHeader is stored in beacon state to record the last processed block header. This is used to ensure that we proceed along a continuous chain of blocks that always point to their predecessor6. See process_block_header().

The signed version is used in proposer slashings.

SyncCommittee

class SyncCommittee(Container):
    pubkeys: Vector[BLSPubkey, SYNC_COMMITTEE_SIZE]
    aggregate_pubkey: BLSPubkey

Sync committees were introduced in the Altair upgrade to support light clients to the beacon chain protocol. The list of committee members for each of the current and next sync committees is stored in the beacon state. Members are updated every EPOCHS_PER_SYNC_COMMITTEE_PERIOD epochs by get_next_sync_committee().

Including the aggregate_pubkey of the sync committee is an optimisation intended to save light clients some work when verifying the sync committee's signature. All the public keys of the committee members (including any duplicates) are aggregated into this single public key. If any signatures are missing from the SyncAggregate, the light client can "de-aggregate" the missing members by performing elliptic curve subtraction on the aggregate public key. As long as more than half of the committee contributed to the signature, this will be faster than constructing the aggregate of participating members from scratch. If fewer than half contributed to the signature, the light client can instead start with the identity public key and use elliptic curve addition to aggregate those public keys that are present.

See also SYNC_COMMITTEE_SIZE.

SigningData

class SigningData(Container):
    object_root: Root
    domain: Domain

This is just a convenience container, used only in compute_signing_root() to calculate the hash tree root of an object along with a domain. The resulting root is the message data that gets signed with a BLS signature. The SigningData object itself is never stored or transmitted.

Withdrawal

class Withdrawal(Container):
    index: WithdrawalIndex
    validator_index: ValidatorIndex
    address: ExecutionAddress
    amount: Gwei

A container for handling validator balance withdrawals from the consensus layer to the execution layer. The index is a simple count of the total number of withdrawal transactions made since withdrawals were enabled in the Capella upgrade.

As per the type definition of the amount field, the consensus layer denominates withdrawals in Gwei (as it does all Ether amounts), while the execution layer denominates withdrawals in Wei (as it does all Ether amounts). Care needs to be taken when dealing with withdrawal transactions not to end up a factor of $10^9$ wrong.
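
A tiny example, to belabour the point:

amount_gwei = 32 * 10**9          # a full withdrawal of 32 ETH as the consensus layer sees it
amount_wei = amount_gwei * 10**9  # the same amount as the execution layer credits it
assert amount_wei == 32 * 10**18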

HistoricalSummary

class HistoricalSummary(Container):
    """
    `HistoricalSummary` matches the components of the phase0 `HistoricalBatch`
    making the two hash_tree_root-compatible.
    """
    block_summary_root: Root
    state_summary_root: Root

This is part of the double batched accumulator mechanism implemented by process_historical_summaries_update(). It was introduced in the Capella upgrade and supersedes HistoricalBatch as the structure for storing roots of historical data.

The comment here is interesting. It reflects the invariant that the SSZ hash tree root of a container of objects is the same as the hash tree root of a container of the objects' hash tree roots - what I call the magic of Merkleization.

The following code demonstrates this equivalence between the pre- and post-Capella constructions. It should run with no errors.

from eth2spec.capella import mainnet
from eth2spec.capella.mainnet import *
from eth2spec.utils.ssz.ssz_typing import *

# Dummy data
roots = [Root('0x0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef')] * SLOTS_PER_HISTORICAL_ROOT
block_roots = state_roots = Vector[Root, SLOTS_PER_HISTORICAL_ROOT](*roots)

# Pre-Capella
historical_batch = HistoricalBatch(
    block_roots = block_roots,
    state_roots = state_roots)

# Post-Capella
historical_summary = HistoricalSummary(
    block_summary_root = hash_tree_root(block_roots),
    state_summary_root = hash_tree_root(state_roots))

assert(hash_tree_root(historical_batch) == hash_tree_root(historical_summary))

Beacon operations

The following are the various protocol messages that can be transmitted in a block on the beacon chain.

For most of these, the proposer is rewarded either explicitly or implicitly for including the object in a block.

The proposer receives explicit in-protocol rewards for including the following in blocks:

  • ProposerSlashings,
  • AttesterSlashings,
  • Attestations, and
  • SyncAggregates.

Including Deposit objects in blocks is only implicitly rewarded, in that, if there are pending deposits that the block proposer does not include then the block is invalid, so the proposer receives no reward.

There is no direct reward for including VoluntaryExit objects. However, for each validator exited, rewards for the remaining validators increase very slightly, so it's still beneficial for proposers not to ignore VoluntaryExits.

ProposerSlashing

class ProposerSlashing(Container):
    signed_header_1: SignedBeaconBlockHeader
    signed_header_2: SignedBeaconBlockHeader

ProposerSlashings may be included in blocks to prove that a validator has broken the rules and ought to be slashed. Proposers receive a reward for correctly submitting these.

In this case, the rule is that a validator may not propose two different blocks at the same height, and the payload is the signed headers of the two blocks that evidence the crime. The signatures on the SignedBeaconBlockHeaders are checked to verify that they were both signed by the accused validator.

AttesterSlashing

class AttesterSlashing(Container):
    attestation_1: IndexedAttestation
    attestation_2: IndexedAttestation

AttesterSlashings may be included in blocks to prove that one or more validators in a committee has broken the rules and ought to be slashed. Proposers receive a reward for correctly submitting these.

The contents of the IndexedAttestations are checked against the attester slashing conditions in is_slashable_attestation_data(). If there is a violation, then any validator that attested to both attestation_1 and attestation_2 is slashed, see process_attester_slashing().

AttesterSlashings can be very large since they could in principle list the indices of all the validators in a committee. However, in contrast to proposer slashings, many validators can be slashed as a result of a single report.

Attestation

class Attestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is the form in which attestations make their way around the network. It is designed to be easily aggregatable: Attestations containing identical AttestationData can be combined into a single attestation by aggregating the signatures.

Attestations contain the same information as IndexedAttestations, but use knowledge of the validator committees at slots to compress the list of attesting validators down to a bitlist. Thus, Attestations are at least 5 times smaller than IndexedAttestations, and up to 35 times smaller (with 128 or 2048 validators per committee, respectively).

When a validator first broadcasts its attestation to the network, the aggregation_bits list will contain only a single bit set, and calling get_attesting_indices() on it will return a list containing only a single entry, the validator's own index.

Deposit

class Deposit(Container):
    proof: Vector[Bytes32, DEPOSIT_CONTRACT_TREE_DEPTH + 1]  # Merkle path to deposit root
    data: DepositData

This container is used to include deposit data from prospective validators in beacon blocks so that they can be processed into beacon state.

The proof is a Merkle proof constructed by the block proposer that the DepositData corresponds to the previously agreed deposit root of the Eth1 contract's deposit tree. It is verified in process_deposit() by is_valid_merkle_branch().

VoluntaryExit

class VoluntaryExit(Container):
    epoch: Epoch  # Earliest epoch when voluntary exit can be processed
    validator_index: ValidatorIndex

Voluntary exit messages are how a validator signals that it wants to cease being a validator. Blocks containing VoluntaryExit data for an epoch later than the current epoch are invalid, so nodes should buffer or ignore any future-dated exits they see.

VoluntaryExit objects are never used naked; they are always wrapped up into a SignedVoluntaryExit object.

SyncAggregate

class SyncAggregate(Container):
    sync_committee_bits: Bitvector[SYNC_COMMITTEE_SIZE]
    sync_committee_signature: BLSSignature

The prevailing sync committee is stored in the beacon state, so the SyncAggregates included in blocks need only use a bit vector to indicate which committee members signed off on the message.

The sync_committee_signature is the aggregate signature of all the validators referenced in the bit vector over the block root of the previous slot.

SyncAggregates are handled by process_sync_aggregate().

BLSToExecutionChange

class BLSToExecutionChange(Container):
    validator_index: ValidatorIndex
    from_bls_pubkey: BLSPubkey
    to_execution_address: ExecutionAddress

The Capella upgrade gives validators that have old style BLS withdrawal credentials a one-time opportunity to update them to Eth1 withdrawal credentials.

To make this change, a staker needs to broadcast a signed message containing this information via a consensus node. It will eventually be included in a block at which point nodes will verify it and update their validator registries. The from_bls_pubkey is verified against the validator's existing withdrawal credential, and the message's signature is verified against the from_bls_pubkey.
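
The credential check amounts to the following, a simplified sketch of the logic in the spec's process_bls_to_execution_change() (the spec's hash() function is SHA-256).

from hashlib import sha256

def old_credentials_match(withdrawal_credentials: bytes, from_bls_pubkey: bytes) -> bool:
    return (withdrawal_credentials[:1] == b'\x00'  # BLS_WITHDRAWAL_PREFIX
            and withdrawal_credentials[1:] == sha256(from_bls_pubkey).digest()[1:])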

Beacon blocks

BeaconBlockBody

class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]
    sync_aggregate: SyncAggregate  # [New in Altair]
    # Execution
    execution_payload: ExecutionPayload  # [New in Bellatrix]
    # Capella operations
    bls_to_execution_changes: List[SignedBLSToExecutionChange, MAX_BLS_TO_EXECUTION_CHANGES]  # [New in Capella]

The two fundamental data structures for nodes are the BeaconBlock and the BeaconState. The BeaconBlock is how the leader (the chosen proposer in a slot) communicates network updates to all the other validators, and those validators update their own BeaconState by applying BeaconBlocks. The idea is that (eventually) all validators on the network come to agree on the same BeaconState.

Validators are randomly selected to propose beacon blocks, and there ought to be exactly one beacon block per slot if things are running correctly. If a validator is offline, or misses its slot, or proposes an invalid block, or has its block orphaned, then a slot can be empty.

The following objects are always present in a valid beacon block.

  • randao_reveal: the block is invalid if the RANDAO reveal does not verify correctly against the proposer's public key. This is the block proposer's contribution to the beacon chain's randomness. The proposer generates it by signing the current epoch number (combined with DOMAIN_RANDAO) with its private key. To the best of anyone's knowledge, the result is indistinguishable from random. This gets mixed into the beacon state RANDAO.
  • See Eth1Data for eth1_data. In principle, this is mandatory, but it is not checked, and there is no penalty for making it up.
  • graffiti is left free for the proposer to insert whatever data it wishes. It has no protocol level significance. It can be left as zero; most clients set the client name and version string as their own default graffiti value.
  • sync_aggregate is a record of which validators in the current sync committee voted for the chain head in the previous slot.
  • execution_payload is what was known as an Eth1 block pre-Merge. Ethereum transactions are now included within beacon blocks in the form of an ExecutionPayload structure.

Deposits are a special case. They are mandatory only if there are pending deposits to be processed. There is no explicit reward for including deposits, except that a block is invalid without any that ought to be there.

  • deposits: if the block does not contain either all the outstanding Deposits, or MAX_DEPOSITS of them in deposit order, then it is invalid.

Including any of the remaining objects is optional. They are handled, if present, in the process_operations() function.

The proposer earns rewards for including any of the following. Rewards for attestations and sync aggregates are available every slot. Slashings, however, are very rare.

  • proposer_slashings: up to MAX_PROPOSER_SLASHINGS ProposerSlashing objects may be included.
  • attester_slashings: up to MAX_ATTESTER_SLASHINGS AttesterSlashing objects may be included.
  • attestations: up to MAX_ATTESTATIONS (aggregated) Attestation objects may be included. The block proposer is incentivised to include well-packed aggregate attestations, as it receives a micro reward for each unique attestation. In a perfect world, with perfectly aggregated attestations, MAX_ATTESTATIONS would be equal to MAX_COMMITTEES_PER_SLOT; in our configuration it is double. This provides capacity in blocks to catch up with attestations after skip slots, and also room to include some imperfectly aggregated attestations.

Including voluntary exits and BLS to execution changes is optional, and there are no explicit rewards for doing so.

BeaconBlock

class BeaconBlock(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body: BeaconBlockBody

BeaconBlock just adds some blockchain paraphernalia to BeaconBlockBody. It is identical to BeaconBlockHeader, except that the body_root is replaced by the actual block body.

slot is the slot the block is proposed for.

proposer_index was added to avoid a potential DoS vector, and to allow clients without full access to the state to still know useful things.

parent_root is used to make sure that this block is a direct child of the last block we processed.

In order to calculate state_root, the proposer is expected to run the state transition with the block before propagating it. After the beacon node has processed the block, the state roots are compared to ensure they match. This is the mechanism for tying the whole system together and making sure that all validators and beacon nodes are always working off the same version of state (in the absence of short-term forks).

If any of these is incorrect, then the block is invalid with respect to the current beacon state and will be ignored.

Beacon state

BeaconState

class BeaconState(Container):
    # Versioning
    genesis_time: uint64
    genesis_validators_root: Root
    slot: Slot
    fork: Fork
    # History
    latest_block_header: BeaconBlockHeader
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]  # Frozen in Capella, replaced by historical_summaries
    # Eth1
    eth1_data: Eth1Data
    eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
    eth1_deposit_index: uint64
    # Registry
    validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
    balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]
    # Randomness
    randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]
    # Slashings
    slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]  # Per-epoch sums of slashed effective balances
    # Participation
    previous_epoch_participation: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT]  # [Modified in Altair]
    current_epoch_participation: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT]  # [Modified in Altair]
    # Finality
    justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]  # Bit set for every recent justified epoch
    previous_justified_checkpoint: Checkpoint
    current_justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint
    # Inactivity
    inactivity_scores: List[uint64, VALIDATOR_REGISTRY_LIMIT]  # [New in Altair]
    # Sync
    current_sync_committee: SyncCommittee  # [New in Altair]
    next_sync_committee: SyncCommittee  # [New in Altair]
    # Execution
    latest_execution_payload_header: ExecutionPayloadHeader  # [New in Bellatrix]
    # Withdrawals
    next_withdrawal_index: WithdrawalIndex  # [New in Capella]
    next_withdrawal_validator_index: ValidatorIndex  # [New in Capella]
    # Deep history valid from Capella onwards
    historical_summaries: List[HistoricalSummary, HISTORICAL_ROOTS_LIMIT]  # [New in Capella]

All roads lead to the BeaconState. Maintaining this data structure is the sole purpose of all the apparatus in all the spec documents. This state is the focus of consensus among the beacon nodes; it is what everybody, eventually, must agree on.

The beacon chain's state is monolithic: everything is bundled into a single state object (sometimes referred to as the "God object"). Some have argued for more granular approaches that might be more efficient, but at least the current approach is simple.

Let's break this thing down.

    # Versioning
    genesis_time: uint64
    genesis_validators_root: Root
    slot: Slot
    fork: Fork

How do we know which chain we're on, and where we are on it? The information here ought to be sufficient. A continuous path back to the genesis block would also suffice.

genesis_validators_root is calculated at Genesis time (when the chain starts) and is fixed for the life of the chain. This, combined with the fork identifier, should serve to uniquely identify the chain that we are on.

genesis_time is used by the fork choice rule to work out what slot we're in, and (since The Merge) to validate execution payloads.

The values of these two fields are fixed for the life of the chain. For the mainnet beacon chain they have the following values:

genesis_time 1606824023
genesis_validators_root 0x4b363db94e286120d76eb905340fdd4e54bfe9f06bf33ff6cf5ad27f511bfe95

The fork object is manually updated as part of beacon chain upgrades, also called hard forks. This invalidates blocks and attestations from validators not following the new fork.

Since the Capella fork, the fork field has contained the following values:

previous_version 0x02000000
current_version 0x03000000
epoch 194048

Historical info on fork versions and upgrade timing is in the Upgrade History chapter.

    # History
    latest_block_header: BeaconBlockHeader
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]  # Frozen in Capella, replaced by historical_summaries

latest_block_header is only used to make sure that the next block we process is a direct descendent of the previous block. It's a blockchain thing.

Past block_roots and state_roots are stored in the lists here until the lists are full. Before the Capella upgrade, once the lists were full, they were Merkleized together and the root appended to historical_roots. Since Capella, they are now Merkleized separately and appended to historical_summaries (see below). The historical_roots list is now frozen and continues to exist only to allow proofs to be made against pre-Capella blocks and states.

    # Eth1
    eth1_data: Eth1Data
    eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
    eth1_deposit_index: uint64

eth1_data is the latest agreed upon state of the Eth1 chain and deposit contract. eth1_data_votes accumulates Eth1Data from blocks until there is an overall majority in favour of one Eth1 state. If a majority is not achieved by the time the list is full then it is cleared down and voting starts again from scratch. eth1_deposit_index is the total number of deposits that have been processed by the beacon chain (which is greater than or equal to the number of validators, as a deposit can top up the balance of an existing validator).
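
The following sketch shows roughly how the voting works during block processing. It is modelled on the spec's process_eth1_data() function (covered later); the _sketch name is mine, so treat this as illustrative rather than definitive.

def process_eth1_data_sketch(state: BeaconState, body: BeaconBlockBody) -> None:
    # Accumulate this block's vote.
    state.eth1_data_votes.append(body.eth1_data)
    # A strict majority of the slots in the voting period wins and updates the state.
    if state.eth1_data_votes.count(body.eth1_data) * 2 > EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH:
        state.eth1_data = body.eth1_data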

    # Registry
    validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
    balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]

The registry of Validators and their balances. The balances list is separated out as it changes much more frequently than the validators list. Roughly speaking, balances of active validators are updated at least once per epoch, while the validators list has only minor updates per epoch. When combined with SSZ tree hashing, this results in a big saving in the amount of data to be rehashed on registry updates. See also validator inactivity scores under Inactivity which we treat similarly.

    # Randomness
    randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]

Past randao mixes are stored in a fixed-size circular list for EPOCHS_PER_HISTORICAL_VECTOR epochs (~290 days). These can be used to recalculate past committees, which allows slashing of historical attestations. See EPOCHS_PER_HISTORICAL_VECTOR for more information.
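
Indexing into the circular list is done modulo its length, essentially as the spec's get_randao_mix() helper (covered later) does. The randao_mix_at name here is just for illustration.

def randao_mix_at(state: BeaconState, epoch: Epoch) -> Bytes32:
    # The mix for an epoch lives at the epoch number modulo the vector length,
    # so older entries are overwritten as the chain progresses.
    return state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR]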

    # Slashings
    slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]

A fixed-size circular list of past slashed amounts. Each epoch, the total effective balance of all validators slashed in that epoch is stored as an entry in this list. When the final slashing penalty for a slashed validator is calculated, it is weighted with the sum of this list. This mechanism is designed to less heavily penalise one-off slashings that are most likely accidental, and more heavily penalise mass slashings during a window of time, which are more likely to be a coordinated attack.

    # Participation
    previous_epoch_participation: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT]  # [Modified in Altair]
    current_epoch_participation: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT]  # [Modified in Altair]

These lists record which validators participated in attesting during the current and previous epochs by recording flags for timely votes for the correct source, the correct target and the correct head. We store two epochs' worth since Validators have up to 32 slots to include a correct target vote. The flags are used to calculate finality and to assign rewards at the end of epochs.
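
The flags are packed into a single byte per validator. As a sketch of how they are read, using the Altair flag indices (voted_timely_target() is a hypothetical helper, not spec code; the spec uses its has_flag() helper for this):

TIMELY_TARGET_FLAG_INDEX = 1  # Altair flag indices: 0 = source, 1 = target, 2 = head

def voted_timely_target(state: BeaconState, index: ValidatorIndex) -> bool:
    # Check whether the given validator made a timely target vote in the previous epoch.
    flags = state.previous_epoch_participation[index]
    return flags & (1 << TIMELY_TARGET_FLAG_INDEX) != 0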

Previously, during Phase 0, we stored two epochs' worth of actual attestations in the state and processed them en masse at the end of epochs. This was slow, and was thought to be contributing to observed late block production in the first slots of epochs. The change to the new scheme was implemented in the Altair upgrade under the title of Accounting Reforms.

    # Finality
    justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]
    previous_justified_checkpoint: Checkpoint
    current_justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint

Ethereum 2.0 uses the Casper FFG finality mechanism, with a k-finality optimisation, where k = 2. The above objects in the state are the data that need to be tracked in order to apply the finality rules.

  • justification_bits is only four bits long. It tracks the justification status of the last four epochs: 1 if justified, 0 if not. This is used when calculating whether we can finalise an epoch.
  • Outside the finality calculations, previous_justified_checkpoint and current_justified_checkpoint are used to filter attestations: valid blocks include only attestations with a source checkpoint that matches the justified checkpoint in the state for the attestation's epoch.
  • finalized_checkpoint: the network has agreed that the beacon chain state at or before that epoch will never be reverted. So, for one thing, the fork choice rule doesn't need to go back any further than this. The Casper FFG mechanism is specifically constructed so that two conflicting finalized checkpoints cannot be created without at least one third of validators being slashed.

    # Inactivity
    inactivity_scores: List[uint64, VALIDATOR_REGISTRY_LIMIT]  # [New in Altair]

This is logically part of "Registry", above, and would be better placed there. It is a per-validator record of inactivity scores that is updated every epoch. This list is stored outside the main list of Validator objects since it is updated very frequently. See the Registry for more explanation.

    # Sync
    current_sync_committee: SyncCommittee  # [New in Altair]
    next_sync_committee: SyncCommittee  # [New in Altair]

Sync committees were introduced in the Altair upgrade. The next sync committee is calculated and stored so that participating validators can prepare in advance by subscribing to the required p2p subnets.

    # Execution
    latest_execution_payload_header: ExecutionPayloadHeader  # [New in Bellatrix]

Since the Merge, the header of the most recent execution payload is cached in the beacon state. This serves two functions for now, though possibly more in future. First, it allows the chain to check whether the Merge has been completed or not. See is_merge_transition_complete(). Second, it allows the beacon chain to check that the execution chain is unbroken when processing a new execution payload. See process_execution_payload().

    # Withdrawals
    next_withdrawal_index: WithdrawalIndex  # [New in Capella]
    next_withdrawal_validator_index: ValidatorIndex  # [New in Capella]

Automatic validator balance withdrawals were added in the Capella upgrade. The next_withdrawal_index maintains a count of the total number of withdrawal transactions performed so far, while next_withdrawal_validator_index cycles through the validator registry to keep track of which validator should be considered for a withdrawal next. Validators are considered for withdrawals consecutively in order of their validator indices, and the withdrawals sweep wraps round to zero after considering the highest-numbered validator. A maximum of MAX_WITHDRAWALS_PER_PAYLOAD withdrawals may be made per block.
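
A minimal sketch of the sweep's wrap-around behaviour, in the spirit of the spec's get_expected_withdrawals() (covered later) rather than its exact code; sweep_order() is a hypothetical helper.

def sweep_order(state: BeaconState, sweep_length: int) -> Sequence[ValidatorIndex]:
    # Return the next few validator indices to be considered for withdrawals, starting
    # from the saved index and wrapping round to zero at the end of the registry.
    start = state.next_withdrawal_validator_index
    total = len(state.validators)
    return [ValidatorIndex((start + i) % total) for i in range(sweep_length)]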

    # Deep history valid from Capella onwards
    historical_summaries: List[HistoricalSummary, HISTORICAL_ROOTS_LIMIT]  # [New in Capella]

Hash tree roots of state.block_roots and state.state_roots are periodically added to historical_summaries every SLOTS_PER_HISTORICAL_ROOT slots as part of the protocol's double batched accumulator. The work is done by process_historical_summaries_update().

The state.historical_summaries list was introduced in the Capella upgrade and functionally replaces the state.historical_roots list that's now frozen (see above). It uses the HistoricalSummary container, which is twice as big as a Root type (64 bytes per item rather than 32). The list will effectively grow without bound (HISTORICAL_ROOTS_LIMIT is large), but at a rate of only 20 KB per year. Keeping this data is useful for light clients, and also allows Merkle proofs to be created against past states, for example historical deposit data.
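
The arithmetic behind the "20 KB per year" figure, using values from the current configuration, looks like this:

SECONDS_PER_SLOT = 12
SLOTS_PER_HISTORICAL_ROOT = 8192
summary_size = 64  # bytes: a HistoricalSummary holds two 32-byte roots

summaries_per_year = 365.25 * 24 * 3600 / (SECONDS_PER_SLOT * SLOTS_PER_HISTORICAL_ROOT)
print(summaries_per_year * summary_size)  # about 20,500 bytes, roughly 20 KB per year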

Historical Note

There was a period during which beacon state was split into "crystallized state" and "active state". The active state was constantly changing; the crystallized state changed only once per epoch (or what passed for epochs back then). Separating out the fast-changing state from the slower-changing state was an attempt to avoid having to constantly rehash the whole state every slot. With the introduction of SSZ tree hashing, this was no longer necessary, as the roots of the slower changing parts could simply be cached, which was a nice simplification. There remains an echo of this approach, however, in the splitting out of validator balances and inactivity scores into different structures within the beacon state.

Execution

ExecutionPayload

class ExecutionPayload(Container):
    # Execution block header fields
    parent_hash: Hash32
    fee_recipient: ExecutionAddress  # 'beneficiary' in the yellow paper
    state_root: Bytes32
    receipts_root: Bytes32
    logs_bloom: ByteVector[BYTES_PER_LOGS_BLOOM]
    prev_randao: Bytes32  # 'difficulty' in the yellow paper
    block_number: uint64  # 'number' in the yellow paper
    gas_limit: uint64
    gas_used: uint64
    timestamp: uint64
    extra_data: ByteList[MAX_EXTRA_DATA_BYTES]
    base_fee_per_gas: uint256
    # Extra payload fields
    block_hash: Hash32  # Hash of execution block
    transactions: List[Transaction, MAX_TRANSACTIONS_PER_PAYLOAD]
    withdrawals: List[Withdrawal, MAX_WITHDRAWALS_PER_PAYLOAD]  # [New in Capella]

Since the Merge, blocks on the beacon chain contain Ethereum transaction data in the form of execution payloads. These correspond to what were formerly known as Eth1 blocks.

This is a significant change, and is what led to the name "The Merge".

  • Pre-Merge, there were two types of block in the Ethereum system:
    • Eth1 blocks contained users' transactions and were gossiped between Eth1 nodes;
    • Eth2 blocks (beacon blocks) contained only consensus information and were gossiped between Eth2 nodes.
  • Post-Merge, there is only one kind of block, the merged beacon block:
    • Beacon blocks contain execution payloads that in turn contain users' transactions. These blocks are gossiped only between consensus (Eth2) nodes.

The ExecutionPayload is contained in the BeaconBlock structure.

The fields of ExecutionPayload mostly reflect the old structure of Eth1 blocks as described in Ethereum's Yellow Paper7, section 4.3. Differences from the Eth1 block structure are noted in the comments.

The execution payload differs from an old Eth1 block in the following respects:

  • ommersHash (also known as uncle_hashes), difficulty, mixHash and nonce were not carried over from Eth1 blocks as they were specific to the proof of work mechanism.
  • fee_recipient is the Ethereum account address that will receive the unburnt portion of the transaction fees (the priority fees). This has been called various things at various times: the original Yellow Paper calls it beneficiary; EIP-1559 calls it author. In any case, the proposer of the block sets the fee_recipient to specify where the appropriate transaction fees for the block are to be sent. Under proof of work this was the same address as the COINBASE address that received the block reward. Under proof of stake, the block reward is credited to the validator's beacon chain balance, and the transaction fees are credited to the fee_recipient Ethereum address.
  • prev_randao replaces difficulty. The Eth1 chain did not have access to good quality randomness. Sometimes the block hash or difficulty of the block were used to seed randomness, but these were low quality. The prev_randao field gives the execution layer access to the beacon chain's randomness. This is better, but still not of cryptographic quality.
  • block_number in the execution layer is the block height in that chain, picking up from the Eth1 block height at the Merge. It increments by one for every beacon block produced. The beacon chain itself does not track block height, only slot number, which can differ from block height due to empty slots.
  • The execution payload block_hash is included. The consensus layer does not know how to calculate the root hashes of execution blocks, but needs access to them when checking that the execution chain is unbroken during execution payload processing.
  • Despite being flagged in the comments as an "extra payload field", a list of transactions was always part of Eth1 blocks. However, the list of ommers/uncles is no longer present.

Individual transactions are represented by the Transaction custom type. There can be up to MAX_TRANSACTIONS_PER_PAYLOAD of them in a single execution payload. The values of MAX_BYTES_PER_TRANSACTION and MAX_TRANSACTIONS_PER_PAYLOAD are huge, and suggest that an execution payload could be up to a petabyte in size. These sizes are specified only because SSZ List types require them. They will occupy only the minimum necessary space in practice.
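
For the record, here is the arithmetic behind the petabyte claim, using the preset values introduced in Bellatrix:

MAX_BYTES_PER_TRANSACTION = 2**30     # about 1 GiB per transaction
MAX_TRANSACTIONS_PER_PAYLOAD = 2**20  # about a million transactions per payload

print(MAX_BYTES_PER_TRANSACTION * MAX_TRANSACTIONS_PER_PAYLOAD)  # 2**50 bytes, one pebibyte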

The withdrawals field was added in the Capella upgrade. Withdrawal transactions are unusual in that they affect state on both the consensus side and the execution side. Uniquely, withdrawal transactions are the only data that's generated by the consensus layer but only communicated between nodes in execution payloads.

ExecutionPayloadHeader

class ExecutionPayloadHeader(Container):
    # Execution block header fields
    parent_hash: Hash32
    fee_recipient: ExecutionAddress
    state_root: Bytes32
    receipts_root: Bytes32
    logs_bloom: ByteVector[BYTES_PER_LOGS_BLOOM]
    prev_randao: Bytes32
    block_number: uint64
    gas_limit: uint64
    gas_used: uint64
    timestamp: uint64
    extra_data: ByteList[MAX_EXTRA_DATA_BYTES]
    base_fee_per_gas: uint256
    # Extra payload fields
    block_hash: Hash32  # Hash of execution block
    transactions_root: Root
    withdrawals_root: Root  # [New in Capella]

The same as ExecutionPayload but with the transactions represented only by their root. By the magic of Merkleization, the hash tree root of an ExecutionPayloadHeader will be the same as the hash tree root of its corresponding ExecutionPayload.

The most recent ExecutionPayloadHeader is stored in the beacon state.
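
The following sketch shows why the roots match: a header is just the payload with its two list fields replaced by their hash tree roots, and SSZ Merkleizes a container from the roots of its fields. This mirrors what process_execution_payload() does when caching the header, but header_from_payload() here is written only for illustration.

def header_from_payload(payload: ExecutionPayload) -> ExecutionPayloadHeader:
    # Copy the scalar fields verbatim; replace the list fields with their hash tree roots.
    return ExecutionPayloadHeader(
        parent_hash=payload.parent_hash,
        fee_recipient=payload.fee_recipient,
        state_root=payload.state_root,
        receipts_root=payload.receipts_root,
        logs_bloom=payload.logs_bloom,
        prev_randao=payload.prev_randao,
        block_number=payload.block_number,
        gas_limit=payload.gas_limit,
        gas_used=payload.gas_used,
        timestamp=payload.timestamp,
        extra_data=payload.extra_data,
        base_fee_per_gas=payload.base_fee_per_gas,
        block_hash=payload.block_hash,
        transactions_root=hash_tree_root(payload.transactions),
        withdrawals_root=hash_tree_root(payload.withdrawals),
    )

# For any payload, hash_tree_root(header_from_payload(payload)) == hash_tree_root(payload)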

Signed envelopes

The following are just wrappers for more basic types, with an added signature.

SignedVoluntaryExit

class SignedVoluntaryExit(Container):
    message: VoluntaryExit
    signature: BLSSignature

A voluntary exit is currently signed with the validator's online signing key.

There has been some discussion about changing this to also allow signing of a voluntary exit with the validator's offline withdrawal key. The introduction of multiple types of withdrawal credential makes this more complex, however, and it is no longer likely to be practical.

SignedBeaconBlock

class SignedBeaconBlock(Container):
    message: BeaconBlock
    signature: BLSSignature

BeaconBlocks are signed by the block proposer and unwrapped for block processing.

This signature is what makes proposing a block "accountable". If two correctly signed conflicting blocks turn up, the signatures guarantee that the same proposer produced them both, and is therefore subject to being slashed. This is also why stakers need to closely guard their signing keys.

SignedBeaconBlockHeader

class SignedBeaconBlockHeader(Container):
    message: BeaconBlockHeader
    signature: BLSSignature

This is used only when reporting proposer slashing, within a ProposerSlashing object.

Through the magic of SSZ hash tree roots, a valid signature for a SignedBeaconBlock is also a valid signature for a SignedBeaconBlockHeader. Proposer slashing makes use of this to save space in slashing reports.

SignedBLSToExecutionChange

class SignedBLSToExecutionChange(Container):
    message: BLSToExecutionChange
    signature: BLSSignature

A message requesting a change from BLS withdrawal credentials to Eth1 withdrawal credentials.

Uniquely, this message is signed with the validator's withdrawal key rather than its usual signing key. Only validators that made deposits with 0x00 BLS credentials have a withdrawal key, and it will usually be different from the signing key (although it may be derived from the same mnemonic).

Helper Functions

Preamble

Note: The definitions below are for specification purposes and are not necessarily optimal implementations.

This note in the spec is super important for implementers! There are many, many optimisations of the below routines that are being used in practice; a naive implementation would be impractically slow for mainnet configurations. As long as the optimised code produces identical results to the code here, then all is fine.

Math

integer_squareroot

def integer_squareroot(n: uint64) -> uint64:
    """
    Return the largest integer ``x`` such that ``x**2 <= n``.
    """
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

Validator rewards scale with the reciprocal of the square root of the total active balance of all validators. This is calculated in get_base_reward_per_increment().

In principle integer_squareroot is also used in get_attestation_participation_flag_indices(), to specify the maximum delay for source votes to receive a reward. But this is just the constant, integer_squareroot(SLOTS_PER_EPOCH), which is 5.

Newton's method is used, which has pretty good convergence properties, but implementations may use any method that gives identical results.
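
A quick sanity check of the constant mentioned above:

assert integer_squareroot(uint64(32)) == 5  # SLOTS_PER_EPOCH is 32, and 5**2 = 25 <= 32 < 36 = 6**2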

Used by get_base_reward_per_increment(), get_attestation_participation_flag_indices()

xor

def xor(bytes_1: Bytes32, bytes_2: Bytes32) -> Bytes32:
    """
    Return the exclusive-or of two 32-byte strings.
    """
    return Bytes32(a ^ b for a, b in zip(bytes_1, bytes_2))

The bitwise xor of two 32-byte quantities is defined here in Python terms.

This is used only in process_randao() when mixing in the new randao reveal.

Fun fact: if you xor two byte types in Java, the result is a 32 bit (signed) integer. This is one reason we need to define the "obvious" here. But mainly, because the spec is executable, we need to tell Python what it doesn't already know.

Used by process_randao()

uint_to_bytes

def uint_to_bytes(n: uint) -> bytes is a function for serializing the uint type object to bytes in ENDIANNESS-endian. The expected length of the output is the byte-length of the uint type.

For the most part, integers are integers and bytes are bytes, and they don't mix much. But there are a few places where we need to convert from integers to bytes:

  • in the shuffling algorithm, compute_shuffled_index();
  • when selecting proposers and sync committees, in compute_proposer_index(), get_beacon_proposer_index(), and get_next_sync_committee_indices(); and
  • when generating seeds in get_seed().

You'll note that in every case, the purpose of the conversion is for the integer to form part of a byte string that is hashed to create (pseudo-)randomness.

The result of this conversion is dependent on our arbitrary choice of endianness, that is, how we choose to represent integers as strings of bytes. For Eth2, we have chosen little-endian: see the discussion of ENDIANNESS for more background.

The uint_to_bytes() function is not given an explicit implementation in the specification, which is unusual. This is to avoid exposing the innards of the Python SSZ implementation (of uint) to the rest of the spec. When running the spec as an executable, it uses the definition in the SSZ utilities.
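
A minimal sketch of the behaviour, not the spec's actual implementation (which relies on the SSZ machinery to know each uint type's width); uint_to_bytes_sketch() is a made-up name.

def uint_to_bytes_sketch(n: int, byte_length: int) -> bytes:
    # ENDIANNESS is 'little' for Eth2, so the least significant byte comes first.
    return n.to_bytes(byte_length, 'little')

uint_to_bytes_sketch(1, 8)  # b'\x01\x00\x00\x00\x00\x00\x00\x00', like uint_to_bytes(uint64(1))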

Used by compute_shuffled_index(), compute_proposer_index(), get_seed(), get_beacon_proposer_index(), get_next_sync_committee_indices()
See also ENDIANNESS, SSZ utilities

bytes_to_uint64

def bytes_to_uint64(data: bytes) -> uint64:
    """
    Return the integer deserialization of ``data`` interpreted as ``ENDIANNESS``-endian.
    """
    return uint64(int.from_bytes(data, ENDIANNESS))

bytes_to_uint64() is the inverse of uint_to_bytes(), and is used by the shuffling algorithm to create a random index from the output of a hash.

It is also used in the validator specification when selecting validators to aggregate attestations, and sync committee messages.

int.from_bytes is a built-in Python 3 method. The uint64 cast is provided by the spec's SSZ implementation.

Used by compute_shuffled_index
See also attestation aggregator selection, sync committee aggregator selection

Crypto

hash

def hash(data: bytes) -> Bytes32 is SHA256.

SHA256 was chosen as the protocol's base hash algorithm for easier cross-chain interoperability: many other chains use SHA256, and Eth1 has a SHA256 precompile.

There was a lot of discussion about this choice early in the design process. The original plan had been to use the BLAKE2b-512 hash function – that being a modern hash function that's faster than SHA3 – and to move to a STARK/SNARK friendly hash function at some point (such as MiMC). However, to keep interoperability with Eth1, in particular for the implementation of the deposit contract, the hash function was changed to Keccak256. Finally, we settled on SHA256 as having even broader compatibility.

The hash function serves two purposes within the protocol. The main use, computationally, is in Merkleization, the computation of hash tree roots, which is ubiquitous in the protocol. Its other use is to harden the randomness used in various places.

Used by hash_tree_root, is_valid_merkle_branch(), compute_shuffled_index(), compute_proposer_index(), get_seed(), get_beacon_proposer_index(), get_next_sync_committee_indices(), process_randao()

hash_tree_root

def hash_tree_root(object: SSZSerializable) -> Root is a function for hashing objects into a single root by utilizing a hash tree structure, as defined in the SSZ spec.

The development of the tree hashing process was transformational for the Ethereum 2.0 specification, and it is now used everywhere.

The naive way to create a digest of a data structure is to serialise it and then just run a hash function over the result. In tree hashing, the basic idea is to treat each element of an ordered, compound data structure as the leaf of a Merkle tree, recursively if necessary until a primitive type is reached, and to return the Merkle root of the resulting tree.
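
As a toy illustration of the idea, ignoring SSZ's rules about chunking, padding, and mixing in list lengths, the root of a power-of-two number of leaves can be computed like this (toy_merkle_root() is not spec code):

def toy_merkle_root(leaves: Sequence[Bytes32]) -> Bytes32:
    # Repeatedly hash adjacent pairs of nodes until a single root remains.
    nodes = list(leaves)
    assert len(nodes) > 0 and len(nodes) & (len(nodes) - 1) == 0  # power of two, for simplicity
    while len(nodes) > 1:
        nodes = [hash(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]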

At first sight, this all looks quite inefficient. Twice as much data needs to be hashed when tree hashing, and actual speeds are 4-6 times slower compared with the linear hash. However, it is good for supporting light clients, because it allows Merkle proofs to be constructed easily for subsets of the full state.

The breakthrough insight was realising that much of the re-hashing work can be cached: if part of the state data structure has not changed, that part does not need to be re-hashed: the whole subtree can be replaced with its cached hash. This turns out to be a huge efficiency boost, allowing the previous design, with cumbersome separate crystallised and active state, to be simplified into a single state object.

Merkleization, the process of calculating the hash_tree_root() of an object, is defined in the SSZ specification, and explained further in the section on SSZ.

BLS signatures

See the main write-up on BLS Signatures for a more in-depth exploration of this topic.

The IETF BLS signature draft standard v4 with ciphersuite BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_ defines the following functions:

  • def Sign(privkey: int, message: Bytes) -> BLSSignature
  • def Verify(pubkey: BLSPubkey, message: Bytes, signature: BLSSignature) -> bool
  • def Aggregate(signatures: Sequence[BLSSignature]) -> BLSSignature
  • def FastAggregateVerify(pubkeys: Sequence[BLSPubkey], message: Bytes, signature: BLSSignature) -> bool
  • def AggregateVerify(pubkeys: Sequence[BLSPubkey], messages: Sequence[Bytes], signature: BLSSignature) -> bool
  • def KeyValidate(pubkey: BLSPubkey) -> bool

The above functions are accessed through the bls module, e.g. bls.Verify.

The detailed specification of the cryptographic functions underlying Ethereum 2.0's BLS signing scheme is delegated to the draft IRTF standard8 as described in the spec. This includes specifying the elliptic curve BLS12-381 as our domain of choice.

Our intention in conforming to the in-progress standard is to provide for maximal interoperability with other chains, applications, and cryptographic libraries. Ethereum Foundation researchers and Eth2 developers had input to the development of the standard. Nevertheless, there were some challenges involved in trying to keep up as the standard evolved. For example, the Hashing to Elliptic Curves standard was still changing rather late in the beacon chain testing phase. In the end, everything worked out fine.

The following two functions are described in the separate BLS Extensions document, but included here for convenience.

eth_aggregate_pubkeys

def eth_aggregate_pubkeys(pubkeys: Sequence[BLSPubkey]) -> BLSPubkey:
    """
    Return the aggregate public key for the public keys in ``pubkeys``.

    NOTE: the ``+`` operation should be interpreted as elliptic curve point addition, which takes as input
    elliptic curve points that must be decoded from the input ``BLSPubkey``s.
    This implementation is for demonstrative purposes only and ignores encoding/decoding concerns.
    Refer to the BLS signature draft standard for more information.
    """
    assert len(pubkeys) > 0
    # Ensure that the given inputs are valid pubkeys
    assert all(bls.KeyValidate(pubkey) for pubkey in pubkeys)

    result = copy(pubkeys[0])
    for pubkey in pubkeys[1:]:
        result += pubkey
    return result

Stand-alone aggregation of public keys is not defined by the BLS signature standard. In the standard, public keys are aggregated only in the context of performing an aggregate signature verification via AggregateVerify() or FastAggregateVerify().

The eth_aggregate_pubkeys() function was added in the Altair upgrade to implement an optimisation for light clients when verifying the signatures on SyncAggregates.

Used by get_next_sync_committee()
Uses bls.KeyValidate()

eth_fast_aggregate_verify

def eth_fast_aggregate_verify(pubkeys: Sequence[BLSPubkey], message: Bytes32, signature: BLSSignature) -> bool:
    """
    Wrapper to ``bls.FastAggregateVerify`` accepting the ``G2_POINT_AT_INFINITY`` signature when ``pubkeys`` is empty.
    """
    if len(pubkeys) == 0 and signature == G2_POINT_AT_INFINITY:
        return True
    return bls.FastAggregateVerify(pubkeys, message, signature)

The specification of FastAggregateVerify() in the BLS signature standard returns INVALID if there are zero public keys given.

This function was introduced in Altair to handle SyncAggregates that no sync committee member had signed off on, in which case the G2_POINT_AT_INFINITY can be considered a "correct" signature (in our case, but not according to the standard).

The networking and validator specs were later clarified to require that SyncAggregates have at least one signature. But this requirement is not enforced in the consensus layer (in process_sync_aggregate()), so we need to retain this eth_fast_aggregate_verify() wrapper to allow the empty signature to be valid.

Used by process_sync_aggregate()
Uses FastAggregateVerify()
See also G2_POINT_AT_INFINITY

Predicates

is_active_validator

def is_active_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is active.
    """
    return validator.activation_epoch <= epoch < validator.exit_epoch

Validators don't explicitly track their own state (eligible for activation, active, exited, withdrawable), with the sole exception of recording whether they have been slashed or not. Instead, a validator's state is calculated by looking at the fields in the Validator record that store the epoch numbers of state transitions.

In this case, if the validator was activated in the past and has not yet exited, then it is active.

This is used a few times in the spec, most notably in get_active_validator_indices() which returns a list of all active validators at an epoch.

Used by get_active_validator_indices(), get_eligible_validator_indices(), process_registry_updates(), process_voluntary_exit()
See also Validator

is_eligible_for_activation_queue

def is_eligible_for_activation_queue(validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible to be placed into the activation queue.
    """
    return (
        validator.activation_eligibility_epoch == FAR_FUTURE_EPOCH
        and validator.effective_balance == MAX_EFFECTIVE_BALANCE
    )

When a deposit is processed with a previously unseen public key, a new Validator record is created with all the state-transition fields set to the default value of FAR_FUTURE_EPOCH.

It is possible to deposit any amount over MIN_DEPOSIT_AMOUNT (currently 1 Ether) into the deposit contract. However, validators do not become eligible for activation until their effective balance is equal to MAX_EFFECTIVE_BALANCE, which corresponds to an actual balance of 32 Ether or more.

This predicate is used during epoch processing to find validators that have acquired the minimum necessary balance, but have not yet been added to the queue for activation. These validators are then marked as eligible for activation by setting the validator.activation_eligibility_epoch to the next epoch.
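
The relevant fragment of process_registry_updates() (covered later) looks roughly like this:

for validator in state.validators:
    if is_eligible_for_activation_queue(validator):
        # Mark the validator as eligible from the next epoch onwards.
        validator.activation_eligibility_epoch = get_current_epoch(state) + 1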

Used by process_registry_updates()
See also Validator, FAR_FUTURE_EPOCH, MAX_EFFECTIVE_BALANCE

is_eligible_for_activation

def is_eligible_for_activation(state: BeaconState, validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible for activation.
    """
    return (
        # Placement in queue is finalized
        validator.activation_eligibility_epoch <= state.finalized_checkpoint.epoch
        # Has not yet been activated
        and validator.activation_epoch == FAR_FUTURE_EPOCH
    )

A validator that is_eligible_for_activation() has had its activation_eligibility_epoch set, but its activation_epoch is not yet set.

To avoid any ambiguity or confusion on the validator side about its state, we wait until its eligibility activation epoch has been finalised before adding it to the activation queue by setting its activation_epoch. Otherwise, it might at one point become active, and then the beacon chain could flip to a fork in which it is not active. This could happen if the latter fork had fewer blocks and had thus processed fewer deposits.

Note that state.finalized_checkpoint.epoch does not mean that all of the slots in that epoch are finalised. We finalise checkpoints, not epochs, so only the first slot (the checkpoint) of that epoch is finalised. This is accounted for in process_registry_updates() by adding one to the current epoch when setting the validator.activation_eligibility_epoch, so that we can be sure that the block containing the deposit has been finalised.9

Used by process_registry_updates()
See also Validator, FAR_FUTURE_EPOCH

is_slashable_validator

def is_slashable_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is slashable.
    """
    return (not validator.slashed) and (validator.activation_epoch <= epoch < validator.withdrawable_epoch)

Validators can be slashed only once: the flag validator.slashed is set when the first correct slashing report for the validator is processed.

An unslashed validator remains eligible to be slashed from when it becomes active right up until it becomes withdrawable. This is MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs (around 27 hours) after it has exited from being a validator and ceased validation duties.

Used by process_proposer_slashing(), process_attester_slashing()
See also Validator

is_slashable_attestation_data

def is_slashable_attestation_data(data_1: AttestationData, data_2: AttestationData) -> bool:
    """
    Check if ``data_1`` and ``data_2`` are slashable according to Casper FFG rules.
    """
    return (
        # Double vote
        (data_1 != data_2 and data_1.target.epoch == data_2.target.epoch) or
        # Surround vote
        (data_1.source.epoch < data_2.source.epoch and data_2.target.epoch < data_1.target.epoch)
    )

This predicate is used by process_attester_slashing() to check that the two sets of alleged conflicting attestation data in an AttesterSlashing do in fact qualify as slashable.

There are two ways for validators to get slashed under Casper FFG:

  1. A double vote: voting more than once for the same target epoch, or
  2. A surround vote: the source–target interval of one attestation entirely contains the source–target interval of a second attestation from the same validator or validators. The reporting block proposer needs to take care to order the IndexedAttestations within the AttesterSlashing object so that the first set of votes surrounds the second. (The opposite ordering also describes a slashable offence, but is not checked for here, so the order of the arguments matters.)

It is far from obvious, but this predicate also enforces LMD GHOST slashing for attestation equivocation. The AttestationData objects contain the LMD GHOST head vote (beacon_block_root) as well as the Casper FFG votes. So, the Casper FFG checkpoint votes might be identical and non-slashable, but if the LMD GHOST vote differs between the two attestations then it will be deemed slashable.
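
A short worked example, assuming that the SSZ containers fill in default values for the fields not specified. The source–target span (10, 14) of data_1 strictly surrounds the span (11, 13) of data_2:

data_1 = AttestationData(source=Checkpoint(epoch=10), target=Checkpoint(epoch=14))
data_2 = AttestationData(source=Checkpoint(epoch=11), target=Checkpoint(epoch=13))

assert is_slashable_attestation_data(data_1, data_2)      # surround vote
assert not is_slashable_attestation_data(data_2, data_1)  # the argument order matters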

Used by process_attester_slashing()
See also AttestationData, AttesterSlashing

is_valid_indexed_attestation

def is_valid_indexed_attestation(state: BeaconState, indexed_attestation: IndexedAttestation) -> bool:
    """
    Check if ``indexed_attestation`` is not empty, has sorted and unique indices and has a valid aggregate signature.
    """
    # Verify indices are sorted and unique
    indices = indexed_attestation.attesting_indices
    if len(indices) == 0 or not indices == sorted(set(indices)):
        return False
    # Verify aggregate signature
    pubkeys = [state.validators[i].pubkey for i in indices]
    domain = get_domain(state, DOMAIN_BEACON_ATTESTER, indexed_attestation.data.target.epoch)
    signing_root = compute_signing_root(indexed_attestation.data, domain)
    return bls.FastAggregateVerify(pubkeys, signing_root, indexed_attestation.signature)

is_valid_indexed_attestation() is used in attestation processing and attester slashing.

IndexedAttestations differ from Attestations in that the latter record the contributing validators in a bitlist and the former explicitly list the global indices of the contributing validators.

An IndexedAttestation passes this validity test only if all the following apply.

  1. There is at least one validator index present.
  2. The list of validators contains no duplicates (the Python set function performs deduplication).
  3. The indices of the validators are sorted. (It is not clear to me why this is required. It's used in the duplicate check here, but that could just be replaced by checking the set size.)
  4. Its aggregated signature verifies against the aggregated public keys of the listed validators.

Verifying the signature uses the magic of aggregated BLS signatures. The indexed attestation contains a BLS signature that is supposed to be the combined individual signatures of each of the validators listed in the attestation. This is verified by passing it to bls.FastAggregateVerify() along with the list of public keys from the same validators. The verification succeeds only if exactly the same set of validators signed the message (signing_root) as appear in the list of public keys. Note that get_domain() mixes in the fork version, so that attestations are not valid across forks.

No check is done here that the attesting_indices (which are the global validator indices) are all members of the correct committee for this attestation. In process_attestation() they must be, by construction. In process_attester_slashing() it doesn't matter: any validator signing conflicting attestations is liable to be slashed.

Used by process_attester_slashing(), process_attestation()
Uses get_domain(), compute_signing_root(), bls.FastAggregateVerify()
See also IndexedAttestation, Attestation

is_valid_merkle_branch

def is_valid_merkle_branch(leaf: Bytes32, branch: Sequence[Bytes32], depth: uint64, index: uint64, root: Root) -> bool:
    """
    Check if ``leaf`` at ``index`` verifies against the Merkle ``root`` and ``branch``.
    """
    value = leaf
    for i in range(depth):
        if index // (2**i) % 2:
            value = hash(branch[i] + value)
        else:
            value = hash(value + branch[i])
    return value == root

This is the classic algorithm for verifying a Merkle branch (also called a Merkle proof). Nodes are iteratively hashed as the tree is traversed from leaves to root. The bits of index select whether we are the right or left child of our parent at each level. The result should match the given root of the tree.

In this way we prove that we know that leaf is the value at position index in the list of leaves, and that we know the whole structure of the rest of the tree, as summarised in branch.
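
A tiny usage example: in a two-leaf tree with root hash(a + b), the branch proving that a is the leaf at index 0 (at depth 1) is just [b], and vice versa.

a, b = Bytes32(b'\x0a' * 32), Bytes32(b'\x0b' * 32)
root = hash(a + b)

assert is_valid_merkle_branch(leaf=a, branch=[b], depth=1, index=0, root=root)
assert is_valid_merkle_branch(leaf=b, branch=[a], depth=1, index=1, root=root)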

We use this function in process_deposit() to check whether the deposit data we've received is correct or not. Based on the deposit data they have seen, Eth2 clients build a replica of the Merkle tree of deposits in the deposit contract. The proposer of the block that includes the deposit constructs the Merkle proof using its view of the deposit contract, and all other nodes use is_valid_merkle_branch() to check that their view matches the proposer's. If any deposit fails Merkle branch verification then the entire block is invalid.

Used by process_deposit()

is_merge_transition_complete

def is_merge_transition_complete(state: BeaconState) -> bool:
    return state.latest_execution_payload_header != ExecutionPayloadHeader()

A simple test for whether the given beacon state is pre- or post-Merge. If the latest_execution_payload_header in the state is the default ExecutionPayloadHeader then the chain is pre-Merge, otherwise it is post-Merge. Upgrades normally occur at a predetermined block height (or epoch number on the beacon chain), and that's the usual way to test for them. The block height of the Merge, however, was unknown ahead of time, so a different kind of test was required.

Although the mainnet beacon chain is decidedly post-Merge now, this remains useful for syncing nodes from pre-Merge starting points.

This function was added in the Bellatrix pre-Merge upgrade.

Used by process_execution_payload(), is_merge_transition_block(), is_execution_enabled()
See also ExecutionPayloadHeader

is_merge_transition_block

def is_merge_transition_block(state: BeaconState, body: BeaconBlockBody) -> bool:
    return not is_merge_transition_complete(state) and body.execution_payload != ExecutionPayload()

If the Merge transition is not complete (meaning that the beacon state still has the default execution payload header in it), yet our block has a non-default execution payload, then this must be the first block we've seen with an execution payload. It is therefore the Merge transition block.

This function was added in the Bellatrix pre-Merge upgrade.

Uses is_merge_transition_complete()
Used by is_execution_enabled(), on_block() (Bellatrix version)
See also ExecutionPayload

is_execution_enabled

def is_execution_enabled(state: BeaconState, body: BeaconBlockBody) -> bool:
    return is_merge_transition_block(state, body) or is_merge_transition_complete(state)

If the block that we have is the first block with an execution payload (the Merge transition block), or we know from the state that we have previously seen a block with an execution payload, then execution is enabled: the execution and consensus chains have merged.

This function was added in the Bellatrix pre-Merge upgrade.

Uses is_merge_transition_block(), is_merge_transition_complete()
Used by process_block()

has_eth1_withdrawal_credential

def has_eth1_withdrawal_credential(validator: Validator) -> bool:
    """
    Check if ``validator`` has an 0x01 prefixed "eth1" withdrawal credential.
    """
    return validator.withdrawal_credentials[:1] == ETH1_ADDRESS_WITHDRAWAL_PREFIX

Only validators that have Eth1 withdrawal credentials are eligible for balance withdrawals of any sort.

Used by is_fully_withdrawable_validator(), is_partially_withdrawable_validator()
See also ETH1_ADDRESS_WITHDRAWAL_PREFIX

is_fully_withdrawable_validator

def is_fully_withdrawable_validator(validator: Validator, balance: Gwei, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is fully withdrawable.
    """
    return (
        has_eth1_withdrawal_credential(validator)
        and validator.withdrawable_epoch <= epoch
        and balance > 0
    )

A validator is fully withdrawable only when (a) it has an Eth1 withdrawal credential to make the withdrawal to, (b) it has become withdrawable, meaning that its exit has been processed and it has passed through its MIN_VALIDATOR_WITHDRAWABILITY_DELAY period, and (c) it has a nonzero balance.

Uses has_eth1_withdrawal_credential()
Used by get_expected_withdrawals()

is_partially_withdrawable_validator

def is_partially_withdrawable_validator(validator: Validator, balance: Gwei) -> bool:
    """
    Check if ``validator`` is partially withdrawable.
    """
    has_max_effective_balance = validator.effective_balance == MAX_EFFECTIVE_BALANCE
    has_excess_balance = balance > MAX_EFFECTIVE_BALANCE
    return has_eth1_withdrawal_credential(validator) and has_max_effective_balance and has_excess_balance

A partial withdrawal is the withdrawal of excess Ether from an active (non-exited) validator.

A validator has excess Ether only when (a) its effective balance is at MAX_EFFECTIVE_BALANCE, (b) its actual balance is greater than MAX_EFFECTIVE_BALANCE, and (c) it has an Eth1 withdrawal credential to make the withdrawal to.

The first of these conditions is related to the hysteresis in the effective balance. If a validator has previously suffered a drop in its balance, its effective balance might be 31 Ether even while its actual balance is greater than 32 Ether. If we were to start skimming withdrawals in this situation, the validator's balance would never reach the 32.25 Ether necessary to bring its effective balance up to 32 Ether, and it would be forever stuck at 31 Ether. Therefore, only validators with the full effective balance are eligible for the excess to be withdrawn.
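
The 32.25 Ether figure comes from the hysteresis parameters (see the See also links below). As a rough worked example in Gwei, using the preset values as I understand them:

EFFECTIVE_BALANCE_INCREMENT = 10**9  # Gwei: 1 Ether
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_UPWARD_MULTIPLIER = 5

# A validator's effective balance rises only when its actual balance exceeds its
# effective balance by more than this threshold, which is 1.25 Ether.
upward_threshold = EFFECTIVE_BALANCE_INCREMENT * HYSTERESIS_UPWARD_MULTIPLIER // HYSTERESIS_QUOTIENT
# So a validator with a 31 Ether effective balance needs more than 32.25 Ether of actual
# balance before its effective balance is bumped back up to 32 Ether.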

Used by get_expected_withdrawals()
Uses has_eth1_withdrawal_credential()
See also MAX_EFFECTIVE_BALANCE, hysteresis

Misc

compute_shuffled_index

def compute_shuffled_index(index: uint64, index_count: uint64, seed: Bytes32) -> uint64:
    """
    Return the shuffled index corresponding to ``seed`` (and ``index_count``).
    """
    assert index < index_count

    # Swap or not (https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf)
    # See the 'generalized domain' algorithm on page 3
    for current_round in range(SHUFFLE_ROUND_COUNT):
        pivot = bytes_to_uint64(hash(seed + uint_to_bytes(uint8(current_round)))[0:8]) % index_count
        flip = (pivot + index_count - index) % index_count
        position = max(index, flip)
        source = hash(
            seed
            + uint_to_bytes(uint8(current_round))
            + uint_to_bytes(uint32(position // 256))
        )
        byte = uint8(source[(position % 256) // 8])
        bit = (byte >> (position % 8)) % 2
        index = flip if bit else index

    return index

Selecting random, distinct committees of validators is a big part of Ethereum 2.0; it is foundational for both its scalability and security. This selection is done by shuffling.

Shuffling a list of objects is a well understood problem in computer science. Notice, however, that this routine manages to shuffle a single index to a new location, knowing only the total length of the list. To use the technical term for this, it is oblivious. To shuffle the whole list, this routine needs to be called once per validator index in the list. By construction, each input index maps to a distinct output index. Thus, when applied to all indices in the list, it results in a permutation, also called a shuffling.

Why do this rather than a simpler, more efficient, conventional shuffle? It's all about light clients. Beacon nodes will generally need to know the whole shuffling, but light clients will often be interested only in a small number of committees. Using this technique allows the composition of a single committee to be calculated without having to shuffle the entire set: potentially a big saving on time and memory.

As stated in the code comments, this is an implementation of the "swap-or-not" shuffle, described in the cited paper. Vitalik kicked off a search for a shuffle with these properties in late 2018. With the help of Professor Dan Boneh of Stanford University, the swap-or-not was identified as a candidate a couple of months later, and adopted into the spec.

The algorithm breaks down as follows. For each iteration (each round), we start with a current index.

  1. Pseudo-randomly select a pivot. This is a 64-bit integer based on the seed and current round number. This domain is large enough that any non-uniformity caused by taking the modulus in the next step is entirely negligible.
  2. Use pivot to find another index in the list of validators, flip, which is pivot - index accounting for wrap-around in the list.
  3. Calculate a single pseudo-random bit based on the seed, the current round number, and some bytes from either index or flip depending on which is greater.
  4. If our bit is zero, we keep index unchanged; if it is one, we set index to flip.

We are effectively swapping cards in a deck based on a deterministic algorithm.

The way that position is broken down is worth noting:

  • Bits 0-2 (3 bits) are used to select a single bit from the eight bits of byte.
  • Bits 3-7 (5 bits) are used to select a single byte from the thirty-two bytes of source.
  • Bits 8-39 (32 bits) are used in generating source. Note that the upper two bytes of this will always be zero in practice, due to limits on the number of active validators.

SHUFFLE_ROUND_COUNT is, and always has been, 90 in the mainnet configuration, as explained there.

See the section on Shuffling for a more structured exposition and analysis of this algorithm (with diagrams!).

In practice, full beacon node implementations will run this once per epoch using an optimised version that shuffles the whole list, and cache the result of that for the epoch.
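
As a quick sanity check (assuming the spec functions above are in scope), shuffling every index of a small list, one at a time, yields a permutation of that list's indices:

seed = hash(b'some seed')
index_count = 10

shuffled = [compute_shuffled_index(uint64(i), uint64(index_count), seed) for i in range(index_count)]
assert sorted(shuffled) == list(range(index_count))  # every index appears exactly once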

Used by compute_committee(), compute_proposer_index(), get_next_sync_committee_indices()
Uses bytes_to_uint64()
See also SHUFFLE_ROUND_COUNT

compute_proposer_index

def compute_proposer_index(state: BeaconState, indices: Sequence[ValidatorIndex], seed: Bytes32) -> ValidatorIndex:
    """
    Return from ``indices`` a random index sampled by effective balance.
    """
    assert len(indices) > 0
    MAX_RANDOM_BYTE = 2**8 - 1
    i = uint64(0)
    total = uint64(len(indices))
    while True:
        candidate_index = indices[compute_shuffled_index(i % total, total, seed)]
        random_byte = hash(seed + uint_to_bytes(uint64(i // 32)))[i % 32]
        effective_balance = state.validators[candidate_index].effective_balance
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte:
            return candidate_index
        i += 1

There is exactly one beacon block proposer per slot, selected randomly from among all the active validators. The seed parameter is set in get_beacon_proposer_index based on the epoch and slot. Note that there is a small but finite probability of the same validator being called on to propose a block more than once in an epoch.

A validator's chance of being the proposer is weighted by its effective balance: a validator with a 32 Ether effective balance is twice as likely to be chosen as a validator with a 16 Ether effective balance.

To account for the need to weight by effective balance, this function is implemented as a try-and-increment algorithm. A counter i starts at zero. This counter does double duty:

  • First, i is used to uniformly select a candidate proposer with probability 1/N, where N is the number of active validators. This is done by using the compute_shuffled_index routine to shuffle index i to a new location, which is then the candidate_index.
  • Then i is used to generate a pseudo-random byte, using the hash function as a seeded PRNG with at least 256 bits of output. The lower 5 bits of i select a byte from the 32-byte hash output, and the upper bits salt the seed. (An obvious optimisation is that the output of the hash changes only once every 32 iterations.)

The if test is where the weighting by effective balance is done. If the candidate has MAX_EFFECTIVE_BALANCE, it will always pass this test and be returned as the proposer. If the candidate has a fraction of MAX_EFFECTIVE_BALANCE then that fraction is the probability of being returned as proposer.

If the candidate is not chosen, then i is incremented, and we try again. Since the minimum effective balance is half of the maximum, then this ought to terminate fairly swiftly. In the worst case, all validators have 16 Ether effective balance, so the chance of having to do another iteration is 50%, in which case there is a one in a million chance of having to do 20 iterations.
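
The "one in a million" arithmetic, for the worst case in which every candidate has a 16 Ether effective balance and so passes each trial with probability one half:

print(0.5**20)  # about 9.5e-07: the chance of needing more than 20 iterations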

Note that this dependence on the validators' effective balances, which are updated at the end of each epoch, means that proposer assignments are valid only in the current epoch. This is different from attestation committee assignments, which are valid with a one epoch look-ahead.

Used by get_beacon_proposer_index()
Uses compute_shuffled_index()
See also MAX_EFFECTIVE_BALANCE

compute_committee

def compute_committee(indices: Sequence[ValidatorIndex],
                      seed: Bytes32,
                      index: uint64,
                      count: uint64) -> Sequence[ValidatorIndex]:
    """
    Return the committee corresponding to ``indices``, ``seed``, ``index``, and committee ``count``.
    """
    start = (len(indices) * index) // count
    end = (len(indices) * uint64(index + 1)) // count
    return [indices[compute_shuffled_index(uint64(i), uint64(len(indices)), seed)] for i in range(start, end)]

compute_committee is used by get_beacon_committee() to find the specific members of one of the committees at a slot.

Every epoch, a fresh set of committees is generated; during an epoch, the committees are stable.

Looking at the parameters in reverse order:

  • count is the total number of committees in an epoch. This is SLOTS_PER_EPOCH times the output of get_committee_count_per_slot().
  • index is the committee number within the epoch, running from 0 to count - 1. It is calculated in get_beacon_committee() from the committee number within the slot (index) and the slot number, as (slot % SLOTS_PER_EPOCH) * committees_per_slot + index.
  • seed is the seed value for computing the pseudo-random shuffling, based on the epoch number and a domain parameter. (get_beacon_committee() uses DOMAIN_BEACON_ATTESTER.)
  • indices is the list of validators eligible for inclusion in committees, namely the whole list of indices of active validators.

Random sampling among the validators is done by taking a contiguous slice of array indices from start to end and seeing where each one gets shuffled to by compute_shuffled_index(). Note that uint64(i) in the above is just a type-cast of i for input into the shuffling. The output value of the shuffling is then used as an index into the indices list. There is much here that client implementations will optimise with caching and batch operations.

It may not be immediately obvious, but not all committees returned will be the same size (they can vary by one), and every validator in indices will be a member of exactly one committee. As we increment index from zero, clearly start for index == j + 1 is end for index == j, so there are no gaps. In addition, the highest index is count - 1, so every validator in indices finds its way into a committee.10
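
A small worked example: splitting 10 validators into 4 committees gives slice boundaries at floor(10j/4) for j = 0 to 4, and hence committee sizes that differ by at most one.

validators, count = 10, 4

boundaries = [(validators * j) // count for j in range(count + 1)]  # [0, 2, 5, 7, 10]
sizes = [boundaries[j + 1] - boundaries[j] for j in range(count)]   # [2, 3, 2, 3]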

This method of selecting committees is light client friendly. Light clients can compute only the committees that they are interested in without needing to deal with the entire validator set. See the section on Shuffling for explanation of how this works.

Sync committees are assigned by a different process that is more akin to repeatedly performing compute_proposer_index().

Used by get_beacon_committee
Uses compute_shuffled_index()

compute_epoch_at_slot

def compute_epoch_at_slot(slot: Slot) -> Epoch:
    """
    Return the epoch number at ``slot``.
    """
    return Epoch(slot // SLOTS_PER_EPOCH)

This is trivial enough that I won't explain it. But note that it does rely on GENESIS_SLOT and GENESIS_EPOCH being zero. The more pernickety among us might prefer it to read,

    return GENESIS_EPOCH + Epoch((slot - GENESIS_SLOT) // SLOTS_PER_EPOCH)

compute_start_slot_at_epoch

def compute_start_slot_at_epoch(epoch: Epoch) -> Slot:
    """
    Return the start slot of ``epoch``.
    """
    return Slot(epoch * SLOTS_PER_EPOCH)

Maybe should read,

    return GENESIS_SLOT + Slot((epoch - GENESIS_EPOCH) * SLOTS_PER_EPOCH)

Used by get_block_root(), compute_slots_since_epoch_start()
See also SLOTS_PER_EPOCH, GENESIS_SLOT, GENESIS_EPOCH

compute_activation_exit_epoch

def compute_activation_exit_epoch(epoch: Epoch) -> Epoch:
    """
    Return the epoch during which validator activations and exits initiated in ``epoch`` take effect.
    """
    return Epoch(epoch + 1 + MAX_SEED_LOOKAHEAD)

When queuing validators for activation or exit in process_registry_updates() and initiate_validator_exit() respectively, the activation or exit is delayed until the next epoch, plus MAX_SEED_LOOKAHEAD epochs, currently 4.

See MAX_SEED_LOOKAHEAD for the details, but in short it is designed to make it extremely hard for an attacker to manipulate the membership of committees via activations and exits.

Used by initiate_validator_exit(), process_registry_updates()
See also MAX_SEED_LOOKAHEAD

compute_fork_data_root

def compute_fork_data_root(current_version: Version, genesis_validators_root: Root) -> Root:
    """
    Return the 32-byte fork data root for the ``current_version`` and ``genesis_validators_root``.
    This is used primarily in signature domains to avoid collisions across forks/chains.
    """
    return hash_tree_root(ForkData(
        current_version=current_version,
        genesis_validators_root=genesis_validators_root,
    ))

The fork data root serves as a unique identifier for the chain that we are on. genesis_validators_root identifies our unique genesis event, and current_version our own hard fork subsequent to that genesis event. This is useful, for example, to differentiate between a testnet and mainnet: both might have the same fork versions, but will definitely have different genesis validator roots.

It is used by compute_fork_digest() and compute_domain().

Used by compute_fork_digest(), compute_domain()
Uses hash_tree_root()
See also ForkData

compute_fork_digest

def compute_fork_digest(current_version: Version, genesis_validators_root: Root) -> ForkDigest:
    """
    Return the 4-byte fork digest for the ``current_version`` and ``genesis_validators_root``.
    This is a digest primarily used for domain separation on the p2p layer.
    4-bytes suffices for practical separation of forks/chains.
    """
    return ForkDigest(compute_fork_data_root(current_version, genesis_validators_root)[:4])

Extracts the first four bytes of the fork data root as a ForkDigest type. It is primarily used for domain separation on the peer-to-peer networking layer.

compute_fork_digest() is used extensively in the Ethereum 2.0 networking specification to distinguish between independent beacon chain networks or forks: it is important that activity on one chain does not interfere with other chains.

Uses compute_fork_data_root()
See also ForkDigest

compute_domain

def compute_domain(domain_type: DomainType, fork_version: Version=None, genesis_validators_root: Root=None) -> Domain:
    """
    Return the domain for the ``domain_type`` and ``fork_version``.
    """
    if fork_version is None:
        fork_version = GENESIS_FORK_VERSION
    if genesis_validators_root is None:
        genesis_validators_root = Root()  # all bytes zero by default
    fork_data_root = compute_fork_data_root(fork_version, genesis_validators_root)
    return Domain(domain_type + fork_data_root[:28])

When dealing with signed messages, the signature "domains" are separated according to three independent factors:

  1. All signatures include a DomainType relevant to the message's purpose, which is just some cryptographic hygiene in case the same message is to be signed for different purposes at any point.
  2. All but signatures on deposit messages include the fork version. This ensures that messages across different forks of the chain become invalid, and that validators won't be slashed for signing attestations on two different chains (this is allowed).
  3. And, now, the root hash of the validator Merkle tree at Genesis is included. Along with the fork version this gives a unique identifier for our chain.

This function is mainly used by get_domain(). It is also used in deposit processing, in which case fork_version and genesis_validators_root take their default values since deposits are valid across forks.
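
Purely for illustration, the resulting 32-byte domain is just the 4-byte domain type followed by the first 28 bytes of the fork data root. In this sketch the fork data root is a stand-in value, and DOMAIN_DEPOSIT is assumed to be its usual value of 0x03000000.

    # Illustrative only: the fork data root below is a made-up placeholder.
    DOMAIN_DEPOSIT = bytes.fromhex("03000000")
    fork_data_root = bytes.fromhex("11" * 32)

    domain = DOMAIN_DEPOSIT + fork_data_root[:28]
    assert len(domain) == 32
    print(domain.hex())  # 03000000 followed by the first 28 bytes of the fork data root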

Fun fact: this function looks pretty simple, but I found a subtle bug in the way tests were generated in a previous implementation.

Used by get_domain(), process_deposit()
Uses compute_fork_data_root()
See also Domain, DomainType, GENESIS_FORK_VERSION

compute_signing_root

def compute_signing_root(ssz_object: SSZObject, domain: Domain) -> Root:
    """
    Return the signing root for the corresponding signing data.
    """
    return hash_tree_root(SigningData(
        object_root=hash_tree_root(ssz_object),
        domain=domain,
    ))

This is a pre-processor for signing objects with BLS signatures:

  1. calculate the hash tree root of the object;
  2. combine the hash tree root with the Domain inside a temporary SigningData object;
  3. return the hash tree root of that, which is the data to be signed.

The domain is usually the output of get_domain(), which mixes the cryptographic domain type, the fork version, and the genesis validators root into the data being signed. For deposits, it is the output of compute_domain(), ignoring the fork version and genesis validators root.

This is exactly equivalent to adding the domain to an object and taking the hash tree root of the whole thing. Indeed, this function used to be called compute_domain_wrapper_root().
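
As a sketch of how this fits into signing in practice (the state, block, and privkey objects here are assumed to exist already), a proposer's signing flow looks something like the following.

    # A rough sketch of block signing using the spec's helpers. `state`, `block`,
    # and `privkey` are assumed to exist; bls.Sign is the spec's BLS signing primitive.
    domain = get_domain(state, DOMAIN_BEACON_PROPOSER)
    signing_root = compute_signing_root(block, domain)
    signature = bls.Sign(privkey, signing_root)
    signed_block = SignedBeaconBlock(message=block, signature=signature)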

Used by Many places
Uses hash_tree_root()
See also SigningData, Domain

compute_timestamp_at_slot

Note: This function is unsafe with respect to overflows and underflows.

def compute_timestamp_at_slot(state: BeaconState, slot: Slot) -> uint64:
    slots_since_genesis = slot - GENESIS_SLOT
    return uint64(state.genesis_time + slots_since_genesis * SECONDS_PER_SLOT)

A simple utility for calculating the Unix timestamp at the start of the given slot. This is used when validating execution payloads.

This function was added in the Bellatrix pre-Merge upgrade.

Used by process_execution_payload()

Participation flags

These two simple utilities were added in the Altair upgrade.

add_flag

def add_flag(flags: ParticipationFlags, flag_index: int) -> ParticipationFlags:
    """
    Return a new ``ParticipationFlags`` adding ``flag_index`` to ``flags``.
    """
    flag = ParticipationFlags(2**flag_index)
    return flags | flag

This is simple and self-explanatory. The 2**flag_index is a bit Pythonic. In a C-like language it would use a bit-shift:

    1 << flag_index
Used by process_attestation()
See also ParticipationFlags

has_flag

def has_flag(flags: ParticipationFlags, flag_index: int) -> bool:
    """
    Return whether ``flags`` has ``flag_index`` set.
    """
    flag = ParticipationFlags(2**flag_index)
    return flags & flag == flag

Move along now, nothing to see here.
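
For illustration, a validator's participation record for an epoch might be built up and queried like this (recall that the flag indices are 0, 1, and 2 for source, target, and head respectively).

    # Illustrative: building up and testing a participation record.
    flags = ParticipationFlags(0)
    flags = add_flag(flags, TIMELY_SOURCE_FLAG_INDEX)   # flags == 0b001
    flags = add_flag(flags, TIMELY_TARGET_FLAG_INDEX)   # flags == 0b011
    assert has_flag(flags, TIMELY_SOURCE_FLAG_INDEX)
    assert not has_flag(flags, TIMELY_HEAD_FLAG_INDEX)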

Used by get_unslashed_participating_indices(), process_attestation()
See also ParticipationFlags

Beacon State Accessors

As the name suggests, these functions access the beacon state to calculate various useful things, without modifying it.

get_current_epoch

def get_current_epoch(state: BeaconState) -> Epoch:
    """
    Return the current epoch.
    """
    return compute_epoch_at_slot(state.slot)

A getter for the current epoch, as calculated by compute_epoch_at_slot().

Used by Everywhere
Uses compute_epoch_at_slot()

get_previous_epoch

def get_previous_epoch(state: BeaconState) -> Epoch:
    """`
    Return the previous epoch (unless the current epoch is ``GENESIS_EPOCH``).
    """
    current_epoch = get_current_epoch(state)
    return GENESIS_EPOCH if current_epoch == GENESIS_EPOCH else Epoch(current_epoch - 1)

Return the previous epoch number as an Epoch type. Returns GENESIS_EPOCH if we are in the GENESIS_EPOCH, since it has no prior, and we don't do negative numbers.

Used by Everywhere
Uses get_current_epoch()
See also GENESIS_EPOCH

get_block_root

def get_block_root(state: BeaconState, epoch: Epoch) -> Root:
    """
    Return the block root at the start of a recent ``epoch``.
    """
    return get_block_root_at_slot(state, compute_start_slot_at_epoch(epoch))

The Casper FFG part of consensus deals in Checkpoints that are the first slot of an epoch. get_block_root is a specialised version of get_block_root_at_slot() that returns the block root of the checkpoint, given only an epoch.

Used by get_attestation_participation_flag_indices(), weigh_justification_and_finalization()
Uses get_block_root_at_slot(), compute_start_slot_at_epoch()
See also Root

get_block_root_at_slot

def get_block_root_at_slot(state: BeaconState, slot: Slot) -> Root:
    """
    Return the block root at a recent ``slot``.
    """
    assert slot < state.slot <= slot + SLOTS_PER_HISTORICAL_ROOT
    return state.block_roots[slot % SLOTS_PER_HISTORICAL_ROOT]

Recent block roots are stored in a circular list in state, with a length of SLOTS_PER_HISTORICAL_ROOT (currently ~27 hours).

get_block_root_at_slot() is used by get_attestation_participation_flag_indices() to check whether an attestation has voted for the correct chain head. It is also used in process_sync_aggregate() to find the block that the sync committee is signing-off on.

Used by get_block_root(), get_attestation_participation_flag_indices(), process_sync_aggregate()
See also SLOTS_PER_HISTORICAL_ROOT, Root

get_randao_mix

def get_randao_mix(state: BeaconState, epoch: Epoch) -> Bytes32:
    """
    Return the randao mix at a recent ``epoch``.
    """
    return state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR]

RANDAO mixes are stored in a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. They are used when calculating the seed for assigning beacon proposers and committees.

The RANDAO mix for the current epoch is updated on a block-by-block basis as new RANDAO reveals come in. The mixes for previous epochs are the frozen RANDAO values at the end of the epoch.

Used by get_seed, process_randao_mixes_reset(), process_randao()
See also EPOCHS_PER_HISTORICAL_VECTOR

get_active_validator_indices

def get_active_validator_indices(state: BeaconState, epoch: Epoch) -> Sequence[ValidatorIndex]:
    """
    Return the sequence of active validator indices at ``epoch``.
    """
    return [ValidatorIndex(i) for i, v in enumerate(state.validators) if is_active_validator(v, epoch)]

Steps through the entire list of validators and returns the list of only the active ones. That is, the list of validators that have been activated but not exited as determined by is_active_validator().

This function is heavily used, and I'd expect it to be memoised in practice.

Used by Many places
Uses is_active_validator()

get_validator_churn_limit

def get_validator_churn_limit(state: BeaconState) -> uint64:
    """
    Return the validator churn limit for the current epoch.
    """
    active_validator_indices = get_active_validator_indices(state, get_current_epoch(state))
    return max(MIN_PER_EPOCH_CHURN_LIMIT, uint64(len(active_validator_indices)) // CHURN_LIMIT_QUOTIENT)

The "churn limit" applies when activating and exiting validators and acts as a rate-limit on changes to the validator set. The value returned by this function provides the number of validators that may become active in an epoch, and the number of validators that may exit in an epoch.

Some small amount of churn is always allowed, set by MIN_PER_EPOCH_CHURN_LIMIT, and the amount of per-epoch churn allowed increases by one for every extra CHURN_LIMIT_QUOTIENT validators that are currently active (once the minimum has been exceeded).

In concrete terms, with 500,000 validators, this means that up to seven validators can enter or exit the active validator set each epoch (1,575 per day). At 524,288 active validators the limit will rise to eight per epoch (1,800 per day).
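
The arithmetic can be reproduced with a small sketch (the churn_limit helper is purely illustrative) using the mainnet values MIN_PER_EPOCH_CHURN_LIMIT = 4 and CHURN_LIMIT_QUOTIENT = 65536.

    MIN_PER_EPOCH_CHURN_LIMIT = 4
    CHURN_LIMIT_QUOTIENT = 2**16  # 65,536

    def churn_limit(active_validators: int) -> int:
        return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validators // CHURN_LIMIT_QUOTIENT)

    print(churn_limit(500_000))  # 7 validators per epoch, about 1,575 per day
    print(churn_limit(524_288))  # 8 validators per epoch, about 1,800 per day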

Used by initiate_validator_exit(), process_registry_updates()
Uses get_active_validator_indices()
See also MIN_PER_EPOCH_CHURN_LIMIT, CHURN_LIMIT_QUOTIENT

get_seed

def get_seed(state: BeaconState, epoch: Epoch, domain_type: DomainType) -> Bytes32:
    """
    Return the seed at ``epoch``.
    """
    mix = get_randao_mix(state, Epoch(epoch + EPOCHS_PER_HISTORICAL_VECTOR - MIN_SEED_LOOKAHEAD - 1))  # Avoid underflow
    return hash(domain_type + uint_to_bytes(epoch) + mix)

Used in get_beacon_committee(), get_beacon_proposer_index(), and get_next_sync_committee_indices() to provide the randomness for computing proposers and committees. domain_type is DOMAIN_BEACON_ATTESTER, DOMAIN_BEACON_PROPOSER, and DOMAIN_SYNC_COMMITTEE respectively.

RANDAO mixes are stored in a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. The seed for an epoch is based on the randao mix from MIN_SEED_LOOKAHEAD epochs ago. This is to limit the forward visibility of randomness: see the explanation there.

The seed returned is not based only on the domain and the randao mix, but the epoch number is also mixed in. This is to handle the pathological case of no blocks being seen for more than two epochs, in which case we run out of randao updates. That could lock in forever a non-participating set of block proposers. Mixing in the epoch number means that fresh committees and proposers can continue to be selected.
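
The odd-looking addition of EPOCHS_PER_HISTORICAL_VECTOR when indexing the randao mixes is simply there to avoid an unsigned integer underflow at low epoch numbers, as the spec's comment notes. A quick sketch with the mainnet values (EPOCHS_PER_HISTORICAL_VECTOR = 65536, MIN_SEED_LOOKAHEAD = 1); the mix_index helper is just illustrative.

    EPOCHS_PER_HISTORICAL_VECTOR = 2**16
    MIN_SEED_LOOKAHEAD = 1

    def mix_index(epoch: int) -> int:
        # The index into state.randao_mixes used by get_seed()
        return (epoch + EPOCHS_PER_HISTORICAL_VECTOR - MIN_SEED_LOOKAHEAD - 1) % EPOCHS_PER_HISTORICAL_VECTOR

    print(mix_index(1000))  # 998: the index of the mix used for epoch 1000
    print(mix_index(0))     # 65534: wraps around rather than underflowing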

Used by get_beacon_committee(), get_beacon_proposer_index(), get_next_sync_committee_indices()
Uses get_randao_mix()
See also EPOCHS_PER_HISTORICAL_VECTOR, MIN_SEED_LOOKAHEAD

get_committee_count_per_slot

def get_committee_count_per_slot(state: BeaconState, epoch: Epoch) -> uint64:
    """
    Return the number of committees in each slot for the given ``epoch``.
    """
    return max(uint64(1), min(
        MAX_COMMITTEES_PER_SLOT,
        uint64(len(get_active_validator_indices(state, epoch))) // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))

Every slot in a given epoch has the same number of beacon committees, as calculated by this function.

As far as the LMD GHOST consensus protocol is concerned, all the validators attesting in a slot effectively act as a single large committee. However, organising them into multiple committees gives two benefits.

  1. Having multiple smaller committees reduces the load on the aggregators that collect and aggregate the attestations from committee members. This is important, as validating the signatures and aggregating them takes time. The downside is that blocks need to be larger, as, in the best case, there are up to 64 aggregate attestations to store per block rather than a single large aggregate signature over all attestations.
  2. It maps well onto the future plans for data shards, when each committee will be responsible for committing to a block on one shard in addition to its current duties.

Since the original Phase 1 sharding design that required these committees has now been abandoned, the second of these points no longer applies.

There is always at least one committee per slot, and never more than MAX_COMMITTEES_PER_SLOT, currently 64.

Subject to these constraints, the actual number of committees per slot is $N / 4096$, where $N$ is the total number of active validators.

The intended behaviour looks like this:

  • The ideal case is that there are MAX_COMMITTEES_PER_SLOT = 64 committees per slot. This maps to one committee per slot per shard once data sharding has been implemented. These committees will be responsible for voting on shard crosslinks. There must be at least 262,144 active validators to achieve this.
  • If there are fewer active validators, then the number of committees per shard is reduced below 64 in order to maintain a minimum committee size of TARGET_COMMITTEE_SIZE = 128. In this case, not every shard will get crosslinked at every slot (once sharding is in place).
  • Finally, only if the number of active validators falls below 4096 will the committee size be reduced to less than 128. With so few validators, the chain has no meaningful security in any case.
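
A small sketch, using the values quoted above (MAX_COMMITTEES_PER_SLOT = 64, SLOTS_PER_EPOCH = 32, TARGET_COMMITTEE_SIZE = 128), reproduces this behaviour; the committees_per_slot helper is illustrative only.

    MAX_COMMITTEES_PER_SLOT = 64
    SLOTS_PER_EPOCH = 32
    TARGET_COMMITTEE_SIZE = 128

    def committees_per_slot(active_validators: int) -> int:
        return max(1, min(MAX_COMMITTEES_PER_SLOT,
                          active_validators // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE))

    print(committees_per_slot(500_000))  # 64: the maximum number of committees
    print(committees_per_slot(100_000))  # 24: fewer committees, each around 128 strong
    print(committees_per_slot(3_000))    # 1: committee size falls below 128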
Used by get_beacon_committee(), process_attestation()
Uses get_active_validator_indices()
See also MAX_COMMITTEES_PER_SLOT, TARGET_COMMITTEE_SIZE

get_beacon_committee

def get_beacon_committee(state: BeaconState, slot: Slot, index: CommitteeIndex) -> Sequence[ValidatorIndex]:
    """
    Return the beacon committee at ``slot`` for ``index``.
    """
    epoch = compute_epoch_at_slot(slot)
    committees_per_slot = get_committee_count_per_slot(state, epoch)
    return compute_committee(
        indices=get_active_validator_indices(state, epoch),
        seed=get_seed(state, epoch, DOMAIN_BEACON_ATTESTER),
        index=(slot % SLOTS_PER_EPOCH) * committees_per_slot + index,
        count=committees_per_slot * SLOTS_PER_EPOCH,
    )

Beacon committees vote on the beacon block at each slot via attestations. There are up to MAX_COMMITTEES_PER_SLOT beacon committees per slot, and each committee is active exactly once per epoch.

This function returns the list of committee members given a slot number and an index within that slot to select the desired committee, relying on compute_committee() to do the heavy lifting.

Note that, since this uses get_seed(), we can obtain committees only up to EPOCHS_PER_HISTORICAL_VECTOR epochs into the past (minus MIN_SEED_LOOKAHEAD).

get_beacon_committee is used by get_attesting_indices() and process_attestation() when processing attestations coming from a committee, and by validators when checking their committee assignments and aggregation duties.

Used by get_attesting_indices(), process_attestation()
Uses get_committee_count_per_slot(), compute_committee(), get_active_validator_indices(), get_seed()
See also MAX_COMMITTEES_PER_SLOT, DOMAIN_BEACON_ATTESTER

get_beacon_proposer_index

def get_beacon_proposer_index(state: BeaconState) -> ValidatorIndex:
    """
    Return the beacon proposer index at the current slot.
    """
    epoch = get_current_epoch(state)
    seed = hash(get_seed(state, epoch, DOMAIN_BEACON_PROPOSER) + uint_to_bytes(state.slot))
    indices = get_active_validator_indices(state, epoch)
    return compute_proposer_index(state, indices, seed)

Each slot, exactly one of the active validators is randomly chosen to be the proposer of the beacon block for that slot. The probability of being selected is weighted by the validator's effective balance in compute_proposer_index().

The chosen block proposer does not need to be a member of one of the beacon committees for that slot: it is chosen from the entire set of active validators for that epoch.

The RANDAO seed returned by get_seed() is updated once per epoch. The slot number is mixed into the seed using a hash to allow us to choose a different proposer at each slot. This also protects us in the case that there is an entire epoch of empty blocks. If that were to happen the RANDAO would not be updated, but we would still be able to select a different set of proposers for the next epoch via this slot number mix-in process.

There is a chance of the same proposer being selected in two consecutive slots, or more than once per epoch. If every validator has the same effective balance, then the probability of being selected in a particular slot is simply $\frac{1}{N}$, independent of any other slot, where $N$ is the number of active validators in the epoch corresponding to the slot.

Currently, neither get_beacon_proposer_index() nor compute_proposer_index() filters out slashed validators. This could result in a slashed validator, prior to its exit, being selected to propose a block. Its block would, however, be invalid due to the check in process_block_header(). A fix for this has been proposed so as to avoid many missed slots (slots with invalid blocks) in the event of a mass slashing.

Used by slash_validator(), process_block_header(), process_randao(), process_attestation(), process_sync_aggregate()
Uses get_seed(), uint_to_bytes(), get_active_validator_indices(), compute_proposer_index()

get_total_balance

def get_total_balance(state: BeaconState, indices: Set[ValidatorIndex]) -> Gwei:
    """
    Return the combined effective balance of the ``indices``.
    ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    Math safe up to ~10B ETH, after which this overflows uint64.
    """
    return Gwei(max(EFFECTIVE_BALANCE_INCREMENT, sum([state.validators[index].effective_balance for index in indices])))

A simple utility that returns the total balance of all validators in the list, indices, passed in.

As an aside, there is an interesting example of some fragility in the spec lurking here. This function used to return a minimum of 1 Gwei to avoid a potential division by zero in the calculation of rewards and penalties. However, the rewards calculation was modified to avoid a possible integer overflow condition, without modifying this function, which re-introduced the possibility of a division by zero. This was later fixed by returning a minimum of EFFECTIVE_BALANCE_INCREMENT. The formal verification of the specification is helpful in avoiding issues like this.

Used by get_total_active_balance(), get_flag_index_deltas(), process_justification_and_finalization()
See also EFFECTIVE_BALANCE_INCREMENT

get_total_active_balance

def get_total_active_balance(state: BeaconState) -> Gwei:
    """
    Return the combined effective balance of the active validators.
    Note: ``get_total_balance`` returns ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    """
    return get_total_balance(state, set(get_active_validator_indices(state, get_current_epoch(state))))

Uses get_total_balance() to calculate the sum of the effective balances of all active validators in the current epoch.

This quantity is frequently used in the spec. For example, Casper FFG uses the total active balance to judge whether the 2/3 majority threshold of attestations has been reached in justification and finalisation. And it is a fundamental part of the calculation of rewards and penalties. The base reward is proportional to the reciprocal of the square root of the total active balance. Thus, validator rewards are higher when little balance is at stake (few active validators) and lower when much balance is at stake (many active validators).

Since it is calculated from effective balances, total active balance does not change during an epoch, so is a great candidate for being cached.
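
As a sketch of the effect, here is the related get_base_reward_per_increment() calculation (which appears later in the spec), using the mainnet value BASE_REWARD_FACTOR = 64 and illustrative totals.

    from math import isqrt

    EFFECTIVE_BALANCE_INCREMENT = 10**9   # 1 ETH, in Gwei
    BASE_REWARD_FACTOR = 64

    def base_reward_per_increment(total_active_balance: int) -> int:
        return EFFECTIVE_BALANCE_INCREMENT * BASE_REWARD_FACTOR // isqrt(total_active_balance)

    print(base_reward_per_increment(10_000_000 * 10**9))  # 640 Gwei with 10M ETH staked
    print(base_reward_per_increment(20_000_000 * 10**9))  # 452 Gwei with 20M ETH staked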

Used by get_flag_index_deltas(), process_justification_and_finalization(), get_base_reward_per_increment(), process_slashings(), process_sync_aggregate()
Uses get_total_balance(), get_active_validator_indices()

get_domain

def get_domain(state: BeaconState, domain_type: DomainType, epoch: Epoch=None) -> Domain:
    """
    Return the signature domain (fork version concatenated with domain type) of a message.
    """
    epoch = get_current_epoch(state) if epoch is None else epoch
    fork_version = state.fork.previous_version if epoch < state.fork.epoch else state.fork.current_version
    return compute_domain(domain_type, fork_version, state.genesis_validators_root)

get_domain() pops up whenever signatures need to be verified, since a DomainType is always mixed in to the signed data. For the science behind domains, see Domain types and compute_domain().

Except for DOMAIN_DEPOSIT, domains are always combined with the fork version before being used in signature generation. This is to distinguish messages from different chains, and ensure that validators don't get slashed if they choose to participate on two independent forks. (That is, deliberate forks, aka hard-forks. Participating on both branches of temporary consensus forks is punishable: that's basically the whole point of slashing.)

Note that a message signed under one fork version will be valid during the next fork version, but not thereafter. So, for example, voluntary exit messages signed during Altair will be valid after the Bellatrix beacon chain upgrade, but not after the Capella upgrade. Voluntary exit messages signed during Phase 0 are valid under Altair but were made invalid by the Bellatrix upgrade11.

Used by is_valid_indexed_attestation(), verify_block_signature(), process_randao(), process_proposer_slashing(), process_voluntary_exit(), process_sync_aggregate()
Uses compute_domain()
See also DomainType, Domain types

get_indexed_attestation

def get_indexed_attestation(state: BeaconState, attestation: Attestation) -> IndexedAttestation:
    """
    Return the indexed attestation corresponding to ``attestation``.
    """
    attesting_indices = get_attesting_indices(state, attestation.data, attestation.aggregation_bits)

    return IndexedAttestation(
        attesting_indices=sorted(attesting_indices),
        data=attestation.data,
        signature=attestation.signature,
    )

Lists of validators within committees occur in two forms in the specification.

  • They can be compressed into a bitlist, in which each bit represents the presence or absence of a validator from a particular committee. The committee is referenced by slot, and committee index within that slot. This is how sets of validators are represented in Attestations.
  • Or they can be listed explicitly by their validator indices, as in IndexedAttestations. Note that the list of indices is sorted: an attestation is invalid if not.

get_indexed_attestation() converts from the former representation to the latter. The slot number and the committee index are provided by the AttestationData and are used to reconstruct the committee members via get_beacon_committee(). The supplied bitlist will have come from an Attestation.

Attestations are aggregatable, which means that attestations from multiple validators making the same vote can be rolled up into a single attestation through the magic of BLS signature aggregation. However, in order to be able to verify the signature later, a record needs to be kept of which validators actually contributed to the attestation. This is so that those validators' public keys can be aggregated to match the construction of the signature.

The conversion from the bit-list format to the list format is performed by get_attesting_indices(), below.

Used by process_attestation()
Uses get_attesting_indices()
See also Attestation, IndexedAttestation

get_attesting_indices

def get_attesting_indices(state: BeaconState,
                          data: AttestationData,
                          bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]) -> Set[ValidatorIndex]:
    """
    Return the set of attesting indices corresponding to ``data`` and ``bits``.
    """
    committee = get_beacon_committee(state, data.slot, data.index)
    return set(index for i, index in enumerate(committee) if bits[i])

As described under get_indexed_attestation(), lists of validators come in two forms. This routine converts from the compressed form, in which validators are represented as a subset of a committee with their presence or absence indicated by a 1 bit or a 0 bit respectively, to an explicit list of ValidatorIndex types.
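
A toy example, with an entirely made-up committee and bitlist:

    # Illustrative only: a hypothetical four-member committee and its aggregation bits.
    committee = [ValidatorIndex(5), ValidatorIndex(11), ValidatorIndex(19), ValidatorIndex(42)]
    bits = [True, False, True, True]

    attesting = set(index for i, index in enumerate(committee) if bits[i])
    # attesting == {5, 19, 42}: validator 11 did not contribute to the aggregate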

Used by get_indexed_attestation(), process_attestation()
Uses get_beacon_committee()
See also AttestationData, IndexedAttestation

get_next_sync_committee_indices

def get_next_sync_committee_indices(state: BeaconState) -> Sequence[ValidatorIndex]:
    """
    Return the sync committee indices, with possible duplicates, for the next sync committee.
    """
    epoch = Epoch(get_current_epoch(state) + 1)

    MAX_RANDOM_BYTE = 2**8 - 1
    active_validator_indices = get_active_validator_indices(state, epoch)
    active_validator_count = uint64(len(active_validator_indices))
    seed = get_seed(state, epoch, DOMAIN_SYNC_COMMITTEE)
    i = 0
    sync_committee_indices: List[ValidatorIndex] = []
    while len(sync_committee_indices) < SYNC_COMMITTEE_SIZE:
        shuffled_index = compute_shuffled_index(uint64(i % active_validator_count), active_validator_count, seed)
        candidate_index = active_validator_indices[shuffled_index]
        random_byte = hash(seed + uint_to_bytes(uint64(i // 32)))[i % 32]
        effective_balance = state.validators[candidate_index].effective_balance
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte:
            sync_committee_indices.append(candidate_index)
        i += 1
    return sync_committee_indices

get_next_sync_committee_indices() is used to select the subset of validators that will make up a sync committee. The committee size is SYNC_COMMITTEE_SIZE, and the committee is allowed to contain duplicates, that is, the same validator more than once. This is to handle gracefully the situation of there being fewer active validators than SYNC_COMMITTEE_SIZE.

Similarly to being chosen to propose a block, the probability of any validator being selected for a sync committee is proportional to its effective balance. Thus, the algorithm is almost the same as that of compute_proposer_index(), except that this one exits only after finding SYNC_COMMITTEE_SIZE members, rather than exiting as soon as a candidate is found. Both routines use the try-and-increment method to weight the probability of selection with the validators' effective balances.

It's fairly clear why block proposers are selected with a probability proportional to their effective balances: block production is subject to slashing, and proposers with less at stake have less to slash, so we reduce their influence accordingly. It is not so clear why the probability of being in a sync committee is also proportional to a validator's effective balance; sync committees are not subject to slashing. It has to do with keeping calculations for light clients simple. We don't want to burden light clients with summing up validators' balances to judge whether a 2/3 supermajority of stake in the committee has voted for a block. Ideally, they can just count the participation flags. To make this somewhat reliable, we weight the probability that a validator participates in proportion to its effective balance.
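
To illustrate the effect of the acceptance test in the loop above: a candidate with effective balance B is accepted with probability close to B / MAX_EFFECTIVE_BALANCE. A quick sketch (the acceptance_probability helper is illustrative only):

    MAX_EFFECTIVE_BALANCE = 32 * 10**9   # Gwei
    MAX_RANDOM_BYTE = 2**8 - 1

    def acceptance_probability(effective_balance: int) -> float:
        # Count the random byte values (0-255) for which the candidate is accepted
        accepted = sum(1 for random_byte in range(256)
                       if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte)
        return accepted / 256

    print(acceptance_probability(32 * 10**9))  # 1.0
    print(acceptance_probability(16 * 10**9))  # 0.5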

Used by get_next_sync_committee()
Uses get_active_validator_indices(), get_seed(), compute_shuffled_index(), uint_to_bytes()
See also SYNC_COMMITTEE_SIZE, compute_proposer_index()

get_next_sync_committee

Note: The function get_next_sync_committee should only be called at sync committee period boundaries and when upgrading state to Altair.

The random seed that generates the sync committee is based on the number of the next epoch. get_next_sync_committee_indices() doesn't contain any check that the epoch corresponds to a sync-committee change boundary, which allowed the timing of the Altair upgrade to be more flexible. But a consequence is that you will get an incorrect committee if you call get_next_sync_committee() at the wrong time.

def get_next_sync_committee(state: BeaconState) -> SyncCommittee:
    """
    Return the next sync committee, with possible pubkey duplicates.
    """
    indices = get_next_sync_committee_indices(state)
    pubkeys = [state.validators[index].pubkey for index in indices]
    aggregate_pubkey = eth_aggregate_pubkeys(pubkeys)
    return SyncCommittee(pubkeys=pubkeys, aggregate_pubkey=aggregate_pubkey)

get_next_sync_committee() is a simple wrapper around get_next_sync_committee_indices() that packages everything up into a nice SyncCommittee object.

See the SyncCommittee type for an explanation of how the aggregate_pubkey is intended to be used.

Used by process_sync_committee_updates(), initialize_beacon_state_from_eth1()
Uses get_next_sync_committee_indices(), eth_aggregate_pubkeys()
See also SyncCommittee

get_unslashed_participating_indices

def get_unslashed_participating_indices(state: BeaconState, flag_index: int, epoch: Epoch) -> Set[ValidatorIndex]:
    """
    Return the set of validator indices that are both active and unslashed for the given ``flag_index`` and ``epoch``.
    """
    assert epoch in (get_previous_epoch(state), get_current_epoch(state))
    if epoch == get_current_epoch(state):
        epoch_participation = state.current_epoch_participation
    else:
        epoch_participation = state.previous_epoch_participation
    active_validator_indices = get_active_validator_indices(state, epoch)
    participating_indices = [i for i in active_validator_indices if has_flag(epoch_participation[i], flag_index)]
    return set(filter(lambda index: not state.validators[index].slashed, participating_indices))

get_unslashed_participating_indices() returns the list of validators that made a timely attestation with the type flag_index during the epoch in question.

It is used with the TIMELY_TARGET_FLAG_INDEX flag in process_justification_and_finalization() to calculate the proportion of stake that voted for the candidate checkpoint in the current and previous epochs.

It is also used with the TIMELY_TARGET_FLAG_INDEX for applying inactivity penalties in process_inactivity_updates() and get_inactivity_penalty_deltas(). If a validator misses a correct target vote during an inactivity leak then it is considered not to have participated at all (it is not contributing anything useful).

And it is used in get_flag_index_deltas() for calculating rewards due for each type of correct vote.

Slashed validators are ignored. Once slashed, validators no longer receive rewards or participate in consensus, although they are subject to penalties until they have finally been exited.

Used by get_flag_index_deltas(), process_justification_and_finalization(), process_inactivity_updates(), get_inactivity_penalty_deltas()
Uses get_active_validator_indices(), has_flag()
See also Participation flag indices

get_attestation_participation_flag_indices

def get_attestation_participation_flag_indices(state: BeaconState,
                                               data: AttestationData,
                                               inclusion_delay: uint64) -> Sequence[int]:
    """
    Return the flag indices that are satisfied by an attestation.
    """
    if data.target.epoch == get_current_epoch(state):
        justified_checkpoint = state.current_justified_checkpoint
    else:
        justified_checkpoint = state.previous_justified_checkpoint

    # Matching roots
    is_matching_source = data.source == justified_checkpoint
    is_matching_target = is_matching_source and data.target.root == get_block_root(state, data.target.epoch)
    is_matching_head = is_matching_target and data.beacon_block_root == get_block_root_at_slot(state, data.slot)
    assert is_matching_source

    participation_flag_indices = []
    if is_matching_source and inclusion_delay <= integer_squareroot(SLOTS_PER_EPOCH):
        participation_flag_indices.append(TIMELY_SOURCE_FLAG_INDEX)
    if is_matching_target and inclusion_delay <= SLOTS_PER_EPOCH:
        participation_flag_indices.append(TIMELY_TARGET_FLAG_INDEX)
    if is_matching_head and inclusion_delay == MIN_ATTESTATION_INCLUSION_DELAY:
        participation_flag_indices.append(TIMELY_HEAD_FLAG_INDEX)

    return participation_flag_indices

This is called by process_attestation() during block processing, and is the heart of the mechanism for recording validators' votes as contained in their attestations. It filters the given attestation against the beacon state's current view of the chain, and returns participation flag indices only for the votes that are both correct and timely.

data is an AttestationData object that contains the source, target, and head votes of the validators that contributed to the attestation. The attestation may represent the votes of one or more validators.

inclusion_delay is the difference between the current slot on the beacon chain and the slot for which the attestation was created. For the block containing the attestation to be valid, inclusion_delay must be between MIN_ATTESTATION_INCLUSION_DELAY and SLOTS_PER_EPOCH inclusive. In other words, attestations must be included in the next block, or in any block up to 32 slots later, after which they are ignored.

Since the attestation may be up to 32 slots old, it might have been generated in the current epoch or the previous epoch, so the first thing we do is to check the attestation's target vote epoch to see which epoch we should be looking at in the beacon state.

Next, we check whether each of the votes in the attestation are correct:

  • Does the attestation's source vote match what we believe to be the justified checkpoint in the epoch in question?
  • If so, does the attestation's target vote match the head block at the epoch's checkpoint, that is, the first slot of the epoch?
  • If so, does the attestation's head vote match what we believe to be the head block at the attestation's slot? Note that the slot may not contain a block – it may be a skip slot – in which case the last known block is considered to be the head.

These three build on each other, so that it is not possible to have a correct target vote without a correct source vote, and it is not possible to have a correct head vote without a correct target vote.

The assert statement is interesting. If an attestation does not have the correct source vote, the block containing it is invalid and is discarded. Having an incorrect source vote means that the block proposer disagrees with me about the last justified checkpoint, which is an irreconcilable difference.

After checking the validity of the votes, the timeliness of each vote is checked. Let's take them in reverse order.

  • Correct head votes must be included immediately, that is, in the very next slot.
    • Head votes, used for LMD GHOST consensus, are not useful after one slot.
  • Correct target votes must be included within 32 slots, one epoch.
    • Target votes are useful at any time, but it is simpler if they don't span more than a couple of epochs, so 32 slots is a reasonable limit. This check is actually redundant since attestations in blocks cannot be older than 32 slots.
  • Correct source votes must be included within 5 slots (integer_squareroot(32)).
    • Five slots is roughly the geometric mean of 1 (the timely head threshold) and 32 (the timely target threshold), which is a fairly arbitrary choice. Vitalik's view12 is that, with this setting, the cumulative timeliness rewards most closely match an exponentially decreasing curve, which "feels more logical".

The timely inclusion requirements are new in Altair. In Phase 0, all correct votes received a reward, and there was an additional reward for inclusion that was proportional to the reciprocal of the inclusion distance. This led to an oddity where it was always more profitable to vote for a correct head, even if that meant waiting longer and risking not being included in the next slot.
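
The three timeliness windows can be summarised in a small sketch (mainnet values; the timely_flags helper and its labels are purely illustrative, and all votes are assumed correct).

    from math import isqrt

    SLOTS_PER_EPOCH = 32
    MIN_ATTESTATION_INCLUSION_DELAY = 1

    def timely_flags(inclusion_delay: int) -> list:
        # Assumes the source, target, and head votes are all correct
        flags = []
        if inclusion_delay <= isqrt(SLOTS_PER_EPOCH):
            flags.append("timely source")
        if inclusion_delay <= SLOTS_PER_EPOCH:
            flags.append("timely target")
        if inclusion_delay == MIN_ATTESTATION_INCLUSION_DELAY:
            flags.append("timely head")
        return flags

    print(timely_flags(1))  # ['timely source', 'timely target', 'timely head']
    print(timely_flags(6))  # ['timely target']: the source window is only 5 slots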

Used by process_attestation()
Uses get_block_root(), get_block_root_at_slot(), integer_squareroot()
See also Participation flag indices, AttestationData, MIN_ATTESTATION_INCLUSION_DELAY

get_flag_index_deltas

def get_flag_index_deltas(state: BeaconState, flag_index: int) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return the deltas for a given ``flag_index`` by scanning through the participation flags.
    """
    rewards = [Gwei(0)] * len(state.validators)
    penalties = [Gwei(0)] * len(state.validators)
    previous_epoch = get_previous_epoch(state)
    unslashed_participating_indices = get_unslashed_participating_indices(state, flag_index, previous_epoch)
    weight = PARTICIPATION_FLAG_WEIGHTS[flag_index]
    unslashed_participating_balance = get_total_balance(state, unslashed_participating_indices)
    unslashed_participating_increments = unslashed_participating_balance // EFFECTIVE_BALANCE_INCREMENT
    active_increments = get_total_active_balance(state) // EFFECTIVE_BALANCE_INCREMENT
    for index in get_eligible_validator_indices(state):
        base_reward = get_base_reward(state, index)
        if index in unslashed_participating_indices:
            if not is_in_inactivity_leak(state):
                reward_numerator = base_reward * weight * unslashed_participating_increments
                rewards[index] += Gwei(reward_numerator // (active_increments * WEIGHT_DENOMINATOR))
        elif flag_index != TIMELY_HEAD_FLAG_INDEX:
            penalties[index] += Gwei(base_reward * weight // WEIGHT_DENOMINATOR)
    return rewards, penalties

This function is used during epoch processing to assign rewards and penalties to individual validators based on their voting record in the previous epoch. Rewards for block proposers for including attestations are calculated during block processing. The "deltas" in the function name are the separate lists of rewards and penalties returned. Rewards and penalties are always treated separately to avoid negative numbers.

The function is called once for each of the flag types corresponding to correct attestation votes: timely source, timely target, timely head.

The list of validators returned by get_unslashed_participating_indices() contains the ones that will be rewarded for making this vote type in a timely and correct manner. That routine uses the flags set in state for each validator by process_attestation() during block processing and returns the validators for which the corresponding flag is set.

Every active validator is expected to make an attestation exactly once per epoch, so we then cycle through the entire set of active validators, rewarding them if they appear in unslashed_participating_indices, as long as we are not in an inactivity leak. If we are in a leak, no validator is rewarded for any of its votes, but penalties still apply to non-participating validators.

Notice that the reward is weighted with unslashed_participating_increments, which is proportional to the total stake of the validators that made a correct vote with this flag. This means that, if participation by other validators is lower, then my rewards are lower, even if I perform my duties perfectly. The reason for this is to do with discouragement attacks (see also this nice explainer13). In short, with this mechanism, validators are incentivised to help each other out (e.g. by forwarding gossip messages, or aggregating attestations well) rather than to attack or censor one-another.

Validators that did not make a correct and timely vote are penalised with a full weighted base reward for each flag that they missed, except for missing the head vote. Head votes have only a single slot to get included, so a missing block in the next slot is sufficient to cause a miss, but is completely outside the attester's control. Thus, head votes are only rewarded, not penalised. This also allows perfectly performing validators to break even during an inactivity leak, when we expect at least a third of blocks to be missing: they receive no rewards, but ideally no penalties either.

Untangling the arithmetic, the maximum total issuance due to rewards for attesters in an epoch, $I_A$, comes out as follows, in the notation described later.

$$I_A = \frac{W_s + W_t + W_h}{W_{\Sigma}} N B$$
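
As a hedged sketch of the per-validator, per-flag reward, using the mainnet weights (14, 26, and 14 out of WEIGHT_DENOMINATOR = 64 for source, target, and head respectively) and made-up balances; the flag_reward helper is illustrative only.

    WEIGHT_DENOMINATOR = 64
    TIMELY_SOURCE_WEIGHT, TIMELY_TARGET_WEIGHT, TIMELY_HEAD_WEIGHT = 14, 26, 14

    def flag_reward(base_reward: int, weight: int,
                    participating_increments: int, active_increments: int) -> int:
        return base_reward * weight * participating_increments // (active_increments * WEIGHT_DENOMINATOR)

    # With 100% participation, the three flags together pay 54/64 of the base reward.
    base_reward = 100_000  # Gwei, illustrative
    total = sum(flag_reward(base_reward, w, 1_000, 1_000)
                for w in (TIMELY_SOURCE_WEIGHT, TIMELY_TARGET_WEIGHT, TIMELY_HEAD_WEIGHT))
    print(total)  # 84375 == base_reward * 54 // 64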
Used by process_rewards_and_penalties()
Uses get_unslashed_participating_indices(), get_total_balance(), get_total_active_balance(), get_eligible_validator_indices(), get_base_reward(), is_in_inactivity_leak()
See also process_attestation(), participation flag indices, rewards and penalties

Beacon State Mutators

increase_balance

def increase_balance(state: BeaconState, index: ValidatorIndex, delta: Gwei) -> None:
    """
    Increase the validator balance at index ``index`` by ``delta``.
    """
    state.balances[index] += delta

After creating a validator with its deposit balance, this and decrease_balance() are the only places in the spec where validator balances are ever modified.

We need two separate functions to change validator balances, one to increase them and one to decrease them, since we are using only unsigned integers.

Fun fact: A typo around this led to Teku's one and only consensus failure at the initial client interop event. Unsigned integers induce bugs!

Used by slash_validator(), process_rewards_and_penalties(), process_attestation(), process_deposit(), process_sync_aggregate()
See also decrease_balance()

decrease_balance

def decrease_balance(state: BeaconState, index: ValidatorIndex, delta: Gwei) -> None:
    """
    Decrease the validator balance at index ``index`` by ``delta``, with underflow protection.
    """
    state.balances[index] = 0 if delta > state.balances[index] else state.balances[index] - delta

The counterpart to increase_balance(). This has a little extra work to do to check for unsigned int underflow since balances may not go negative.

Used by slash_validator(), process_rewards_and_penalties(), process_slashings(), process_sync_aggregate()
See also increase_balance()

initiate_validator_exit

def initiate_validator_exit(state: BeaconState, index: ValidatorIndex) -> None:
    """
    Initiate the exit of the validator with index ``index``.
    """
    # Return if validator already initiated exit
    validator = state.validators[index]
    if validator.exit_epoch != FAR_FUTURE_EPOCH:
        return

    # Compute exit queue epoch
    exit_epochs = [v.exit_epoch for v in state.validators if v.exit_epoch != FAR_FUTURE_EPOCH]
    exit_queue_epoch = max(exit_epochs + [compute_activation_exit_epoch(get_current_epoch(state))])
    exit_queue_churn = len([v for v in state.validators if v.exit_epoch == exit_queue_epoch])
    if exit_queue_churn >= get_validator_churn_limit(state):
        exit_queue_epoch += Epoch(1)

    # Set validator exit epoch and withdrawable epoch
    validator.exit_epoch = exit_queue_epoch
    validator.withdrawable_epoch = Epoch(validator.exit_epoch + MIN_VALIDATOR_WITHDRAWABILITY_DELAY)

Exits may be initiated voluntarily, as a result of being slashed, or by dropping to the EJECTION_BALANCE threshold.

In all cases, a dynamic "churn limit" caps the number of validators that may exit per epoch. This is calculated by get_validator_churn_limit(). The mechanism for enforcing this is the exit queue: the validator's exit_epoch is set such that it is at the end of the queue.

The exit queue is not maintained as a separate data structure, but is continually re-calculated from the exit epochs of all validators, allowing a limited number to exit per epoch. I expect there are some optimisations to be had around this in actual implementations.

An exiting validator is expected to continue with its proposing and attesting duties until its exit_epoch has passed, and will continue to receive rewards and penalties accordingly.

In addition, an exited validator remains eligible to be slashed until its withdrawable_epoch, which is set to MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after its exit_epoch. This is to allow some extra time for any slashable offences by the validator to be detected and reported.

Used by slash_validator(), process_registry_updates(), process_voluntary_exit()
Uses compute_activation_exit_epoch(), get_validator_churn_limit()
See also Voluntary Exits, MIN_VALIDATOR_WITHDRAWABILITY_DELAY

slash_validator

def slash_validator(state: BeaconState,
                    slashed_index: ValidatorIndex,
                    whistleblower_index: ValidatorIndex=None) -> None:
    """
    Slash the validator with index ``slashed_index``.
    """
    epoch = get_current_epoch(state)
    initiate_validator_exit(state, slashed_index)
    validator = state.validators[slashed_index]
    validator.slashed = True
    validator.withdrawable_epoch = max(validator.withdrawable_epoch, Epoch(epoch + EPOCHS_PER_SLASHINGS_VECTOR))
    state.slashings[epoch % EPOCHS_PER_SLASHINGS_VECTOR] += validator.effective_balance
    slashing_penalty = validator.effective_balance // MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX
    decrease_balance(state, slashed_index, slashing_penalty)

    # Apply proposer and whistleblower rewards
    proposer_index = get_beacon_proposer_index(state)
    if whistleblower_index is None:
        whistleblower_index = proposer_index
    whistleblower_reward = Gwei(validator.effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT)
    proposer_reward = Gwei(whistleblower_reward * PROPOSER_WEIGHT // WEIGHT_DENOMINATOR)
    increase_balance(state, proposer_index, proposer_reward)
    increase_balance(state, whistleblower_index, Gwei(whistleblower_reward - proposer_reward))

Both proposer slashings and attester slashings end up here when a report of a slashable offence has been verified during block processing.

When a validator is slashed, several things happen immediately:

  • The validator is processed for exit via initiate_validator_exit(), so it joins the exit queue.
  • The validator is marked as slashed. This information is used when calculating rewards and penalties: while being exited, whatever it does, a slashed validator receives penalties as if it had failed to propose or attest, including the inactivity leak if applicable.
  • Normally, as part of the exit process, the withdrawable_epoch for a validator (the point at which a validator's stake is in principle unlocked) is set to MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after it exits. When a validator is slashed, a much longer period of lock-up applies, namely EPOCHS_PER_SLASHINGS_VECTOR. This is to allow a further, potentially much greater, slashing penalty to be applied later once the chain knows how many validators have been slashed together around the same time. The postponement of the withdrawable epoch is twice as long as required to apply the extra penalty, which is applied half-way through the period. This simply means that slashed validators continue to accrue attestation penalties for some 18 days longer than necessary. Treating slashed validators fairly is not a big priority for the protocol.
  • The effective balance of the validator is added to the accumulated effective balances of validators slashed this epoch, and stored in the circular list, state.slashings. This will later be used by the slashing penalty calculation mentioned in the previous point.
  • An initial "slap on the wrist" slashing penalty of the validator's effective balance (in Gwei) divided by the MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX is applied. For a validator with a full Effective Balance of 32 ETH, this initial penalty is 1 ETH.
  • The block proposer that included the slashing proof receives a reward.

In short, a slashed validator receives an initial minor penalty, can expect to receive a further penalty later, and is marked for exit.

Note that the whistleblower_index defaults to None in the parameter list. This is never used in Phase 0, with the result that the proposer that included the slashing gets the entire whistleblower reward; there is no separate whistleblower reward for the finder of proposer or attester slashings. One reason is simply that reports are too easy to steal: if I report a slashable event to a block proposer, there is nothing to prevent that proposer claiming the report as its own. We could introduce some fancy ZK protocol to make this trustless, but this is what we're going with for now. Later developments, such as the proof-of-custody game, may reward whistleblowers directly.
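
Putting illustrative numbers on this, with the mainnet values MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX = 32, WHISTLEBLOWER_REWARD_QUOTIENT = 512, PROPOSER_WEIGHT = 8, and WEIGHT_DENOMINATOR = 64, and a validator with a full 32 ETH effective balance:

    MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX = 32
    WHISTLEBLOWER_REWARD_QUOTIENT = 512
    PROPOSER_WEIGHT, WEIGHT_DENOMINATOR = 8, 64

    effective_balance = 32 * 10**9  # Gwei

    initial_penalty = effective_balance // MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX
    whistleblower_reward = effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT
    proposer_reward = whistleblower_reward * PROPOSER_WEIGHT // WEIGHT_DENOMINATOR

    print(initial_penalty)       # 1_000_000_000 Gwei: the 1 ETH initial penalty
    print(whistleblower_reward)  # 62_500_000 Gwei (0.0625 ETH), all paid to the proposer
    print(proposer_reward)       # 7_812_500 Gwei of that is nominally the "proposer" part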

Used by process_proposer_slashing(), process_attester_slashing()
Uses initiate_validator_exit(), get_beacon_proposer_index(), decrease_balance(), increase_balance()
See also EPOCHS_PER_SLASHINGS_VECTOR, MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX, process_slashings()

Beacon Chain State Transition Function

Preamble

State transitions

The state transition function is at the heart of what blockchains do. Each node on the network maintains a state that corresponds to its view of the state of the world.

Classically, the node's state is updated by applying blocks, in order, with a "state transition function". The state transition function is "pure" in that its output depends only on the input, and it has no side effects. This makes it deterministic: if every node starts with the same state (the Genesis state), and applies the same sequence of blocks, then all nodes must end up with the same resulting state. If for some reason they don't, then we have a consensus failure.

If $S$ is a beacon state, and $B$ a beacon block, then the state transition function $f$ can be written

$$S' \equiv f(S, B)$$

In this equation we call $S$ the pre-state (the state before applying the block $B$), and $S'$ the post-state. The function $f$ is then iterated as we receive new blocks to constantly update the state.

That's the essence of blockchain progress in its purest form, as it existed under proof of work; under proof of work, the state transition function is driven exclusively by processing blocks.

The beacon chain, however, is not block-driven. Rather, it is slot-driven. Updates to the state depend on the progress of slots, whether or not the slot has a block associated with it.

Thus, the beacon chain's state transition function comprises three elements.

  1. A per-slot transition function, $S' \equiv f_s(S)$. (The state contains the slot number, so we do not need to supply it.)
  2. A per-block transition function, $S' \equiv f_b(S, B)$.
  3. A per-epoch transition function, $S' \equiv f_e(S)$.

Each of these state transition functions needs to be run at the appropriate point when updating the chain, and it is the role of this part of the beacon chain specification to define all of this precisely.

Validity conditions

The post-state corresponding to a pre-state state and a signed block signed_block is defined as state_transition(state, signed_block). State transitions that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid. State transitions that cause a uint64 overflow or underflow are also considered invalid.

This is a very important statement of how the spec deals with invalid conditions and errors. Basically, if any block is processed that would trigger any kind of exception in the Python code of the specification, then that block is invalid and must be rejected. That means having to undo any state modifications already made in the course of processing the block.

People who do formal verification of the specification don't much like this, as having assert statements in running code is an anti-pattern: it is better to ensure that your code can simply never fail.

Specification

Anyway, as discussed above, the beacon chain state transition has three elements:

  1. slot processing, which is performed for every slot regardless of what else is happening;
  2. epoch processing, which happens every SLOTS_PER_EPOCH (32) slots, again regardless of whatever else is going on; and,
  3. block processing, which happens only in slots for which a beacon block has been received.

def state_transition(state: BeaconState, signed_block: SignedBeaconBlock, validate_result: bool=True) -> None:
    block = signed_block.message
    # Process slots (including those with no blocks) since block
    process_slots(state, block.slot)
    # Verify signature
    if validate_result:
        assert verify_block_signature(state, signed_block)
    # Process block
    process_block(state, block)
    # Verify state root
    if validate_result:
        assert block.state_root == hash_tree_root(state)

Although the beacon chain's state transition is conceptually slot-driven, as the spec is written a state transition is triggered by receiving a block to process. That means that we first need to fast-forward from our current slot number in the state (which is the slot at which we last processed a block) to the slot of the block we are processing. We treat intervening slots, if any, as empty. This "fast-forward" is done by process_slots(), which also triggers epoch processing as required.

In actual client implementations, state updates will usually be time-based, triggered by moving to the next slot if a block has not been received. However, the fast-forward functionality will be used when exploring different forks in the block tree.

The validate_result parameter defaults to True, meaning that the block's signature will be checked, and that applying the block to the state must yield the same state root that the block claims (the "post-states" must match). When creating blocks, however, proposers can set validate_result to False in order to allow the state root to be calculated, else we'd have a circular dependency. The signature over the initial candidate block is omitted so as to avoid bad interactions with slashing protection when signing twice in a slot.
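
For example, a sketch of how a proposer might fill in its candidate block's state root, along the lines of the approach in the honest validator guide. Here, state is assumed to be a copy of the pre-state and block the unsigned candidate block.

    # Illustrative: fill in the candidate block's post-state root before signing.
    # `state` is a copy of the pre-state; `block` is the unsigned candidate block.
    state_transition(state, SignedBeaconBlock(message=block), validate_result=False)
    block.state_root = hash_tree_root(state)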

Uses process_slots(), verify_block_signature, process_block

def verify_block_signature(state: BeaconState, signed_block: SignedBeaconBlock) -> bool:
    proposer = state.validators[signed_block.message.proposer_index]
    signing_root = compute_signing_root(signed_block.message, get_domain(state, DOMAIN_BEACON_PROPOSER))
    return bls.Verify(proposer.pubkey, signing_root, signed_block.signature)

Check that the signature on the block matches the block's contents and the public key of the claimed proposer of the block. This ensures that blocks cannot be forged, or tampered with in transit. All the public keys for validators are stored in the Validators list in state.

Used by state_transition()
Uses compute_signing_root(), get_domain(), bls.Verify()
See also DOMAIN_BEACON_PROPOSER

def process_slots(state: BeaconState, slot: Slot) -> None:
    assert state.slot < slot
    while state.slot < slot:
        process_slot(state)
        # Process epoch on the start slot of the next epoch
        if (state.slot + 1) % SLOTS_PER_EPOCH == 0:
            process_epoch(state)
        state.slot = Slot(state.slot + 1)

Updates the state from its current slot up to the given slot number assuming that all the intermediate slots are empty (that they do not contain blocks). Iteratively calls process_slot() to apply the empty slot state-transition.

This is where epoch processing is triggered when required. Empty slot processing is lightweight, but any epoch transitions that need to be processed require the full rewards and penalties, and justification–finalisation apparatus.

Used by state_transition()
Uses process_slot(), process_epoch()
See also SLOTS_PER_EPOCH

def process_slot(state: BeaconState) -> None:
    # Cache state root
    previous_state_root = hash_tree_root(state)
    state.state_roots[state.slot % SLOTS_PER_HISTORICAL_ROOT] = previous_state_root
    # Cache latest block header state root
    if state.latest_block_header.state_root == Bytes32():
        state.latest_block_header.state_root = previous_state_root
    # Cache block root
    previous_block_root = hash_tree_root(state.latest_block_header)
    state.block_roots[state.slot % SLOTS_PER_HISTORICAL_ROOT] = previous_block_root

Applies a single slot state-transition. (Updating the slot number, and any required epoch processing, is handled by process_slots().) This is done at each slot whether or not there is a block present; if there is no block present then it is the only processing that is done.

Slot processing is almost trivial and consists only of calculating the updated state and block hash tree roots (as necessary), and storing them in the historical lists in the state. In a circular way, the state roots only change over an empty slot state transition due to updating the lists of state and block roots.

SLOTS_PER_HISTORICAL_ROOT is a multiple of SLOTS_PER_EPOCH, so there is no danger of overwriting the circular lists of state_roots and block_roots. These will be dealt with correctly during epoch processing.

The only curiosity here is the lines,

    if state.latest_block_header.state_root == Bytes32():
        state.latest_block_header.state_root = previous_state_root

This logic was introduced to avoid a circular dependency while also keeping the state transition clean. Each block that we receive contains a post-state root, but as part of state processing we store the block in the state (in state.latest_block_header), thus changing the post-state root.

Therefore, to be able to verify the state transition, we use the convention that the state root of the incoming block, and the state root that we calculate after inserting the block into the state, are both based on a temporary block header that has a stubbed state root, namely Bytes32(). This allows the block's claimed post-state root to be validated without the circularity. The next time that process_slots() is called, the block's stubbed state root is updated to the actual post-state root, as above.

Used by process_slots()
Uses hash_tree_root
See also SLOTS_PER_HISTORICAL_ROOT

Execution engine

Ethereum's "Merge" to proof of stake occurred on the 15th of September 2022. As far as the beacon chain was concerned, the most significant change was that an extra block validity condition now applies. Post-Merge Beacon blocks contain a new ExecutionPayload object which is basically an Eth1 block. For the beacon block to be valid, the contents of its execution payload must be valid according to Ethereum's longstanding block and transaction execution rules (minus any proof of work conditions).

The beacon chain does not know how to validate Ethereum transactions. The entire point of the Merge was to enable beacon chain clients to hand off the validation of the execution payload to a locally connected execution client (formerly an Eth1 client). The beacon chain consensus client does this hand-off via the notify_new_payload() function described below.

Architecturally, the notify_new_payload() function is accessed via a new interface called the Engine API which the Bellatrix specification characterises as follows.

The implementation-dependent ExecutionEngine protocol encapsulates the execution sub-system logic via:

  • a state object self.execution_state of type ExecutionState
  • a notification function self.notify_new_payload which may apply changes to the self.execution_state

Note: notify_new_payload is a function accessed through the EXECUTION_ENGINE module which instantiates the ExecutionEngine protocol.

The body of this function is implementation dependent. The Engine API may be used to implement this and similarly defined functions via an external execution engine.

notify_new_payload

def notify_new_payload(self: ExecutionEngine, execution_payload: ExecutionPayload) -> bool:
    """
    Return ``True`` if and only if ``execution_payload`` is valid with respect to ``self.execution_state``.
    """
    ...

This function is called during block processing to verify the validity of a beacon block's execution payload. The contents of the execution payload are largely opaque to the consensus layer (hence the ... in the function definition) and validation of the execution payload relies almost entirely on the execution client. You can think of it as just an external black-box library call if that helps.

Used by process_execution_payload()

Epoch processing

def process_epoch(state: BeaconState) -> None:
    process_justification_and_finalization(state)  # [Modified in Altair]
    process_inactivity_updates(state)  # [New in Altair]
    process_rewards_and_penalties(state)  # [Modified in Altair]
    process_registry_updates(state)
    process_slashings(state)  # [Modified in Altair]
    process_eth1_data_reset(state)
    process_effective_balance_updates(state)
    process_slashings_reset(state)
    process_randao_mixes_reset(state)
    process_historical_summaries_update(state)  # [Modified in Capella]
    process_participation_flag_updates(state)  # [New in Altair]
    process_sync_committee_updates(state)  # [New in Altair]

The long laundry list of things that need to be done at the end of an epoch. You can see from the comments that a bunch of extra work was added in the Altair upgrade.

Used by process_slots()
Uses All the things below

Justification and finalization

def process_justification_and_finalization(state: BeaconState) -> None:
    # Initial FFG checkpoint values have a `0x00` stub for `root`.
    # Skip FFG updates in the first two epochs to avoid corner cases that might result in modifying this stub.
    if get_current_epoch(state) <= GENESIS_EPOCH + 1:
        return
    previous_indices = get_unslashed_participating_indices(state, TIMELY_TARGET_FLAG_INDEX, get_previous_epoch(state))
    current_indices = get_unslashed_participating_indices(state, TIMELY_TARGET_FLAG_INDEX, get_current_epoch(state))
    total_active_balance = get_total_active_balance(state)
    previous_target_balance = get_total_balance(state, previous_indices)
    current_target_balance = get_total_balance(state, current_indices)
    weigh_justification_and_finalization(state, total_active_balance, previous_target_balance, current_target_balance)

I believe the corner cases mentioned in the comments are related to Issue 849. In any case, skipping justification and finalisation calculations during the first two epochs definitely simplifies things.

For the purposes of the Casper FFG finality calculations, we want attestations that have both source and target votes we agree with. If the source vote is incorrect, then the attestation is never processed into the state, so we just need the validators that voted for the correct target, according to their participation flag indices.

Since correct target votes can be included up to 32 slots after they are made, we collect votes from both the previous epoch and the current epoch to ensure that we have them all.

Once we know which validators voted for the correct source and target in the current and previous epochs, we add up their effective balances (not actual balances). total_active_balance is the sum of the effective balances for all validators that ought to have voted during the current epoch. Slashed but not yet exited validators are not included in these calculations.

These aggregate balances are passed to weigh_justification_and_finalization() to do the actual work of updating justification and finalisation.

Used by process_epoch(), compute_pulled_up_tip
Uses get_unslashed_participating_indices(), get_total_active_balance(), get_total_balance(), weigh_justification_and_finalization()
See also participation flag indices

def weigh_justification_and_finalization(state: BeaconState,
                                         total_active_balance: Gwei,
                                         previous_epoch_target_balance: Gwei,
                                         current_epoch_target_balance: Gwei) -> None:
    previous_epoch = get_previous_epoch(state)
    current_epoch = get_current_epoch(state)
    old_previous_justified_checkpoint = state.previous_justified_checkpoint
    old_current_justified_checkpoint = state.current_justified_checkpoint

    # Process justifications
    state.previous_justified_checkpoint = state.current_justified_checkpoint
    state.justification_bits[1:] = state.justification_bits[:JUSTIFICATION_BITS_LENGTH - 1]
    state.justification_bits[0] = 0b0
    if previous_epoch_target_balance * 3 >= total_active_balance * 2:
        state.current_justified_checkpoint = Checkpoint(epoch=previous_epoch,
                                                        root=get_block_root(state, previous_epoch))
        state.justification_bits[1] = 0b1
    if current_epoch_target_balance * 3 >= total_active_balance * 2:
        state.current_justified_checkpoint = Checkpoint(epoch=current_epoch,
                                                        root=get_block_root(state, current_epoch))
        state.justification_bits[0] = 0b1

    # Process finalizations
    bits = state.justification_bits
    # The 2nd/3rd/4th most recent epochs are justified, the 2nd using the 4th as source
    if all(bits[1:4]) and old_previous_justified_checkpoint.epoch + 3 == current_epoch:
        state.finalized_checkpoint = old_previous_justified_checkpoint
    # The 2nd/3rd most recent epochs are justified, the 2nd using the 3rd as source
    if all(bits[1:3]) and old_previous_justified_checkpoint.epoch + 2 == current_epoch:
        state.finalized_checkpoint = old_previous_justified_checkpoint
    # The 1st/2nd/3rd most recent epochs are justified, the 1st using the 3rd as source
    if all(bits[0:3]) and old_current_justified_checkpoint.epoch + 2 == current_epoch:
        state.finalized_checkpoint = old_current_justified_checkpoint
    # The 1st/2nd most recent epochs are justified, the 1st using the 2nd as source
    if all(bits[0:2]) and old_current_justified_checkpoint.epoch + 1 == current_epoch:
        state.finalized_checkpoint = old_current_justified_checkpoint

This routine handles justification first, and then finalisation.

Justification

A supermajority link is a vote with a justified source checkpoint $C_m$ and a target checkpoint $C_n$ that was made by validators controlling more than two-thirds of the stake. If a checkpoint has a supermajority link pointing to it then we consider it justified. So, if more than two-thirds of the validators agree that checkpoint 3 was justified (their source vote) and have checkpoint 4 as their target vote, then we justify checkpoint 4.

We know that all the attestations have source votes that we agree with. The first if statement tries to justify the previous epoch's checkpoint seeing if the (source, target) pair is a supermajority. The second if statement tries to justify the current epoch's checkpoint. Note that the previous epoch's checkpoint might already have been justified; this is not checked but does not affect the logic.

The justification status of the last four epochs is stored in an array of bits in the state. After shifting the bits along by one at the outset of the routine, the justification status of the current epoch is stored in element 0, the previous in element 1, and so on.

Note that the total_active_balance is the current epoch's total balance, so it may not be strictly correct for calculating the supermajority for the previous epoch. However, the rate at which the validator set can change between epochs is tightly constrained, so this is not a significant issue.

Finalisation

The version of Casper FFG described in the Gasper paper uses $k$-finality, which extends the handling of finality in the original Casper FFG paper. See the $k$-finality section in the chapter on Consensus for more on how it interacts with the safety guarantees of Casper FFG.

In $k$-finality, if we have a consecutive set of $k$ justified checkpoints $C_j, \ldots, C_{j+k-1}$, and a supermajority link from $C_j$ to $C_{j+k}$, then $C_j$ is finalised. Also note that this justifies $C_{j+k}$, by the rules above.

The Casper FFG version of this is $1$-finality. So, a supermajority link from a justified checkpoint $C_n$ to the very next checkpoint $C_{n+1}$ both justifies $C_{n+1}$ and finalises $C_n$.

On the beacon chain we are using $2$-finality, since target votes may be included up to an epoch late. In $2$-finality, we keep records of checkpoint justification status for four epochs and have the following conditions for finalisation, where the checkpoint for the current epoch is $C_n$. Note that we have already updated the justification status of $C_n$ and $C_{n-1}$ in this routine, which implies the existence of supermajority links pointing to them if the corresponding bits are set, respectively.

  1. Checkpoints $C_{n-3}$ and $C_{n-2}$ are justified, and there is a supermajority link from $C_{n-3}$ to $C_{n-1}$: finalise $C_{n-3}$.
  2. Checkpoint $C_{n-2}$ is justified, and there is a supermajority link from $C_{n-2}$ to $C_{n-1}$: finalise $C_{n-2}$. This is equivalent to $1$-finality applied to the previous epoch.
  3. Checkpoints $C_{n-2}$ and $C_{n-1}$ are justified, and there is a supermajority link from $C_{n-2}$ to $C_n$: finalise $C_{n-2}$.
  4. Checkpoint $C_{n-1}$ is justified, and there is a supermajority link from $C_{n-1}$ to $C_n$: finalise $C_{n-1}$. This is equivalent to $1$-finality applied to the current epoch.

A diagram of the four 2-finality scenarios.

The four cases of 2-finality. In each case the supermajority link causes the checkpoint at its start (the source) to become finalised and the checkpoint at its end (the target) to become justified. Checkpoint numbers are along the bottom.

Almost always we would expect to see only the $1$-finality cases, in particular, case 4. The $2$-finality cases would occur only in situations where many attestations are delayed, or when we are very close to the 2/3rds participation threshold. Note that these evaluations stack, so it is possible for rule 2 to finalise $C_{n-2}$ and then for rule 4 to immediately finalise $C_{n-1}$, for example.

For the uninitiated, in Python's array slice syntax, bits[1:4] means bits 1, 2, and 3 (but not 4). This always trips me up.
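
As a quick illustration (not spec code), the following shows how the slice picks out bits 1 to 3, and how all() behaves on it:

# Quick illustration (not spec code) of the justification bits slicing.
bits = [1, 1, 1, 0]        # bit 0 = current epoch, bit 1 = previous epoch, ...

print(bits[1:4])           # [1, 1, 0]  -> bits 1, 2 and 3, but not bit 4
print(all(bits[1:3]))      # True       -> bits 1 and 2 are both set
print(all(bits[1:4]))      # False      -> bit 3 is not set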

Used by process_justification_and_finalization()
Uses get_block_root()
See also JUSTIFICATION_BITS_LENGTH, Checkpoint

Inactivity scores

def process_inactivity_updates(state: BeaconState) -> None:
    # Skip the genesis epoch as score updates are based on the previous epoch participation
    if get_current_epoch(state) == GENESIS_EPOCH:
        return

    for index in get_eligible_validator_indices(state):
        # Increase the inactivity score of inactive validators
        if index in get_unslashed_participating_indices(state, TIMELY_TARGET_FLAG_INDEX, get_previous_epoch(state)):
            state.inactivity_scores[index] -= min(1, state.inactivity_scores[index])
        else:
            state.inactivity_scores[index] += INACTIVITY_SCORE_BIAS
        # Decrease the inactivity score of all eligible validators during a leak-free epoch
        if not is_in_inactivity_leak(state):
            state.inactivity_scores[index] -= min(INACTIVITY_SCORE_RECOVERY_RATE, state.inactivity_scores[index])

Since the Altair upgrade, each validator has an individual inactivity score in the beacon state which is updated as follows.

  • At the end of epoch $N$, irrespective of the inactivity leak, a validator's score is decreased by one (floored at zero) if it made a correct and timely target vote in epoch $N-1$, and increased by INACTIVITY_SCORE_BIAS if it did not.
  • When not in an inactivity leak, every eligible validator's score is additionally decreased by INACTIVITY_SCORE_RECOVERY_RATE (floored at zero).

Flowchart showing how inactivity score updates are calculated.

How each validator's inactivity score is updated. The happy flow is right through the middle. "Active", when updating the scores at the end of epoch $N$, means having made a correct and timely target vote in epoch $N-1$.

There is a floor of zero on the score. So, outside a leak, validators' scores will rapidly return to zero and stay there, since INACTIVITY_SCORE_RECOVERY_RATE is greater than INACTIVITY_SCORE_BIAS.
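
As a rough illustration (not spec code), the following sketch simulates one validator's inactivity score through a leak and its recovery afterwards, using the mainnet values INACTIVITY_SCORE_BIAS = 4 and INACTIVITY_SCORE_RECOVERY_RATE = 16.

# Rough sketch (not spec code) of one validator's inactivity score over time.
INACTIVITY_SCORE_BIAS = 4            # mainnet value
INACTIVITY_SCORE_RECOVERY_RATE = 16  # mainnet value

def next_score(score: int, participated: bool, in_leak: bool) -> int:
    # Mirrors the per-validator logic in process_inactivity_updates()
    if participated:
        score -= min(1, score)
    else:
        score += INACTIVITY_SCORE_BIAS
    if not in_leak:
        score -= min(INACTIVITY_SCORE_RECOVERY_RATE, score)
    return score

score = 0
for epoch in range(10):              # offline throughout a 10 epoch leak
    score = next_score(score, participated=False, in_leak=True)
print(score)                         # 40

for epoch in range(3):               # back online, leak over
    score = next_score(score, participated=True, in_leak=False)
    print(score)                     # 23, 6, 0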

Used by process_epoch()
Uses get_eligible_validator_indices(), get_unslashed_participating_indices(), is_in_inactivity_leak()
See also INACTIVITY_SCORE_BIAS, INACTIVITY_SCORE_RECOVERY_RATE

Reward and penalty calculations

Without wanting to go full Yellow Paper on you, I am going to adopt a little notation to help analyse the rewards.

We will define a base reward $B$ that, as we will see, turns out to be the expected long-run average income of an optimally performing validator per epoch (ignoring validator set size changes). The total number of active validators is $N$.

The base reward is calculated from a base reward per increment, $b$. An "increment" is a unit of effective balance in terms of EFFECTIVE_BALANCE_INCREMENT. $B = 32b$ because MAX_EFFECTIVE_BALANCE = 32 * EFFECTIVE_BALANCE_INCREMENT.

Other quantities we will use in rewards calculation are the incentivization weights: $W_s$, $W_t$, $W_h$, and $W_y$ being the weights for correct source, target, head, and sync committee votes respectively; $W_p$ being the proposer weight; and the weight denominator $W_{\Sigma}$, which is the sum of the weights.

Issuance for regular rewards happens in four ways:

  • $I_A$ is the maximum total reward for all validators attesting in an epoch;
  • $I_{A_P}$ is the maximum reward issued to proposers in an epoch for including attestations;
  • $I_S$ is the maximum total reward for all sync committee participants in an epoch; and
  • $I_{S_P}$ is the maximum reward issued to proposers in an epoch for including sync aggregates.

Under get_flag_index_deltas(), process_attestation(), and process_sync_aggregate() we find that these work out as follows in terms of $B$ and $N$:

$$\begin{aligned} &I_A = \frac{W_s + W_t + W_h}{W_{\Sigma}}NB \\ &I_{A_P} = \frac{W_p}{W_{\Sigma} - W_p}I_A \\ &I_S = \frac{W_y}{W_{\Sigma}}NB \\ &I_{S_P} = \frac{W_p}{W_{\Sigma} - W_p}I_S \end{aligned}$$

To find the total optimal issuance per epoch, we can first sum $I_A$ and $I_S$,

$$I_A + I_S = \frac{W_s + W_t + W_h + W_y}{W_{\Sigma}}NB = \frac{W_{\Sigma} - W_p}{W_{\Sigma}}NB$$

Now adding in the proposer rewards,

$$I_A + I_S + I_{A_P} + I_{S_P} = \frac{W_{\Sigma} - W_p}{W_{\Sigma}}\left(1 + \frac{W_p}{W_{\Sigma} - W_p}\right)NB = \left(\frac{W_{\Sigma} - W_p}{W_{\Sigma}} + \frac{W_p}{W_{\Sigma}}\right)NB = NB$$

So, we see that every epoch, $NB$ Gwei is awarded across the $N$ validators. Every validator participates in attesting, and proposer and sync committee duties are assigned uniformly at random, so the long-term expected income per optimally performing validator per epoch is $B$ Gwei.
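
As a numerical sanity check on the algebra (not spec code), we can plug in the Altair weights ($W_s = 14$, $W_t = 26$, $W_h = 14$, $W_y = 2$, $W_p = 8$, $W_{\Sigma} = 64$) and confirm that the four issuance terms sum to exactly $NB$:

# Sanity check (not spec code) that I_A + I_S + I_A_P + I_S_P == N * B.
from fractions import Fraction

W_s, W_t, W_h, W_y, W_p, W_sum = 14, 26, 14, 2, 8, 64    # Altair weights
N, B = 500_000, Fraction(1)                               # arbitrary N; B in units of itself

I_A   = Fraction(W_s + W_t + W_h, W_sum) * N * B
I_A_P = Fraction(W_p, W_sum - W_p) * I_A
I_S   = Fraction(W_y, W_sum) * N * B
I_S_P = Fraction(W_p, W_sum - W_p) * I_S

assert I_A + I_S + I_A_P + I_S_P == N * B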

Helpers

def get_base_reward_per_increment(state: BeaconState) -> Gwei:
    return Gwei(EFFECTIVE_BALANCE_INCREMENT * BASE_REWARD_FACTOR // integer_squareroot(get_total_active_balance(state)))

The base reward per increment is the fundamental unit of reward in terms of which all other regular rewards and penalties are calculated. We will denote the base reward per increment, $b$.

As I noted under BASE_REWARD_FACTOR, this is the big knob to turn if we wish to increase or decrease the total reward for participating in Eth2, otherwise known as the issuance rate of new Ether.

An increment is a single unit of a validator's effective balance, denominated in terms of EFFECTIVE_BALANCE_INCREMENT, which happens to be one Ether. So, an increment is 1 Ether of effective balance, and a maximally effective validator has 32 increments.

The base reward per increment is inversely proportional to the square root of the total balance of all active validators. This means that, as the number $N$ of validators increases, the reward per validator decreases as $\frac{1}{\sqrt{N}}$, and the overall issuance per epoch increases as $\sqrt{N}$.

The decrease in per-validator rewards with increasing $N$ provides a price discovery mechanism: the idea is that an equilibrium will be found where the total number of validators results in a reward similar to returns available elsewhere for similar risk. A different curve could have been chosen for the rewards profile. For example, using the inverse of the total balance rather than its square root would keep total issuance constant. The section on Issuance has a deeper exploration of these topics.
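
To see the scaling concretely, here is a small sketch (not spec code) that evaluates the formula above for a few validator-set sizes, assuming every validator holds the full 32 ETH effective balance and using the mainnet BASE_REWARD_FACTOR of 64:

# Sketch (not spec code) of how per-validator reward and total issuance scale with N.
from math import isqrt

EFFECTIVE_BALANCE_INCREMENT = 10**9          # 1 ETH in Gwei
BASE_REWARD_FACTOR = 64
MAX_EFFECTIVE_BALANCE = 32 * EFFECTIVE_BALANCE_INCREMENT

def base_reward_per_increment(total_active_balance: int) -> int:
    return EFFECTIVE_BALANCE_INCREMENT * BASE_REWARD_FACTOR // isqrt(total_active_balance)

for n_validators in (100_000, 400_000, 1_600_000):
    total_balance = n_validators * MAX_EFFECTIVE_BALANCE
    b = base_reward_per_increment(total_balance)
    B = 32 * b                                # per-validator base reward per epoch, in Gwei
    # Quadrupling N roughly halves B and roughly doubles the total issuance N * B
    print(n_validators, B, n_validators * B)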

Used by get_base_reward(), process_sync_aggregate()
Uses integer_squareroot(), get_total_active_balance()

def get_base_reward(state: BeaconState, index: ValidatorIndex) -> Gwei:
    """
    Return the base reward for the validator defined by ``index`` with respect to the current ``state``.
    """
    increments = state.validators[index].effective_balance // EFFECTIVE_BALANCE_INCREMENT
    return Gwei(increments * get_base_reward_per_increment(state))

The base reward is the reward that an optimally performing validator can expect to earn on average per epoch, over the long term. It is proportional to the validator's effective balance; a validator with MAX_EFFECTIVE_BALANCE can expect to receive the full base reward $B = 32b$ per epoch on a long-term average.

Used by get_flag_index_deltas(), process_attestation()
Uses get_base_reward_per_increment()
See also EFFECTIVE_BALANCE_INCREMENT

def get_finality_delay(state: BeaconState) -> uint64:
    return get_previous_epoch(state) - state.finalized_checkpoint.epoch

Returns the number of epochs since the last finalised checkpoint (minus one). In ideal running this ought to be zero: during epoch processing we aim to have justified the checkpoint in the current epoch and finalised the checkpoint in the previous epoch. A delay in finalisation suggests a chain split or a large fraction of validators going offline.

Used by is_in_inactivity_leak()

def is_in_inactivity_leak(state: BeaconState) -> bool:
    return get_finality_delay(state) > MIN_EPOCHS_TO_INACTIVITY_PENALTY

If the beacon chain has not managed to finalise a checkpoint for MIN_EPOCHS_TO_INACTIVITY_PENALTY epochs (that is, four epochs), then the chain enters the inactivity leak. In this mode, penalties for non-participation are heavily increased, with the goal of reducing the proportion of stake controlled by non-participants, and eventually regaining finality.

Used by get_flag_index_deltas(), process_inactivity_updates()
Uses get_finality_delay()
See also inactivity leak, MIN_EPOCHS_TO_INACTIVITY_PENALTY

def get_eligible_validator_indices(state: BeaconState) -> Sequence[ValidatorIndex]:
    previous_epoch = get_previous_epoch(state)
    return [
        ValidatorIndex(index) for index, v in enumerate(state.validators)
        if is_active_validator(v, previous_epoch) or (v.slashed and previous_epoch + 1 < v.withdrawable_epoch)
    ]

These are the validators that were subject to rewards and penalties in the previous epoch.

The list differs from the active validator set returned by get_active_validator_indices() by including slashed but not fully exited validators in addition to the ones marked active. Slashed validators are subject to penalties right up to when they become withdrawable and are thus fully exited.

Used by get_flag_index_deltas(), process_inactivity_updates(), get_inactivity_penalty_deltas()
Uses is_active_validator()

Inactivity penalty deltas

def get_inactivity_penalty_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return the inactivity penalty deltas by considering timely target participation flags and inactivity scores.
    """
    rewards = [Gwei(0) for _ in range(len(state.validators))]
    penalties = [Gwei(0) for _ in range(len(state.validators))]
    previous_epoch = get_previous_epoch(state)
    matching_target_indices = get_unslashed_participating_indices(state, TIMELY_TARGET_FLAG_INDEX, previous_epoch)
    for index in get_eligible_validator_indices(state):
        if index not in matching_target_indices:
            penalty_numerator = state.validators[index].effective_balance * state.inactivity_scores[index]
            penalty_denominator = INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT_BELLATRIX
            penalties[index] += Gwei(penalty_numerator // penalty_denominator)
    return rewards, penalties

Validators receive penalties proportional to their individual inactivity scores, even when the beacon chain is not in an inactivity leak. However, these scores reduce to zero fairly rapidly outside a leak. This is a change from Phase 0 in which inactivity penalties were applied only during leaks.

All unslashed validators that made a correct and timely target vote in the previous epoch are identified by get_unslashed_participating_indices(), and all other active validators receive a penalty, including slashed validators.

The penalty is proportional to the validator's effective balance and its inactivity score. See INACTIVITY_PENALTY_QUOTIENT_BELLATRIX for more details of the calculation, and INACTIVITY_SCORE_RECOVERY_RATE for some charts of how the penalties accrue.

The returned rewards array always contains only zeros. It's here just to make the Python syntax simpler in the calling routine.

Used by process_rewards_and_penalties()
Uses get_unslashed_participating_indices(), get_eligible_validator_indices()
See also Inactivity Scores, INACTIVITY_PENALTY_QUOTIENT_BELLATRIX, INACTIVITY_SCORE_RECOVERY_RATE

Process rewards and penalties

def process_rewards_and_penalties(state: BeaconState) -> None:
    # No rewards are applied at the end of `GENESIS_EPOCH` because rewards are for work done in the previous epoch
    if get_current_epoch(state) == GENESIS_EPOCH:
        return

    flag_deltas = [get_flag_index_deltas(state, flag_index) for flag_index in range(len(PARTICIPATION_FLAG_WEIGHTS))]
    deltas = flag_deltas + [get_inactivity_penalty_deltas(state)]
    for (rewards, penalties) in deltas:
        for index in range(len(state.validators)):
            increase_balance(state, ValidatorIndex(index), rewards[index])
            decrease_balance(state, ValidatorIndex(index), penalties[index])

This is where validators are rewarded and penalised according to their attestation records.

Attestations included in beacon blocks were processed by process_attestation as blocks were received, and flags were set in the beacon state according to their timeliness and correctness. These flags are now processed into rewards and penalties for each validator by calling get_flag_index_deltas() for each of the flag types.

Once the normal attestation rewards and penalties have been calculated, additional penalties based on validators' inactivity scores are accumulated.

As noted elsewhere, rewards and penalties are handled separately from each other since we don't do negative numbers.

For reference, the only other places where rewards and penalties are applied are as follows:

  • proposer rewards for including attestations, applied during block processing in process_attestation();
  • sync committee rewards and penalties, and the associated proposer rewards, applied in process_sync_aggregate();
  • the initial slashing penalty and the proposer and whistleblower rewards, applied in slash_validator(); and
  • the correlated slashing penalty, applied in process_slashings().

Used by process_epoch()
Uses get_flag_index_deltas(), get_inactivity_penalty_deltas(), increase_balance(), decrease_balance()
See also ParticipationFlags, PARTICIPATION_FLAG_WEIGHTS

Registry updates

def process_registry_updates(state: BeaconState) -> None:
    # Process activation eligibility and ejections
    for index, validator in enumerate(state.validators):
        if is_eligible_for_activation_queue(validator):
            validator.activation_eligibility_epoch = get_current_epoch(state) + 1

        if (
            is_active_validator(validator, get_current_epoch(state))
            and validator.effective_balance <= EJECTION_BALANCE
        ):
            initiate_validator_exit(state, ValidatorIndex(index))

    # Queue validators eligible for activation and not yet dequeued for activation
    activation_queue = sorted([
        index for index, validator in enumerate(state.validators)
        if is_eligible_for_activation(state, validator)
        # Order by the sequence of activation_eligibility_epoch setting and then index
    ], key=lambda index: (state.validators[index].activation_eligibility_epoch, index))
    # Dequeued validators for activation up to churn limit
    for index in activation_queue[:get_validator_churn_limit(state)]:
        validator = state.validators[index]
        validator.activation_epoch = compute_activation_exit_epoch(get_current_epoch(state))

The Registry is the part of the beacon state that stores Validator records. These particular updates are, for the most part, concerned with moving validators through the activation queue.

is_eligible_for_activation_queue() finds validators that have a sufficient deposit amount yet their activation_eligibility_epoch is still set to FAR_FUTURE_EPOCH. These will be at most the validators for which deposits were processed during the last epoch, potentially up to MAX_DEPOSITS * SLOTS_PER_EPOCH, which is 512 (minus any partial deposits that don't yet add up to a whole deposit). These have their activation_eligibility_epoch set to the next epoch. They will become eligible for activation once that epoch is finalised – "eligible for activation" means only that they can be added to the activation queue; they will not become active until they reach the end of the queue.

Next, any validators whose effective balance has fallen to EJECTION_BALANCE have their exit initiated.

is_eligible_for_activation() selects validators whose activation_eligibility_epoch has just been finalised. The list of these is ordered by eligibility epoch, and then by index. There might be multiple eligibility epochs in the list if finalisation got delayed for some reason.

Finally, the first get_validator_churn_limit() validators in the list get their activation epochs set to compute_activation_exit_epoch().

On first sight, you'd think that the activation epochs of the whole queue could be set here, rather than just a single epoch's worth. But at some point, get_validator_churn_limit() will change unpredictably (we don't know when validators will exit), which makes that infeasible. Though, curiously, that is exactly what initiate_validator_exit() does. Anyway, clients could optimise this by persisting the sorted activation queue rather than recalculating it.
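
For a sense of scale, here is a back-of-the-envelope sketch (not spec code) of how quickly the activation queue drains, using the mainnet values MIN_PER_EPOCH_CHURN_LIMIT = 4 and CHURN_LIMIT_QUOTIENT = 65536, and treating the churn limit as constant over the period.

# Back-of-the-envelope sketch (not spec code) of activation queue throughput.
MIN_PER_EPOCH_CHURN_LIMIT = 4
CHURN_LIMIT_QUOTIENT = 2**16
EPOCHS_PER_DAY = 225                       # 24 * 60 * 60 / (32 * 12)

def churn_limit(active_validators: int) -> int:
    # Mirrors the calculation in get_validator_churn_limit()
    return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validators // CHURN_LIMIT_QUOTIENT)

active, queued = 500_000, 20_000
per_epoch = churn_limit(active)            # 7 activations per epoch at 500k validators
days_to_clear = queued / (per_epoch * EPOCHS_PER_DAY)
print(per_epoch, round(days_to_clear, 1))  # roughly 12.7 days to clear a 20k queue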

Used by process_epoch()
Uses is_eligible_for_activation_queue(), is_active_validator(), initiate_validator_exit(), is_eligible_for_activation(), get_validator_churn_limit(), compute_activation_exit_epoch()
See also Validator, EJECTION_BALANCE

Slashings

def process_slashings(state: BeaconState) -> None:
    epoch = get_current_epoch(state)
    total_balance = get_total_active_balance(state)
    adjusted_total_slashing_balance = min(
        sum(state.slashings) * PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX,
        total_balance
    )
    for index, validator in enumerate(state.validators):
        if validator.slashed and epoch + EPOCHS_PER_SLASHINGS_VECTOR // 2 == validator.withdrawable_epoch:
            increment = EFFECTIVE_BALANCE_INCREMENT  # Factored out from penalty numerator to avoid uint64 overflow
            penalty_numerator = validator.effective_balance // increment * adjusted_total_slashing_balance
            penalty = penalty_numerator // total_balance * increment
            decrease_balance(state, ValidatorIndex(index), penalty)

Slashing penalties are applied in two stages: the first stage is in slash_validator(), immediately on detection; the second stage is here.

In slash_validator() the withdrawable epoch is set EPOCHS_PER_SLASHINGS_VECTOR in the future, so in this function we are considering all slashed validators that are halfway to being withdrawable, that is, completely exited from the protocol. Equivalently, they were slashed EPOCHS_PER_SLASHINGS_VECTOR // 2 epochs ago (about 18 days).

To calculate the additional slashing penalty, we do the following:

  1. Find the sum of the effective balances (at the time of the slashing) of all validators that were slashed in the previous EPOCHS_PER_SLASHINGS_VECTOR epochs (36 days). These are stored as a vector in the state.
  2. Multiply this sum by PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX, but cap the result at total_balance, the total active balance of all validators.
  3. For each slashed validator being considered, multiply its effective balance by the result of #2 and then divide by the total_balance. This results in an amount between zero and the full effective balance of the validator. That amount is subtracted from its actual balance as the penalty. Note that the effective balance could exceed the actual balance in odd corner cases, but decrease_balance() ensures the balance does not go negative.

If only a single validator were slashed within the 36 days, then this secondary penalty is tiny (actually zero, see below). If one-third of validators are slashed (the minimum required to finalise conflicting blocks), then, with PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX set to three, a successful chain attack will result in the attackers losing their entire effective balances.

Interestingly, due to the way the integer arithmetic is constructed in this routine, in particular the factoring out of increment, the result of this calculation will be zero if validator.effective_balance * adjusted_total_slashing_balance is less than total_balance * EFFECTIVE_BALANCE_INCREMENT. Effectively, the penalty is rounded down to the nearest whole amount of Ether. Issues 1322 and 2161 discuss this. In the end, the consequence is that when there are few slashings there is no extra correlated slashing penalty at all, which is probably a good thing.
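
Here's a small worked example (not spec code) of the penalty arithmetic, showing both the zero-rounding effect for an isolated slashing and the full penalty when a third of the stake is slashed. It assumes every validator has a 32 ETH effective balance.

# Worked example (not spec code) of the correlated slashing penalty arithmetic.
EFFECTIVE_BALANCE_INCREMENT = 10**9                 # 1 ETH in Gwei
MAX_EFFECTIVE_BALANCE = 32 * EFFECTIVE_BALANCE_INCREMENT
PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX = 3

def correlated_penalty(effective_balance: int, slashed_sum: int, total_balance: int) -> int:
    # Mirrors the integer arithmetic in process_slashings()
    adjusted = min(slashed_sum * PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX, total_balance)
    increment = EFFECTIVE_BALANCE_INCREMENT
    penalty_numerator = effective_balance // increment * adjusted
    return penalty_numerator // total_balance * increment

N = 600_000
total = N * MAX_EFFECTIVE_BALANCE

# One isolated slashing: the penalty rounds down to zero whole Ether.
print(correlated_penalty(MAX_EFFECTIVE_BALANCE, MAX_EFFECTIVE_BALANCE, total))  # 0

# One third of the stake slashed: each slashed validator loses its whole 32 ETH.
print(correlated_penalty(MAX_EFFECTIVE_BALANCE, total // 3, total))             # 32000000000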

Used by process_epoch()
Uses get_total_active_balance(), decrease_balance()
See also slash_validator(), EPOCHS_PER_SLASHINGS_VECTOR, PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX

Eth1 data votes updates

def process_eth1_data_reset(state: BeaconState) -> None:
    next_epoch = Epoch(get_current_epoch(state) + 1)
    # Reset eth1 data votes
    if next_epoch % EPOCHS_PER_ETH1_VOTING_PERIOD == 0:
        state.eth1_data_votes = []

There is a fixed period during which beacon block proposers vote on their view of the Eth1 deposit contract and try to come to a simple majority agreement. At the end of the period, the record of votes is cleared and voting begins again, whether or not agreement was reached during the period.

Used by process_epoch()
See also EPOCHS_PER_ETH1_VOTING_PERIOD, Eth1Data

Effective balances updates

def process_effective_balance_updates(state: BeaconState) -> None:
    # Update effective balances with hysteresis
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        HYSTERESIS_INCREMENT = uint64(EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT)
        DOWNWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_DOWNWARD_MULTIPLIER
        UPWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_UPWARD_MULTIPLIER
        if (
            balance + DOWNWARD_THRESHOLD < validator.effective_balance
            or validator.effective_balance + UPWARD_THRESHOLD < balance
        ):
            validator.effective_balance = min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)

Each validator's balance is represented twice in the state: once accurately in a list separate from validator records, and once in a coarse-grained format within the validator's record. Only effective balances are used in calculations within the spec, but rewards and penalties are applied to actual balances. This routine is where effective balances are updated once per epoch to follow the actual balances.

A hysteresis mechanism is used when calculating the effective balance of a validator when its actual balance changes. See Hysteresis Parameters for more discussion of this, and the values of the related constants. With the current values, a validator's effective balance drops to X ETH when its actual balance drops below X.75 ETH, and increases to Y ETH when its actual balance rises above Y.25 ETH. The hysteresis mechanism ensures that effective balances change infrequently, which means that the list of validator records needs to be re-hashed only infrequently when calculating the state root, saving considerably on work.
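
The following sketch (not spec code) reproduces the hysteresis calculation for a single validator and shows the 0.25 ETH downward and 1.25 ETH upward thresholds in action.

# Sketch (not spec code) of the effective balance hysteresis for one validator.
EFFECTIVE_BALANCE_INCREMENT = 10**9                  # 1 ETH in Gwei
MAX_EFFECTIVE_BALANCE = 32 * EFFECTIVE_BALANCE_INCREMENT
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1
HYSTERESIS_UPWARD_MULTIPLIER = 5

def updated_effective_balance(balance: int, effective_balance: int) -> int:
    # Mirrors the per-validator logic in process_effective_balance_updates()
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
    downward = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER   # 0.25 ETH
    upward = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER       # 1.25 ETH
    if balance + downward < effective_balance or effective_balance + upward < balance:
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance

eth = EFFECTIVE_BALANCE_INCREMENT
print(updated_effective_balance(31_800_000_000, 32 * eth) // eth)  # 32: 31.8 is still above 31.75
print(updated_effective_balance(31_700_000_000, 32 * eth) // eth)  # 31: dropped below 31.75
print(updated_effective_balance(32_200_000_000, 31 * eth) // eth)  # 31: 32.2 is not above 32.25
print(updated_effective_balance(32_300_000_000, 31 * eth) // eth)  # 32: rose above 32.25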

Used by process_epoch()
See also Hysteresis Parameters

Slashings balances updates

def process_slashings_reset(state: BeaconState) -> None:
    next_epoch = Epoch(get_current_epoch(state) + 1)
    # Reset slashings
    state.slashings[next_epoch % EPOCHS_PER_SLASHINGS_VECTOR] = Gwei(0)

state.slashings is a circular list of length EPOCHS_PER_SLASHINGS_VECTOR that contains the total of the effective balances of all validators that have been slashed at each epoch. These are used to apply a correlated slashing penalty to slashed validators before they are exited. Each epoch we overwrite the oldest entry with zero, and it becomes the current entry.

Used by process_epoch()
See also process_slashings(), EPOCHS_PER_SLASHINGS_VECTOR

Randao mixes updates

def process_randao_mixes_reset(state: BeaconState) -> None:
    current_epoch = get_current_epoch(state)
    next_epoch = Epoch(current_epoch + 1)
    # Set randao mix
    state.randao_mixes[next_epoch % EPOCHS_PER_HISTORICAL_VECTOR] = get_randao_mix(state, current_epoch)

state.randao_mixes is a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. The current value of the RANDAO, which is updated with every block that arrives, is stored at position state.randao_mixes[current_epoch % EPOCHS_PER_HISTORICAL_VECTOR], as per get_randao_mix().

At the end of every epoch, the final value of the RANDAO for the epoch is copied over to become the starting value of the RANDAO for the next epoch, preserving the remaining entries as historical values.

Used by process_epoch()
Uses get_randao_mix()
See also process_randao(), EPOCHS_PER_HISTORICAL_VECTOR

Historical summaries updates

def process_historical_summaries_update(state: BeaconState) -> None:
    # Set historical block root accumulator.
    next_epoch = Epoch(get_current_epoch(state) + 1)
    if next_epoch % (SLOTS_PER_HISTORICAL_ROOT // SLOTS_PER_EPOCH) == 0:
        historical_summary = HistoricalSummary(
            block_summary_root=hash_tree_root(state.block_roots),
            state_summary_root=hash_tree_root(state.state_roots),
        )
        state.historical_summaries.append(historical_summary)

This routine replaced process_historical_roots_update() at the Capella upgrade.

Previously, both the state.block_roots and state.state_roots lists were Merkleized together into a single root before being added to the state.historical_roots double batched accumulator. Now they are separately Merkleized and appended to state.historical_summaries via the HistoricalSummary container. The Capella upgrade changed this to make it possible to validate past block history without having to know the state history.

The summary is appended to the list every SLOTS_PER_HISTORICAL_ROOT slots. At 64 bytes per summary, the list will grow at the rate of 20 KB per year. The corresponding block and state root lists in the beacon state are circular and just get overwritten in the next period.
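
The 20 KB per year figure comes out of a quick calculation (not spec code), assuming 12-second slots:

# Quick check (not spec code) of the historical summaries growth rate.
SLOTS_PER_HISTORICAL_ROOT = 8192
SECONDS_PER_SLOT = 12
SUMMARY_SIZE_BYTES = 64                      # two 32-byte roots per HistoricalSummary

seconds_per_period = SLOTS_PER_HISTORICAL_ROOT * SECONDS_PER_SLOT   # ~27.3 hours
periods_per_year = 365.25 * 24 * 3600 / seconds_per_period          # ~321
print(round(periods_per_year * SUMMARY_SIZE_BYTES / 1024, 1))       # ~20.1 KiB per year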

The process_historical_roots_update() function that this replaces remains documented in the Bellatrix edition.

Used by process_epoch()
See also HistoricalSummary, SLOTS_PER_HISTORICAL_ROOT

Participation flags updates

def process_participation_flag_updates(state: BeaconState) -> None:
    state.previous_epoch_participation = state.current_epoch_participation
    state.current_epoch_participation = [ParticipationFlags(0b0000_0000) for _ in range(len(state.validators))]

Two epochs' worth of validator participation flags (that record validators' attestation activity) are stored. At the end of every epoch the current becomes the previous, and a new empty list becomes current.

Used by process_epoch()
See also ParticipationFlags

Sync committee updates

def process_sync_committee_updates(state: BeaconState) -> None:
    next_epoch = get_current_epoch(state) + Epoch(1)
    if next_epoch % EPOCHS_PER_SYNC_COMMITTEE_PERIOD == 0:
        state.current_sync_committee = state.next_sync_committee
        state.next_sync_committee = get_next_sync_committee(state)

Sync committees are rotated every EPOCHS_PER_SYNC_COMMITTEE_PERIOD. The next sync committee is ready and waiting so that validators can prepare in advance by subscribing to the necessary subnets. That becomes the current sync committee, and the next is calculated.

Used by process_epoch()
Uses get_next_sync_committee()
See also EPOCHS_PER_SYNC_COMMITTEE_PERIOD

Block processing

def process_block(state: BeaconState, block: BeaconBlock) -> None:
    process_block_header(state, block)
    if is_execution_enabled(state, block.body):
        process_withdrawals(state, block.body.execution_payload)  # [New in Capella]
        process_execution_payload(state, block.body.execution_payload, EXECUTION_ENGINE)  # [Modified in Capella]
    process_randao(state, block.body)
    process_eth1_data(state, block.body)
    process_operations(state, block.body)  # [Modified in Capella]
    process_sync_aggregate(state, block.body.sync_aggregate)

These are the tasks that the beacon node performs in order to process a block and update the state. If any of the called functions triggers the failure of an assert statement, or any other kind of exception, then the entire block is invalid, and any state changes must be rolled back.

Note: The call to the process_execution_payload must happen before the call to the process_randao as the former depends on the randao_mix computed with the reveal of the previous block.

The call to process_execution_payload() was added in the Bellatrix pre-Merge upgrade. The EXECUTION_ENGINE object is not really defined in the beacon chain spec, but corresponds to an API that calls out to an attached execution client (formerly Eth1 client) that will do most of the payload validation.

process_operations() covers the processing of any slashing reports (proposer and attester) in the block, any attestations, any deposits, and any voluntary exits.

Used by state_transition()
Uses process_block_header(), is_execution_enabled(), process_execution_payload(), process_randao(), process_eth1_data(), process_operations(), process_sync_aggregate()

Block header

def process_block_header(state: BeaconState, block: BeaconBlock) -> None:
    # Verify that the slots match
    assert block.slot == state.slot
    # Verify that the block is newer than latest block header
    assert block.slot > state.latest_block_header.slot
    # Verify that proposer index is the correct index
    assert block.proposer_index == get_beacon_proposer_index(state)
    # Verify that the parent matches
    assert block.parent_root == hash_tree_root(state.latest_block_header)
    # Cache current block as the new latest block
    state.latest_block_header = BeaconBlockHeader(
        slot=block.slot,
        proposer_index=block.proposer_index,
        parent_root=block.parent_root,
        state_root=Bytes32(),  # Overwritten in the next process_slot call
        body_root=hash_tree_root(block.body),
    )

    # Verify proposer is not slashed
    proposer = state.validators[block.proposer_index]
    assert not proposer.slashed

A straightforward set of validity conditions for the block header data.

The version of the block header object that this routine stores in the state is a duplicate of the incoming block's header, but with its state_root set to its default empty Bytes32() value. See process_slot() for the explanation of this.

Used by process_block()
Uses get_beacon_proposer_index(), hash_tree_root()
See also BeaconBlockHeader, process_slot()

Withdrawals

get_expected_withdrawals

def get_expected_withdrawals(state: BeaconState) -> Sequence[Withdrawal]:
    epoch = get_current_epoch(state)
    withdrawal_index = state.next_withdrawal_index
    validator_index = state.next_withdrawal_validator_index
    withdrawals: List[Withdrawal] = []
    bound = min(len(state.validators), MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP)
    for _ in range(bound):
        validator = state.validators[validator_index]
        balance = state.balances[validator_index]
        if is_fully_withdrawable_validator(validator, balance, epoch):
            withdrawals.append(Withdrawal(
                index=withdrawal_index,
                validator_index=validator_index,
                address=ExecutionAddress(validator.withdrawal_credentials[12:]),
                amount=balance,
            ))
            withdrawal_index += WithdrawalIndex(1)
        elif is_partially_withdrawable_validator(validator, balance):
            withdrawals.append(Withdrawal(
                index=withdrawal_index,
                validator_index=validator_index,
                address=ExecutionAddress(validator.withdrawal_credentials[12:]),
                amount=balance - MAX_EFFECTIVE_BALANCE,
            ))
            withdrawal_index += WithdrawalIndex(1)
        if len(withdrawals) == MAX_WITHDRAWALS_PER_PAYLOAD:
            break
        validator_index = ValidatorIndex((validator_index + 1) % len(state.validators))
    return withdrawals

This is used in both block processing and block building to construct the list of automatic validator withdrawals that we expect to see in the block.

At most MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP validators will be considered for a withdrawal. As described under that heading, this serves to bound the load on nodes when eligible validators are few and far between.

Picking up where the previous sweep left off (state.next_withdrawal_validator_index), we consider validators in turn, in increasing order of their validator indices. If a validator is eligible for a full withdrawal then a withdrawal transaction for its entire balance is added to the list. If a validator is eligible for a partial withdrawal then a withdrawal transaction for its excess balance above MAX_EFFECTIVE_BALANCE is added to the list.

Each withdrawal transaction is associated with a unique, consecutive withdrawal index, which is the total number of previous withdrawals.

Once either MAX_WITHDRAWALS_PER_PAYLOAD withdrawal transactions have been assembled, or MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP validators have been considered, the sweep terminates and returns the list of transactions.

The next_withdrawal_index and next_withdrawal_validator_index counters in the beacon state are not updated here, but in the calling function.

Used by process_withdrawals()
Uses is_fully_withdrawable_validator(), is_partially_withdrawable_validator()
See also MAX_WITHDRAWALS_PER_PAYLOAD, MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP, Withdrawal

process_withdrawals

def process_withdrawals(state: BeaconState, payload: ExecutionPayload) -> None:
    expected_withdrawals = get_expected_withdrawals(state)
    assert len(payload.withdrawals) == len(expected_withdrawals)

    for expected_withdrawal, withdrawal in zip(expected_withdrawals, payload.withdrawals):
        assert withdrawal == expected_withdrawal
        decrease_balance(state, withdrawal.validator_index, withdrawal.amount)

    # Update the next withdrawal index if this block contained withdrawals
    if len(expected_withdrawals) != 0:
        latest_withdrawal = expected_withdrawals[-1]
        state.next_withdrawal_index = WithdrawalIndex(latest_withdrawal.index + 1)

    # Update the next validator index to start the next withdrawal sweep
    if len(expected_withdrawals) == MAX_WITHDRAWALS_PER_PAYLOAD:
        # Next sweep starts after the latest withdrawal's validator index
        next_validator_index = ValidatorIndex((expected_withdrawals[-1].validator_index + 1) % len(state.validators))
        state.next_withdrawal_validator_index = next_validator_index
    else:
        # Advance sweep by the max length of the sweep if there was not a full set of withdrawals
        next_index = state.next_withdrawal_validator_index + MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP
        next_validator_index = ValidatorIndex(next_index % len(state.validators))
        state.next_withdrawal_validator_index = next_validator_index

The withdrawal transactions in a block appear in its ExecutionPayload since they span both the consensus and execution layers. When processing the withdrawals, we first check that they match what we expect to see. This is taken care of by the call to get_expected_withdrawals() and the pairwise comparison within the for loop. If any of the assert tests fails then the entire block is invalid and all changes, including balance updates already made, must be rolled back. For each withdrawal, the corresponding validator's balance is decreased; the execution client will add the same amount to the validator's Eth1 withdrawal address on the execution layer.

After that we have some trickery for updating the values of next_withdrawal_index and next_withdrawal_validator_index in the beacon state.

For next_withdrawal_index, which just counts the number of withdrawals ever made, we take the index of the last withdrawal in our list and add one. Adding the length of the list to our current value would be equivalent.

For next_withdrawal_validator_index, we have two cases. If we have a full list of MAX_WITHDRAWALS_PER_PAYLOAD withdrawal transactions then we know that this is the condition that terminated the sweep. Therefore the first validator we need to consider next time is the one after the validator in the last withdrawal transaction. Otherwise, the sweep was terminated by reaching MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP, and the first validator we need to consider next time is the one after that.

I can't help thinking that it would have been easier to return these both from get_expected_withdrawals(), where they have just been calculated independently.

Used by process_block()
Uses get_expected_withdrawals(), decrease_balance()
See also WithdrawalIndex, ValidatorIndex, MAX_WITHDRAWALS_PER_PAYLOAD, MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP

Execution payload

process_execution_payload

def process_execution_payload(state: BeaconState, payload: ExecutionPayload, execution_engine: ExecutionEngine) -> None:
    # Verify consistency of the parent hash with respect to the previous execution payload header
    if is_merge_transition_complete(state):
        assert payload.parent_hash == state.latest_execution_payload_header.block_hash
    # Verify prev_randao
    assert payload.prev_randao == get_randao_mix(state, get_current_epoch(state))
    # Verify timestamp
    assert payload.timestamp == compute_timestamp_at_slot(state, state.slot)
    # Verify the execution payload is valid
    assert execution_engine.notify_new_payload(payload)
    # Cache execution payload header
    state.latest_execution_payload_header = ExecutionPayloadHeader(
        parent_hash=payload.parent_hash,
        fee_recipient=payload.fee_recipient,
        state_root=payload.state_root,
        receipts_root=payload.receipts_root,
        logs_bloom=payload.logs_bloom,
        prev_randao=payload.prev_randao,
        block_number=payload.block_number,
        gas_limit=payload.gas_limit,
        gas_used=payload.gas_used,
        timestamp=payload.timestamp,
        extra_data=payload.extra_data,
        base_fee_per_gas=payload.base_fee_per_gas,
        block_hash=payload.block_hash,
        transactions_root=hash_tree_root(payload.transactions),
        withdrawals_root=hash_tree_root(payload.withdrawals),  # [New in Capella]
    )

Since the Merge, the execution payload (formerly an Eth1 block) now forms part of the beacon block.

There isn't much beacon chain processing to be done for execution payloads as they are for the most part opaque blobs of data that are meaningful only to the execution client. However, the beacon chain does need to know whether the execution payload is valid in the view of the execution client. An execution payload that is invalid by the rules of the execution (Eth1) chain makes the beacon block containing it invalid.

Some initial sanity checks are performed:

  • Unless this is the very first execution payload that we have seen then its parent_hash must match the block_hash that we have in the beacon state, that of the last execution payload we processed. This ensures that the chain of execution payloads is continuous, since it is essentially a blockchain within a blockchain.
  • We check that the prev_randao value is correctly set, otherwise a block proposer could trivially control the randomness on the execution layer.
  • The timestamp on the execution payload must match the slot timestamp. Again, this prevents proposers manipulating the execution layer time for any smart contracts that depend on it.

Next we send the payload over to the execution engine via the Engine API, using the notify_new_payload() function it provides. This serves two purposes: first it requests that the execution client check the validity of the payload, and second, if the payload is valid, it allows the execution layer to update its own state by running the transactions contained in the payload.

Finally, the header of the execution payload is stored in the beacon state, primarily so that the parent_hash check against the previous payload's block_hash can be made the next time this function is called. The remainder of the execution header data is not currently used in the beacon chain specification, despite being stored.

This function was added in the Bellatrix pre-Merge upgrade.

Used by process_block()
Uses is_merge_transition_complete(), get_randao_mix(), compute_timestamp_at_slot(), notify_new_payload(), hash_tree_root()
See also ExecutionPayloadHeader

RANDAO

def process_randao(state: BeaconState, body: BeaconBlockBody) -> None:
    epoch = get_current_epoch(state)
    # Verify RANDAO reveal
    proposer = state.validators[get_beacon_proposer_index(state)]
    signing_root = compute_signing_root(epoch, get_domain(state, DOMAIN_RANDAO))
    assert bls.Verify(proposer.pubkey, signing_root, body.randao_reveal)
    # Mix in RANDAO reveal
    mix = xor(get_randao_mix(state, epoch), hash(body.randao_reveal))
    state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR] = mix

A good source of randomness is foundational to the operation of the beacon chain. Security of the protocol depends significantly on being able to unpredictably and uniformly select block proposers and committee members. In fact, the very name "beacon chain" was inspired by Dfinity's concept of a randomness beacon.

The current mechanism for providing randomness is a RANDAO, in which each block proposer provides some randomness and all the contributions are mixed together over the course of an epoch. This is not unbiasable (a malicious proposer may choose to skip a block if it is to its advantage to do so), but is good enough. In future, Ethereum might use a verifiable delay function (VDF) to provide unbiasable randomness.

Early designs had the validators pre-committing to "hash onions", peeling off one layer of hashing at each block proposal. This was changed to using a BLS signature over the epoch number as the entropy source. Using signatures is both a simplification, and an enabler for multi-party (distributed) validators. The (reasonable) assumption is that sufficient numbers of validators generated their secret keys with good entropy to ensure that the RANDAO's entropy is adequate.

The process_randao() function simply uses the proposer's public key to verify that the RANDAO reveal in the block is indeed the epoch number signed with the proposer's private key. It then mixes the hash of the reveal into the current epoch's RANDAO accumulator. The hash is used in order to reduce the signature down from 96 to 32 bytes, and to make it uniform. EPOCHS_PER_HISTORICAL_VECTOR past values of the RANDAO accumulator at the ends of epochs are stored in the state.

From Justin Drake's notes:

Using xor in process_randao is (slightly) more secure than using hash. To illustrate why, imagine an attacker can grind randomness in the current epoch such that two of his validators are the last proposers, in a different order, in two resulting samplings of the next epochs. The commutativity of xor makes those two samplings equivalent, hence reducing the attacker's grinding opportunity for the next epoch versus hash (which is not commutative). The strict security improvement may simplify the derivation of RANDAO security formal lower bounds.
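
A toy illustration (not spec code) of the point about commutativity: mixing two reveals with xor gives the same result in either order, whereas chaining them through a hash does not.

# Toy illustration (not spec code): xor mixing is order-independent, hash chaining is not.
from hashlib import sha256

def h(data: bytes) -> bytes:
    return sha256(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

mix = h(b"initial mix")
reveal_a, reveal_b = h(b"proposer A's reveal"), h(b"proposer B's reveal")

# Mixing in the two reveals with xor: order does not matter.
assert xor(xor(mix, h(reveal_a)), h(reveal_b)) == xor(xor(mix, h(reveal_b)), h(reveal_a))

# Chaining them through a hash instead: order matters.
assert h(h(mix + reveal_a) + reveal_b) != h(h(mix + reveal_b) + reveal_a)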

Note that the assert statement means that the whole block is invalid if the RANDAO reveal is incorrectly formed.

Used by process_block()
Uses get_beacon_proposer_index(), compute_signing_root(), get_domain(), bls.Verify(), hash(), xor(), get_randao_mix()
See also EPOCHS_PER_HISTORICAL_VECTOR

Eth1 data

def process_eth1_data(state: BeaconState, body: BeaconBlockBody) -> None:
    state.eth1_data_votes.append(body.eth1_data)
    if state.eth1_data_votes.count(body.eth1_data) * 2 > EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH:
        state.eth1_data = body.eth1_data

Blocks may contain Eth1Data which is supposed to be the proposer's best view of the Eth1 chain and the deposit contract at the time. There is no incentive to get this data correct, or penalty for it being incorrect.

If there is a simple majority of the same vote being cast by proposers during each voting period of EPOCHS_PER_ETH1_VOTING_PERIOD epochs (6.8 hours) then the Eth1 data is committed to the beacon state. This updates the chain's view of the deposit contract, and new deposits since the last update will start being processed.
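
To put a number on the threshold, here is a quick sketch using the mainnet preset values (an assumption here, not something defined in the function above). One Eth1 data vote can be cast per slot, so a simple majority means more than half the slots in the voting period.

EPOCHS_PER_ETH1_VOTING_PERIOD = 64
SLOTS_PER_EPOCH = 32

# Maximum number of votes in a voting period: one per slot.
slots_per_period = EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH  # 2048

# process_eth1_data() requires count * 2 > 2048, that is,
# at least 1025 identical votes before state.eth1_data is updated.
min_votes_to_update = slots_per_period // 2 + 1  # 1025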

This mechanism has proved to be fragile in the past, but appears to be workable if not perfect.

Used by process_block()
See also Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD

Operations

def process_operations(state: BeaconState, body: BeaconBlockBody) -> None:
    # Verify that outstanding deposits are processed up to the maximum number of deposits
    assert len(body.deposits) == min(MAX_DEPOSITS, state.eth1_data.deposit_count - state.eth1_deposit_index)

    def for_ops(operations: Sequence[Any], fn: Callable[[BeaconState, Any], None]) -> None:
        for operation in operations:
            fn(state, operation)

    for_ops(body.proposer_slashings, process_proposer_slashing)
    for_ops(body.attester_slashings, process_attester_slashing)
    for_ops(body.attestations, process_attestation)
    for_ops(body.deposits, process_deposit)
    for_ops(body.voluntary_exits, process_voluntary_exit)
    for_ops(body.bls_to_execution_changes, process_bls_to_execution_change)  # [New in Capella]

Just a dispatcher for handling the various optional contents in a block.

Deposits are optional only in the sense that some blocks have them and some don't. However, as per the assert statement, if, according to the beacon chain's view of the Eth1 chain, there are deposits pending, then the block must include them, otherwise the block is invalid.
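
As a hedged illustration of the assert (MAX_DEPOSITS is the mainnet preset; the other figures are hypothetical):

MAX_DEPOSITS = 16

# The beacon chain's view of the deposit contract (hypothetical numbers).
deposit_count = 100       # state.eth1_data.deposit_count
eth1_deposit_index = 90   # deposits already processed by the beacon chain

# The block must contain exactly this many deposits, or it is invalid.
required_deposits = min(MAX_DEPOSITS, deposit_count - eth1_deposit_index)  # 10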

Regarding incentives for block proposers to include each of these elements:

  • Proposers are explicitly rewarded for including any available attestations and slashing reports.
  • There is a validity condition, and thus an implicit reward, related to including deposit messages.
  • The incentive for including voluntary exits is that a smaller validator set means higher rewards for the remaining validators.
  • There is no incentive, implicit or explicit, for including BLS withdrawal credential change messages. These are handled on a purely altruistic basis.

Used by process_block()
Uses process_proposer_slashing(), process_attester_slashing(), process_attestation(), process_deposit(), process_voluntary_exit(), process_bls_to_execution_change()
See also BeaconBlockBody

Proposer slashings

def process_proposer_slashing(state: BeaconState, proposer_slashing: ProposerSlashing) -> None:
    header_1 = proposer_slashing.signed_header_1.message
    header_2 = proposer_slashing.signed_header_2.message

    # Verify header slots match
    assert header_1.slot == header_2.slot
    # Verify header proposer indices match
    assert header_1.proposer_index == header_2.proposer_index
    # Verify the headers are different
    assert header_1 != header_2
    # Verify the proposer is slashable
    proposer = state.validators[header_1.proposer_index]
    assert is_slashable_validator(proposer, get_current_epoch(state))
    # Verify signatures
    for signed_header in (proposer_slashing.signed_header_1, proposer_slashing.signed_header_2):
        domain = get_domain(state, DOMAIN_BEACON_PROPOSER, compute_epoch_at_slot(signed_header.message.slot))
        signing_root = compute_signing_root(signed_header.message, domain)
        assert bls.Verify(proposer.pubkey, signing_root, signed_header.signature)

    slash_validator(state, header_1.proposer_index)

A ProposerSlashing is a proof that a proposer has signed two blocks at the same height. Up to MAX_PROPOSER_SLASHINGS of them may be included in a block. It contains the evidence in the form of a pair of SignedBeaconBlockHeaders.

The proof is simple: the two proposals come from the same slot, have the same proposer, but differ in one or more of parent_root, state_root, or body_root. In addition, they were both signed by the proposer. The conflicting blocks do not need to be valid: any pair of headers that meets the criteria, irrespective of the blocks' contents, makes the proposer liable to being slashed.

As ever, the assert statements ensure that the containing block is invalid if it contains any invalid slashing claims.

Fun fact: the first slashing to occur on the beacon chain was a proposer slashing. Two clients running side-by-side with the same keys will often produce the same attestations since the protocol is designed to encourage that. Independently producing the same block is very unlikely as blocks contain much more data.

Used by process_block()
Uses is_slashable_validator(), get_domain(), compute_signing_root(), bls.Verify(), slash_validator()
See also ProposerSlashing

Attester slashings

def process_attester_slashing(state: BeaconState, attester_slashing: AttesterSlashing) -> None:
    attestation_1 = attester_slashing.attestation_1
    attestation_2 = attester_slashing.attestation_2
    assert is_slashable_attestation_data(attestation_1.data, attestation_2.data)
    assert is_valid_indexed_attestation(state, attestation_1)
    assert is_valid_indexed_attestation(state, attestation_2)

    slashed_any = False
    indices = set(attestation_1.attesting_indices).intersection(attestation_2.attesting_indices)
    for index in sorted(indices):
        if is_slashable_validator(state.validators[index], get_current_epoch(state)):
            slash_validator(state, index)
            slashed_any = True
    assert slashed_any

AttesterSlashings are similar to proposer slashings in that they just provide the evidence of the two aggregate IndexedAttestations that conflict with each other. Up to MAX_ATTESTER_SLASHINGS of them may be included in a block.

The validity checking is done by is_slashable_attestation_data(), which checks the double vote and surround vote conditions, and by is_valid_indexed_attestation() which verifies the signatures on the attestations.

Any validators that appear in both attestations are slashed. If no validator is slashed, then the attester slashing claim was not valid after all, and therefore its containing block is invalid.

Examples: a double vote attester slashing; surround vote attester slashings.

Used by process_block()
Uses is_slashable_attestation_data(), is_valid_indexed_attestation(), is_slashable_validator(), slash_validator()
See also AttesterSlashing

Attestations

def process_attestation(state: BeaconState, attestation: Attestation) -> None:
    data = attestation.data
    assert data.target.epoch in (get_previous_epoch(state), get_current_epoch(state))
    assert data.target.epoch == compute_epoch_at_slot(data.slot)
    assert data.slot + MIN_ATTESTATION_INCLUSION_DELAY <= state.slot <= data.slot + SLOTS_PER_EPOCH
    assert data.index < get_committee_count_per_slot(state, data.target.epoch)

    committee = get_beacon_committee(state, data.slot, data.index)
    assert len(attestation.aggregation_bits) == len(committee)

    # Participation flag indices
    participation_flag_indices = get_attestation_participation_flag_indices(state, data, state.slot - data.slot)

    # Verify signature
    assert is_valid_indexed_attestation(state, get_indexed_attestation(state, attestation))

    # Update epoch participation flags
    if data.target.epoch == get_current_epoch(state):
        epoch_participation = state.current_epoch_participation
    else:
        epoch_participation = state.previous_epoch_participation

    proposer_reward_numerator = 0
    for index in get_attesting_indices(state, data, attestation.aggregation_bits):
        for flag_index, weight in enumerate(PARTICIPATION_FLAG_WEIGHTS):
            if flag_index in participation_flag_indices and not has_flag(epoch_participation[index], flag_index):
                epoch_participation[index] = add_flag(epoch_participation[index], flag_index)
                proposer_reward_numerator += get_base_reward(state, index) * weight

    # Reward proposer
    proposer_reward_denominator = (WEIGHT_DENOMINATOR - PROPOSER_WEIGHT) * WEIGHT_DENOMINATOR // PROPOSER_WEIGHT
    proposer_reward = Gwei(proposer_reward_numerator // proposer_reward_denominator)
    increase_balance(state, get_beacon_proposer_index(state), proposer_reward)

Block proposers are rewarded here for including attestations during block processing, while attesting validators receive their rewards and penalties during epoch processing.

This routine processes each attestation included in the block. First a bunch of validity checks are performed. If any of these fails, then the whole block is invalid (it is most likely from a proposer on a different fork, and so useless to us):

  • The target vote of the attestation must be either the previous epoch's checkpoint or the current epoch's checkpoint.
  • The target checkpoint and the attestation's slot must belong to the same epoch.
  • The attestation must be no newer than MIN_ATTESTATION_INCLUSION_DELAY slots, which is one. So this condition rules out attestations from the current or future slots.
  • The attestation must be no older than SLOTS_PER_EPOCH slots, which is 32.16
  • The attestation must come from a committee that existed when the attestation was created.
  • The size of the committee and the size of the aggregate must match (aggregation_bits).
  • The (aggregate) signature on the attestation must be valid and must correspond to the aggregated public keys of the validators that it claims to be signed by. This (and other criteria) is checked by is_valid_indexed_attestation().

Once the attestation has passed the checks it is processed by converting the votes from validators that it contains into flags in the state.

It's easy to skip over amidst all the checking, but the actual attestation processing is done by get_attestation_participation_flag_indices(). This takes the source, target, and head votes of the attestation, along with its inclusion delay (how many slots late was it included in a block) and returns a list of up to three flags corresponding to the votes that were both correct and timely, in participation_flag_indices.

For each validator that signed the attestation, if each flag in participation_flag_indices is not already set for it in its epoch_participation record, then the flag is set, and the proposer is rewarded. Recall that the validator making the attestation is not rewarded until the end of the epoch. If the flag is already set in the corresponding epoch for a validator, no proposer reward is accumulated: the attestation for this validator was included in an earlier block.
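
For reference, the flag helpers used here are simple bit operations on the validator's ParticipationFlags entry, roughly as defined in the Altair spec:

def add_flag(flags: ParticipationFlags, flag_index: int) -> ParticipationFlags:
    # Set the bit corresponding to ``flag_index``.
    flag = ParticipationFlags(2**flag_index)
    return flags | flag

def has_flag(flags: ParticipationFlags, flag_index: int) -> bool:
    # Check whether the bit corresponding to ``flag_index`` is set.
    flag = ParticipationFlags(2**flag_index)
    return flags & flag == flag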

The proposer reward is accumulated, and weighted according to the weight assigned to each of the flags (timely source, timely target, timely head).

If a proposer includes all the attestations for a single slot, and all the relevant validators vote, then its reward will be, in the notation established earlier,

I_{A_P} = \frac{W_p}{32(W_{\Sigma} - W_p)} I_A

Where I_A is the total maximum reward per epoch for attesters, calculated in get_flag_index_deltas(). The total available reward in an epoch for proposers including attestations is 32 times this.
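
To put rough numbers on this, the following sketch uses the incentivization weight constants from the spec (assumed here rather than restated above):

WEIGHT_DENOMINATOR = 64
PROPOSER_WEIGHT = 8
TIMELY_SOURCE_WEIGHT = 14
TIMELY_TARGET_WEIGHT = 26
TIMELY_HEAD_WEIGHT = 14

# The denominator computed in process_attestation().
proposer_reward_denominator = (WEIGHT_DENOMINATOR - PROPOSER_WEIGHT) * WEIGHT_DENOMINATOR // PROPOSER_WEIGHT
assert proposer_reward_denominator == 448

# If a validator's attestation newly sets all three flags, the numerator gains
# base_reward * 54, so the proposer earns 54/448 (about 12%) of that
# validator's base reward.
full_weight = TIMELY_SOURCE_WEIGHT + TIMELY_TARGET_WEIGHT + TIMELY_HEAD_WEIGHT  # 54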

Used by process_operations()
Uses get_committee_count_per_slot(), get_beacon_committee(), get_attestation_participation_flag_indices(), is_valid_indexed_attestation(), get_indexed_attestation(), get_attesting_indices(), has_flag(), add_flag(), get_base_reward(), increase_balance()
See also Participation flag indices, PARTICIPATION_FLAG_WEIGHTS, get_flag_index_deltas()

Deposits

The code in this section handles deposit transactions that were included in a block. A deposit is created when a user transfers one or more ETH to the deposit contract. We need to check that the data sent with the deposit is valid. If it is, we either create a new validator record (for the first deposit for a validator) or update an existing record.

def get_validator_from_deposit(pubkey: BLSPubkey, withdrawal_credentials: Bytes32, amount: uint64) -> Validator:
    effective_balance = min(amount - amount % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)

    return Validator(
        pubkey=pubkey,
        withdrawal_credentials=withdrawal_credentials,
        activation_eligibility_epoch=FAR_FUTURE_EPOCH,
        activation_epoch=FAR_FUTURE_EPOCH,
        exit_epoch=FAR_FUTURE_EPOCH,
        withdrawable_epoch=FAR_FUTURE_EPOCH,
        effective_balance=effective_balance,
    )

Create a newly initialised validator object based on deposit data. This was factored out of process_deposit() for better code reuse between the Phase 0 spec and the (now deprecated) sharding spec.

The pubkey is supplied in the initial deposit transaction. The depositor generates the validator's public key from its private key.

Used by apply_deposit()
See also Validator, FAR_FUTURE_EPOCH, EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE

def apply_deposit(state: BeaconState,
                  pubkey: BLSPubkey,
                  withdrawal_credentials: Bytes32,
                  amount: uint64,
                  signature: BLSSignature) -> None:
    validator_pubkeys = [validator.pubkey for validator in state.validators]
    if pubkey not in validator_pubkeys:
        # Verify the deposit signature (proof of possession) which is not checked by the deposit contract
        deposit_message = DepositMessage(
            pubkey=pubkey,
            withdrawal_credentials=withdrawal_credentials,
            amount=amount,
        )
        domain = compute_domain(DOMAIN_DEPOSIT)  # Fork-agnostic domain since deposits are valid across forks
        signing_root = compute_signing_root(deposit_message, domain)
        # Initialize validator if the deposit signature is valid
        if bls.Verify(pubkey, signing_root, signature):
            state.validators.append(get_validator_from_deposit(pubkey, withdrawal_credentials, amount))
            state.balances.append(amount)
            # [New in Altair]
            state.previous_epoch_participation.append(ParticipationFlags(0b0000_0000))
            state.current_epoch_participation.append(ParticipationFlags(0b0000_0000))
            state.inactivity_scores.append(uint64(0))
    else:
        # Increase balance by deposit amount
        index = ValidatorIndex(validator_pubkeys.index(pubkey))
        increase_balance(state, index, amount)

The apply_deposit() function was factored out of process_deposit() in the Capella release for better code reuse between the Phase 0 spec and the EIP-6110 spec17.

Deposits are signed with the private key of the depositing validator, and the corresponding public key is included in the deposit data. This constitutes a "proof of possession" of the private key, and prevents nastiness like the rogue key attack. Note that compute_domain() is used directly here when validating the deposit's signature, rather than the more usual get_domain() wrapper. This is because deposit messages are valid across beacon chain upgrades (such as Phase 0, Altair, and Bellatrix), so we don't want to mix the fork version into the domain. In addition, deposits can be made before genesis_validators_root is known.
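
A short sketch of the difference, using the Phase 0 helpers (the state variable is a hypothetical BeaconState; the defaults shown are what compute_domain() falls back to when the arguments are omitted):

# Deposit signatures: built from the genesis fork version and an empty
# genesis_validators_root, so the domain is identical on every fork.
deposit_domain = compute_domain(DOMAIN_DEPOSIT)
assert deposit_domain == compute_domain(DOMAIN_DEPOSIT, GENESIS_FORK_VERSION, Root())

# Most other signatures: get_domain() mixes in the fork version in force at the
# relevant epoch, plus the chain's genesis_validators_root, so the domain
# changes at every upgrade.
randao_domain = get_domain(state, DOMAIN_RANDAO)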

An interesting quirk of this routine is that only the first deposit for a validator needs to be signed. Subsequent deposits for the same public key do not have their signatures checked. This could allow one staker (the key holder) to make an initial deposit (1 ETH, say), and for that to be topped up by others who do not have the private key. I don't know of any practical uses for this feature, but would be glad to hear of any. It slightly reduces the risk for stakers making multiple deposits for the same validator as they don't need to worry about incorrectly signing any but the first deposit.

Similarly, once a validator's withdrawal credentials have been set by the initial deposit transaction, the withdrawal credentials of subsequent deposits for the same validator are ignored. Only the credentials appearing on the initial deposit are stored on the beacon chain. This is an important security measure. If an attacker steals a validator's signing key (which signs deposit transactions), we don't want them to be able to change the withdrawal credentials in order to steal the stake for themselves. However, it works both ways, and a vulnerability was identified for staking pools in which a malicious operator could potentially front-run a deposit transaction with a 1 ETH deposit to set the withdrawal credentials to their own.

Note that the withdrawal_credentials in the deposit data are not checked in any way. It's up to the depositor to ensure that they are using the correct prefix and contents to be able to receive their rewards and retrieve their stake after exiting the consensus layer.

Used by process_deposit()
Uses compute_domain(), compute_signing_root(), bls.Verify(), get_validator_from_deposit()
See also DepositMessage, DOMAIN_DEPOSIT

def process_deposit(state: BeaconState, deposit: Deposit) -> None:
    # Verify the Merkle branch
    assert is_valid_merkle_branch(
        leaf=hash_tree_root(deposit.data),
        branch=deposit.proof,
        depth=DEPOSIT_CONTRACT_TREE_DEPTH + 1,  # Add 1 for the List length mix-in
        index=state.eth1_deposit_index,
        root=state.eth1_data.deposit_root,
    )

    # Deposits must be processed in order
    state.eth1_deposit_index += 1

    apply_deposit(
        state=state,
        pubkey=deposit.data.pubkey,
        withdrawal_credentials=deposit.data.withdrawal_credentials,
        amount=deposit.data.amount,
        signature=deposit.data.signature,
    )

Here, we process a deposit from a block. If the deposit is valid, either a new validator is created or the deposit amount is added to an existing validator.

The call to is_valid_merkle_branch() ensures that it is not possible to fake a deposit. The eth1data.deposit_root from the deposit contract has been agreed by the beacon chain and includes all pending deposits visible to the beacon chain. The deposit itself contains a Merkle proof that it is included in that root. The state.eth1_deposit_index counter ensures that deposits are processed in order. In short, the proposer provides leaf and branch, but neither index nor root.
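
For reference, the verification itself is the generic Phase 0 Merkle proof helper, which looks roughly like this:

def is_valid_merkle_branch(leaf: Bytes32, branch: Sequence[Bytes32], depth: uint64, index: uint64, root: Root) -> bool:
    # Hash the leaf together with each sibling in the branch, left or right
    # according to the bits of ``index``, and compare with the expected root.
    value = leaf
    for i in range(depth):
        if index // (2**i) % 2:
            value = hash(branch[i] + value)
        else:
            value = hash(value + branch[i])
    return value == root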

If the Merkle branch check fails, then the whole block is invalid. However, individual deposits can fail the signature check without invalidating the block.

Deposits must be processed in order, and all available deposits must be included in the block (up to MAX_DEPOSITS - checked in process_operations()). This ensures that the beacon chain cannot censor deposit transactions, except at the expense of stopping block production entirely.

Used by process_operations()
Uses is_valid_merkle_branch(), hash_tree_root(), apply_deposit()
See also Deposit, DEPOSIT_CONTRACT_TREE_DEPTH
Voluntary exits

def process_voluntary_exit(state: BeaconState, signed_voluntary_exit: SignedVoluntaryExit) -> None:
    voluntary_exit = signed_voluntary_exit.message
    validator = state.validators[voluntary_exit.validator_index]
    # Verify the validator is active
    assert is_active_validator(validator, get_current_epoch(state))
    # Verify exit has not been initiated
    assert validator.exit_epoch == FAR_FUTURE_EPOCH
    # Exits must specify an epoch when they become valid; they are not valid before then
    assert get_current_epoch(state) >= voluntary_exit.epoch
    # Verify the validator has been active long enough
    assert get_current_epoch(state) >= validator.activation_epoch + SHARD_COMMITTEE_PERIOD
    # Verify signature
    domain = get_domain(state, DOMAIN_VOLUNTARY_EXIT, voluntary_exit.epoch)
    signing_root = compute_signing_root(voluntary_exit, domain)
    assert bls.Verify(validator.pubkey, signing_root, signed_voluntary_exit.signature)
    # Initiate exit
    initiate_validator_exit(state, voluntary_exit.validator_index)

A voluntary exit message is submitted by a validator to indicate that it wishes to cease being an active validator. A proposer receives voluntary exit messages via gossip or via its own API and then includes the message in a block so that it can be processed by the network.

Most of the checks are straightforward, as per the comments in the code. Note the following.

  • Voluntary exits are invalid if they are included in blocks before the given epoch, so nodes should buffer any future-dated exits they see before putting them in a block.
  • A validator must have been active for at least SHARD_COMMITTEE_PERIOD epochs (27 hours). See there for the rationale.
  • Voluntary exits are signed with the validator's usual signing key. There is some discussion about changing this to also allow signing of a voluntary exit with the validator's withdrawal key.

If the voluntary exit message is valid then the validator is added to the exit queue by calling initiate_validator_exit().

At present, it is not possible for a validator to exit and re-enter, but this functionality may be introduced in future.

Used by process_operations()
Uses is_active_validator(), get_domain(), compute_signing_root(), bls.Verify(), initiate_validator_exit()
See also VoluntaryExit, SHARD_COMMITTEE_PERIOD

process_bls_to_execution_change

def process_bls_to_execution_change(state: BeaconState,
                                    signed_address_change: SignedBLSToExecutionChange) -> None:
    address_change = signed_address_change.message

    assert address_change.validator_index < len(state.validators)

    validator = state.validators[address_change.validator_index]

    assert validator.withdrawal_credentials[:1] == BLS_WITHDRAWAL_PREFIX
    assert validator.withdrawal_credentials[1:] == hash(address_change.from_bls_pubkey)[1:]

    # Fork-agnostic domain since address changes are valid across forks
    domain = compute_domain(DOMAIN_BLS_TO_EXECUTION_CHANGE, genesis_validators_root=state.genesis_validators_root)
    signing_root = compute_signing_root(address_change, domain)
    assert bls.Verify(address_change.from_bls_pubkey, signing_root, signed_address_change.signature)

    validator.withdrawal_credentials = (
        ETH1_ADDRESS_WITHDRAWAL_PREFIX
        + b'\x00' * 11
        + address_change.to_execution_address
    )

The Capella upgrade provides a one-time operation to allow stakers to change their withdrawal credentials from BLS type (BLS_WITHDRAWAL_PREFIX), which do not allow withdrawals, to Eth1 style (ETH1_ADDRESS_WITHDRAWAL_PREFIX), which enable automatic withdrawals.

Stakers can make the change by signing a BLSToExecutionChange message and broadcasting it to the network. At some point a proposer will include the change message in a block and it will arrive at this function in the state transition.

For BLS credentials the withdrawal credential contains the last 31 bytes of the SHA256 hash of a public key. That public key is the validator's withdrawal key, distinct from its signing key, although often derived from the same mnemonic. By checking its hash, we are confirming that the public key provided in the change message is the same one that created the withdrawal credential in the initial deposit.
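
A sketch of the relationship being checked, with withdrawal_pubkey standing in for the BLS withdrawal public key chosen at deposit time (hypothetical variable names, spec helpers assumed):

# At deposit time, BLS-style credentials were constructed as the 0x00 prefix
# followed by the last 31 bytes of the hash of the withdrawal public key.
withdrawal_credentials = BLS_WITHDRAWAL_PREFIX + hash(withdrawal_pubkey)[1:]

# process_bls_to_execution_change() recomputes the hash from the public key in
# the change message and compares it with what is stored in the state.
assert validator.withdrawal_credentials[1:] == hash(address_change.from_bls_pubkey)[1:]

# After a successful change the credentials become the 0x01 prefix, eleven zero
# bytes, and the 20-byte execution layer address.
new_credentials = ETH1_ADDRESS_WITHDRAWAL_PREFIX + b'\x00' * 11 + address_change.to_execution_address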

Once we are satisfied that the public key is the same one previously committed to, we can use it to verify the signature on the change message. Again, this message must be signed with the validator's withdrawal private key, not its usual signing key.

Having verified the signature, we can finally, and irrevocably, update the validator's withdrawal credentials from BLS style to Eth1 style.

Used by process_operations()
Uses compute_signing_root(), compute_domain(), bls.Verify()
See also BLS_WITHDRAWAL_PREFIX, BLSToExecutionChange

Sync aggregate processing

def process_sync_aggregate(state: BeaconState, sync_aggregate: SyncAggregate) -> None:
    # Verify sync committee aggregate signature signing over the previous slot block root
    committee_pubkeys = state.current_sync_committee.pubkeys
    participant_pubkeys = [pubkey for pubkey, bit in zip(committee_pubkeys, sync_aggregate.sync_committee_bits) if bit]
    previous_slot = max(state.slot, Slot(1)) - Slot(1)
    domain = get_domain(state, DOMAIN_SYNC_COMMITTEE, compute_epoch_at_slot(previous_slot))
    signing_root = compute_signing_root(get_block_root_at_slot(state, previous_slot), domain)
    assert eth_fast_aggregate_verify(participant_pubkeys, signing_root, sync_aggregate.sync_committee_signature)

    # Compute participant and proposer rewards
    total_active_increments = get_total_active_balance(state) // EFFECTIVE_BALANCE_INCREMENT
    total_base_rewards = Gwei(get_base_reward_per_increment(state) * total_active_increments)
    max_participant_rewards = Gwei(total_base_rewards * SYNC_REWARD_WEIGHT // WEIGHT_DENOMINATOR // SLOTS_PER_EPOCH)
    participant_reward = Gwei(max_participant_rewards // SYNC_COMMITTEE_SIZE)
    proposer_reward = Gwei(participant_reward * PROPOSER_WEIGHT // (WEIGHT_DENOMINATOR - PROPOSER_WEIGHT))

    # Apply participant and proposer rewards
    all_pubkeys = [v.pubkey for v in state.validators]
    committee_indices = [ValidatorIndex(all_pubkeys.index(pubkey)) for pubkey in state.current_sync_committee.pubkeys]
    for participant_index, participation_bit in zip(committee_indices, sync_aggregate.sync_committee_bits):
        if participation_bit:
            increase_balance(state, participant_index, participant_reward)
            increase_balance(state, get_beacon_proposer_index(state), proposer_reward)
        else:
            decrease_balance(state, participant_index, participant_reward)

Similarly to how attestations are handled, the beacon block proposer includes in its block an aggregation of sync committee votes that agree with its local view of the chain. Specifically, the sync committee votes are for the head block that the proposer saw in the previous slot. (If the previous slot is empty, then the head block will be from an earlier slot.)

We validate these votes against our local view of the chain, and if they agree then we reward the participants that voted. If they do not agree with our local view, then the entire block is invalid: it is on another branch.

To perform the validation, we form the signing root of the block at the previous slot, with DOMAIN_SYNC_COMMITTEE mixed in. Then we check if the aggregate signature received in the SyncAggregate verifies against it, using the aggregate public key of the validators who claimed to have signed it. If either the signing root (that is, the head block) is wrong, or the list of participants is wrong, then the verification will fail and the block is invalid.

Like proposer rewards, but unlike attestation rewards, sync committee rewards are not weighted with the participants' effective balances. This is already taken care of by the committee selection process that weights the probability of selection with the effective balance of the validator.

Running through the calculations:

  • total_active_increments: the sum of the effective balances of the entire active validator set normalised with the EFFECTIVE_BALANCE_INCREMENT to give the total number of increments.
  • total_base_rewards: the maximum rewards that will be awarded to all validators for all duties this epoch. It is at most NB in the notation established earlier.
  • max_participant_rewards: the amount of the total reward to be given to the entire sync committee in this slot.
  • participant_reward: the reward per participating validator, and the penalty per non-participating validator.
  • proposer_reward: one seventh of the participant reward.

Each committee member that voted receives a reward of participant_reward, and the proposer receives one seventh of this in addition.

Each committee member that failed to vote receives a penalty of participant_reward, and the proposer receives nothing.
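
Plugging in the mainnet preset values (assumed here), plus some illustrative figures for the state of the validator set, gives a feel for the sizes involved:

EFFECTIVE_BALANCE_INCREMENT = 10**9   # 1 ETH, in Gwei
SYNC_REWARD_WEIGHT = 2
PROPOSER_WEIGHT = 8
WEIGHT_DENOMINATOR = 64
SLOTS_PER_EPOCH = 32
SYNC_COMMITTEE_SIZE = 512

# Illustrative figures, roughly corresponding to 10 million ETH staked.
total_active_balance = 10_000_000 * 10**9   # Gwei
base_reward_per_increment = 640             # Gwei, per 1 ETH increment per epoch

total_active_increments = total_active_balance // EFFECTIVE_BALANCE_INCREMENT
total_base_rewards = base_reward_per_increment * total_active_increments
max_participant_rewards = total_base_rewards * SYNC_REWARD_WEIGHT // WEIGHT_DENOMINATOR // SLOTS_PER_EPOCH
participant_reward = max_participant_rewards // SYNC_COMMITTEE_SIZE  # 12207 Gwei per member per slot
proposer_reward = participant_reward * PROPOSER_WEIGHT // (WEIGHT_DENOMINATOR - PROPOSER_WEIGHT)  # 1743 Gwei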

In our notation the maximum issuance (reward) due to sync committees per slot is as follows.

I_S = \frac{W_y}{32 \cdot W_{\Sigma}} NB

The per-epoch reward is thirty-two times this. The maximum reward for the proposer in respect of sync aggregates:

I_{S_P} = \frac{W_p}{W_{\Sigma} - W_p} I_S

Used by process_block()
Uses get_domain(), compute_signing_root(), eth_fast_aggregate_verify(), get_total_active_balance(), get_base_reward_per_increment(), increase_balance(), decrease_balance()
See also Incentivization weights, SYNC_COMMITTEE_SIZE

Initialise State

Introduction

TODO: rework and synthesis - this text is from the original Genesis.

Before the Ethereum beacon chain genesis has been triggered, and for every Ethereum proof-of-work block, let candidate_state = initialize_beacon_state_from_eth1(eth1_block_hash, eth1_timestamp, deposits) where:

  • eth1_block_hash is the hash of the Ethereum proof-of-work block
  • eth1_timestamp is the Unix timestamp corresponding to eth1_block_hash
  • deposits is the sequence of all deposits, ordered chronologically, up to (and including) the block with hash eth1_block_hash

Proof of work blocks must only be considered once they are at least SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE seconds old (i.e. eth1_timestamp + SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE <= current_unix_time). Due to this constraint, if GENESIS_DELAY < SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE, then the genesis_time can happen before the time/state is first known. Values should be configured to avoid this case.

Initialisation

Aka genesis.

This helper function is only for initializing the state for pure Capella testnets and tests.

def initialize_beacon_state_from_eth1(eth1_block_hash: Hash32,
                                      eth1_timestamp: uint64,
                                      deposits: Sequence[Deposit],
                                      execution_payload_header: ExecutionPayloadHeader=ExecutionPayloadHeader()
                                      ) -> BeaconState:
    fork = Fork(
        previous_version=CAPELLA_FORK_VERSION,  # [Modified in Capella] for testing only
        current_version=CAPELLA_FORK_VERSION,  # [Modified in Capella]
        epoch=GENESIS_EPOCH,
    )
    state = BeaconState(
        genesis_time=eth1_timestamp + GENESIS_DELAY,
        fork=fork,
        eth1_data=Eth1Data(block_hash=eth1_block_hash, deposit_count=uint64(len(deposits))),
        latest_block_header=BeaconBlockHeader(body_root=hash_tree_root(BeaconBlockBody())),
        randao_mixes=[eth1_block_hash] * EPOCHS_PER_HISTORICAL_VECTOR,  # Seed RANDAO with Eth1 entropy
    )

    # Process deposits
    leaves = list(map(lambda deposit: deposit.data, deposits))
    for index, deposit in enumerate(deposits):
        deposit_data_list = List[DepositData, 2**DEPOSIT_CONTRACT_TREE_DEPTH](*leaves[:index + 1])
        state.eth1_data.deposit_root = hash_tree_root(deposit_data_list)
        process_deposit(state, deposit)

    # Process activations
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        validator.effective_balance = min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
        if validator.effective_balance == MAX_EFFECTIVE_BALANCE:
            validator.activation_eligibility_epoch = GENESIS_EPOCH
            validator.activation_epoch = GENESIS_EPOCH

    # Set genesis validators root for domain separation and chain versioning
    state.genesis_validators_root = hash_tree_root(state.validators)

    # Fill in sync committees
    # Note: A duplicate committee is assigned for the current and next committee at genesis
    state.current_sync_committee = get_next_sync_committee(state)
    state.next_sync_committee = get_next_sync_committee(state)

    # Initialize the execution payload header
    state.latest_execution_payload_header = execution_payload_header

    return state

Each state field starts with its default SSZ value unless a value is explicitly provided. So, for example, state.next_withdrawal_index will be initialised to zero, and state.historical_summaries to an empty list.

Genesis state

Let genesis_state = candidate_state whenever is_valid_genesis_state(candidate_state) is True for the first time.

def is_valid_genesis_state(state: BeaconState) -> bool:
    if state.genesis_time < MIN_GENESIS_TIME:
        return False
    if len(get_active_validator_indices(state, GENESIS_EPOCH)) < MIN_GENESIS_ACTIVE_VALIDATOR_COUNT:
        return False
    return True

TODO

Genesis block

Let genesis_block = BeaconBlock(state_root=hash_tree_root(genesis_state)).

TODO

Fork Choice

Introduction

The beacon chain's fork choice is documented separately from the main state transition specification. Like the main specification, the fork choice spec is incremental, with later versions specifying only the changes since the previous version. When annotating the main spec I combined the incremental versions into a single up-to-date document. In the following, however, I will deal separately with the original Phase 0 fork choice and the incremental Bellatrix fork choice update as the latter mainly introduced one-off functionality specific to the Merge transition.

What's a fork choice?

As described in the introduction to consensus, a fork choice rule is the means by which a node decides, given the information available to it, which block is the "best" head of the chain. A good fork choice rule results in the network of nodes eventually converging on the same canonical chain: it is able to resolve forks consistently, even under a degree of faulty or adversarial behaviour.

Ethereum's proof of stake consensus introduces a Store object that contains all the data necessary for determining a best head. A node's Store is the "source of truth" for its fork choice rule. In classical consensus terms it is a node's local view: all the relevant information that a node has about the network state. The fork choice rule can be characterised as a function, GetHead(Store) → HeadBlock.

During the Merge event, the beacon chain's fork choice was temporarily augmented to be able to consider blocks on the Eth1 chain, in order to agree which (of potentially multiple candidates) would become the terminal proof of work block.

Overview

Ethereum's fork choice comprises the LMD GHOST fork choice rule, modified by (constrained by) the Casper FFG fork choice rule. The Casper FFG rule modifies the LMD GHOST fork choice by only allowing blocks descended from the last finalised18 checkpoint to be candidates for the chain head. All earlier branches are effectively pruned out of a node's local view of the network state.

Diagram of a block tree showing that Casper FFG finalises the early chain up to a checkpoint and LMD GHOST handles fork choice after that.

Casper FFG's role is to finalise a checkpoint. History prior to the finalised checkpoint is a linear chain of blocks with all branches pruned away. LMD GHOST is used to select the best head block at any time. LMD GHOST is constrained by Casper FFG in that it operates on the block tree only after the finalised checkpoint.

This combination has come to be known as "Gasper", and appears to be relatively simple at first sight. However, the emergence of various edge cases, and a relentless stream of potential attacks has led third party researchers to declare that "The Gasper protocol is complex". And that remark was made before implementing many of the fixes that we'll be reviewing in the following sections. Vitalik himself has written that

The "interface" between Casper FFG finalization and LMD GHOST fork choice is a source of significant complexity, leading to a number of attacks that have required fairly complicated patches to fix, with more weaknesses being regularly discovered.

Despite all this, we are happily running Ethereum on top of the Gasper protocol today. We continue to incrementally add defences against known attacks, and one day we may move on from Gasper entirely - perhaps to a single slot finality protocol, or to Casper CBC. Meanwhile, Gasper is proving to be "good enough" in practice.19

Scope and terminology

These fork choice specification documents don't cover the whole mechanism. They are largely concerned only with the LMD GHOST fork choice; the Casper FFG side of things (justification and finalisation) is dealt with in the main state-transition specification.

The terms attestation, vote, and message appear frequently. An attestation is a collection of three votes: a vote for a source checkpoint, a vote for a target checkpoint, and a vote for a head block. The source and target votes are used by Casper FFG, and the head vote is used by LMD GHOST. We will mostly be concerned with head votes in the following sections, except when stated otherwise. LMD GHOST head votes are also called messages, being the "M" in "LMD".

Where we discuss attestations, they can be a single attestation from one validator, or aggregate attestations containing the attestations of multiple validators that made the same set of votes. It will be clear from the context which of these applies.

Decoding dev-speak

Sometimes you'll hear protocol devs say slightly obscure things like, "we can deal with that in fork choice". For example, "we can handle censorship via the fork choice".

This framing makes sense when we understand that a node's fork choice rule is its expression of which chain it prefers to follow, or prefers not to follow. No honest node wants to follow a chain that contains invalid blocks (according to the state transition), so the fork choice of all honest nodes will never select a head block that has an invalid block in its ancestry.

Similarly, nodes could modify their fork choice rule so that branches with blocks that appear to censor transactions are never selected. If nodes with sufficient validators do this, then any such block will be orphaned, strongly discouraging censorship. This works both ways, of course. A government could declare that the fork choice must ignore any branches with blocks that do not censor transactions. If enough validators – over half – choose to comply, then the whole chain will become censoring.

The goal of the fork choice is for the network to converge onto a single history, so there is a strong incentive to try to agree with one's peers. However, it also provides a mechanism that can be used (perhaps as an outcome of social coordination) to be opinionated about what kind of blocks are eventually included in that history.

History

Proof of stake Ethereum has a long history that we shall review elsewhere. The following milestones are significant for the current Casper FFG plus LMD GHOST implementation.

Vitalik published the original mini-spec for the beacon chain's proof of stake consensus on July 31st 2018, shortly after we had abandoned prior designs for moving Ethereum to PoS. The initial design used IMD GHOST (Immediate Message Driven GHOST) in which attestations have a limited lifetime in the fork choice20. IMD GHOST was changed to LMD GHOST (Latest Message Driven GHOST) in November 2018 due to concerns about the stability property of IMD.

The initial fork choice spec was published to GitHub in April 2019, numbering a mere 96 lines. The current Phase 0 fork choice spec has 576 lines.

Various issues have caused the fork choice specification to balloon in complexity.

In August 2019, a "decoy flip-flop attack" on LMD GHOST was identified that could be used by an adversary to delay finalisation (for a limited period of time). The defence against this is to add a check that newly considered attestations are from either the current or previous epoch only. We'll cover this under validate_on_attestation().

In September 2019 a "bouncing attack" on Casper FFG was identified that can delay finalisation indefinitely. Up to the Capella spec release we had a fix for this that only allowed the fork choice's justified checkpoint to be updated during the early part of an epoch. The fix was removed in the Capella upgrade since it adds significant complexity to the fork choice, and in any case can be worked around by splitting honest validators' views. The bouncing attack is very difficult to set up and an adversary with the power to do this could probably attack the chain in more interesting ways. The bouncing attack and its original fix remain documented in the Bellatrix edition.

In July 2021, an edge case was identified in which (if 1/3 of validators were prepared to be slashed) the invariant that the store's justified checkpoint must be a descendant of the finalised checkpoint could become violated. A fix to the on_tick() handler was implemented to maintain the invariant.

In November 2021, some overly complicated logic was identified in the on_block() handler that could lead to the Store retaining inconsistent finalised and justified checkpoints, which would in turn cause filter_block_tree() to fail. Over one third of validators would have had to be slashed to trigger the fault, but the resulting fix turned out to be a nice simplification in any case.

Proposer boost was also added in November 2021. This is a defence against potential balancing attacks on LMD GHOST that could prevent Casper FFG from finalising. We'll cover this in detail in the proposer boost section.

A new type of balancing attack was published in January 2022 that relies on the attacker's validators making equivocating attestations (multiple different attestations at the same slot). To counter this, a defence against equivocating indices was added in March 2022. We'll discuss this when we get to the on_attester_slashing() handler. This defence was bolstered in the Capella spec update by excluding all slashed validators from having an influence in the fork choice.

Several issues involving "unrealised justification" were discovered during the first half of 2022. First, an unrealised justification reorg attack that allowed the proposer of the first block of an epoch to easily fork out up to nine blocks from the end of the previous epoch. A variant of that attack was also found to be able to cause validators to make slashable attestations. Second, a justification withholding attack that an adversary could use to reorg arbitrary numbers of blocks at the start of an epoch. These issues were addressed in the Capella spec update with the "pull up tips" and unrealised justification logic that it introduced.

A reader might infer from this catalogue of issues that the fork choice is fiendishly difficult to reason about, and the reader would not be wrong. Some long-overdue formal verification work on the fork choice rule has recently been completed. It seeks to prove certain desirable properties, such as that an honest validator following the rules can never make slashable attestations.

We will study each of the issues above in more detail as we work through the fork choice specification in the following two sections.

Note that the Capella upgrade included a substantial rewrite of the fork choice specification. The rewrite removed the bouncing attack fix and introduced the "pull up tips" defence against a new attack, among other things. The following sections are based on the updated Capella version, but the previous annotated fork choice remains available. All of these changes were quietly rolled out prior to Capella, buried within various client software updates, while the updated spec was held back until the Capella upgrade itself21. A public disclosure of the issues was made a few weeks after the Capella upgrade.

Phase 0 Fork Choice

This section covers the Phase 0 Fork Choice document. It is based on the Capella, v1.3.0, spec release version. For an alternative take, I recommend Vitalik's annotated fork choice document.

Block-quoted content below (with a sidebar) has been copied over verbatim from the specs repo, as has all the function code.

The head block root associated with a store is defined as get_head(store). At genesis, let store = get_forkchoice_store(genesis_state, genesis_block) and update store by running:

  • on_tick(store, time) whenever time > store.time where time is the current Unix time
  • on_block(store, block) whenever a block block: SignedBeaconBlock is received
  • on_attestation(store, attestation) whenever an attestation attestation is received
  • on_attester_slashing(store, attester_slashing) whenever an attester slashing attester_slashing is received

Any of the above handlers that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid. Invalid calls to handlers must not modify store.

Updates to the Store arise only through the four handler functions: on_tick(), on_block(), on_attestation(), and on_attester_slashing(). These are the four senses through which the fork choice gains its knowledge of the world.

Notes:

  1. Leap seconds: Slots will last SECONDS_PER_SLOT + 1 or SECONDS_PER_SLOT - 1 seconds around leap seconds. This is automatically handled by UNIX time.

Leap seconds will no longer occur after 2035. We can remove this note after that.

  2. Honest clocks: Honest nodes are assumed to have clocks synchronized within SECONDS_PER_SLOT seconds of each other.

In practice, the synchrony assumptions are stronger than this. Any node whose clock is more than SECONDS_PER_SLOT / INTERVALS_PER_SLOT (four seconds) adrift will suffer degraded performance and can be considered Byzantine (faulty), at least for the LMD GHOST fork choice.

  3. Eth1 data: The large ETH1_FOLLOW_DISTANCE specified in the honest validator document should ensure that state.latest_eth1_data of the canonical beacon chain remains consistent with the canonical Ethereum proof-of-work chain. If not, emergency manual intervention will be required.

Post-Merge, consistency between the execution and consensus layers is no longer an issue, although we retain the ETH1_FOLLOW_DISTANCE for now.

  4. Manual forks: Manual forks may arbitrarily change the fork choice rule but are expected to be enacted at epoch transitions, with the fork details reflected in state.fork.

Manual forks are sometimes called hard forks or upgrades, and are planned in advance and coordinated. They are different from the inadvertent forks that the fork choice rule is designed to resolve.

  5. Implementation: The implementation found in this specification is constructed for ease of understanding rather than for optimization in computation, space, or any other resource. A number of optimized alternatives can be found here.

After reading the spec you may be puzzled by the "ease of understanding" claim. However, it is certainly true that several of the algorithms are far from efficient, and a great deal of optimisation is needed for practical implementations.

Constant

Name Value
INTERVALS_PER_SLOT uint64(3)

Only blocks that arrive during the first 1 / INTERVALS_PER_SLOT of a slot's duration are eligible to have the proposer score boost added. This moment is the point in the slot at which validators are expected to publish attestations declaring their view of the head of the chain.

In the Ethereum consensus specification INTERVALS_PER_SLOT neatly divides SECONDS_PER_SLOT, and all time quantities are strictly uint64 numbers of seconds. However, other chains that run the same basic protocol as Ethereum might not have this property. For example, the Gnosis Beacon Chain has five-second slots. We changed Teku's internal clock from seconds to milliseconds to support this, which is technically off-spec, but nothing broke.

Configuration

Name Value
PROPOSER_SCORE_BOOST uint64(40)
  • The proposer score boost is worth PROPOSER_SCORE_BOOST percentage of the committee's weight, i.e., for slot with committee weight committee_weight the boost weight is equal to (committee_weight * PROPOSER_SCORE_BOOST) // 100.

Proposer boost is a modification to the fork choice rule that defends against a so-called balancing attack. When a timely block proposal is received, proposer boost temporarily adds a huge weight to that block's branch in the fork choice calculation, namely PROPOSER_SCORE_BOOST percent of the total effective balances of all the validators assigned to attest in that slot.
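
Following the formula quoted above, a quick sketch of the boost's size (the total balance figure is illustrative; committee_weight is the combined effective balance of all committees assigned to one slot, i.e. roughly one thirty-second of the total active balance):

PROPOSER_SCORE_BOOST = 40
SLOTS_PER_EPOCH = 32

total_active_balance = 10_000_000 * 10**9                   # 10 million ETH, in Gwei
committee_weight = total_active_balance // SLOTS_PER_EPOCH  # 312,500 ETH of weight per slot
proposer_score_boost = committee_weight * PROPOSER_SCORE_BOOST // 100

# The timely block's branch temporarily gains 125,000 ETH of extra weight.
assert proposer_score_boost == 125_000 * 10**9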

The value of PROPOSER_SCORE_BOOST has changed over time as the balancing attack has been analysed more thoroughly.

The basic trade-off in choosing a value for PROPOSER_SCORE_BOOST is between allowing an adversary to perform "ex-ante" or "ex-post" reorgs. Setting PROPOSER_SCORE_BOOST too high makes it easier for an adversarial proposer to perform ex-post reorgs - it gives the proposer disproportionate power compared with the votes of validators. Setting PROPOSER_SCORE_BOOST too low makes it easier for an adversary to perform ex-ante reorgs. Caspar Schwarz-Schilling covers these trade-offs nicely in his Liscon talk, The game of reorgs in PoS Ethereum.22

Helpers

LatestMessage

class LatestMessage(object):
    epoch: Epoch
    root: Root

This is just a convenience class for tracking the most recent head vote from each validator - the "LM" (latest message) in LMD GHOST. Epoch is a uint64 type, and Root is a Bytes32 type. The Store holds a mapping of validator indices to their latest messages.

Store

The Store is responsible for tracking information required for the fork choice algorithm. The important fields being tracked are described below:

  • justified_checkpoint: the justified checkpoint used as the starting point for the LMD GHOST fork choice algorithm.
  • finalized_checkpoint: the highest known finalized checkpoint. The fork choice only considers blocks that are not conflicting with this checkpoint.
  • unrealized_justified_checkpoint & unrealized_finalized_checkpoint: these track the highest justified & finalized checkpoints resp., without regard to whether on-chain realization has occurred, i.e. FFG processing of new attestations within the state transition function. This is an important distinction from justified_checkpoint & finalized_checkpoint, because they will only track the checkpoints that are realized on-chain. Note that on-chain processing of FFG information only happens at epoch boundaries.
  • unrealized_justifications: stores a map of block root to the unrealized justified checkpoint observed in that block.

These explanatory points were added in the Capella upgrade23. We will expand on them below in the appropriate places.

class Store(object):
    time: uint64
    genesis_time: uint64
    justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint
    unrealized_justified_checkpoint: Checkpoint
    unrealized_finalized_checkpoint: Checkpoint
    proposer_boost_root: Root
    equivocating_indices: Set[ValidatorIndex]
    blocks: Dict[Root, BeaconBlock] = field(default_factory=dict)
    block_states: Dict[Root, BeaconState] = field(default_factory=dict)
    checkpoint_states: Dict[Checkpoint, BeaconState] = field(default_factory=dict)
    latest_messages: Dict[ValidatorIndex, LatestMessage] = field(default_factory=dict)
    unrealized_justifications: Dict[Root, Checkpoint] = field(default_factory=dict)

A node's Store records all the fork choice related information that it has about the outside world. In more classical terms, the Store is the node's view of the network. The Store is updated only by the four handler functions.

The basic fields are as follows.

  • time: The wall-clock time (Unix time) of the last call to the on_tick() handler. In theory this is updated continuously; in practice it is updated at least two or three times per slot.
  • justified_checkpoint: Our node's view of the currently justified checkpoint.
  • finalized_checkpoint: Our node's view of the currently finalised checkpoint.
  • blocks: All the blocks that we know about that are descended from the finalized_checkpoint. The fork choice spec does not describe how to prune the Store, so we would end up with all blocks since genesis if we were to follow it precisely. However, only blocks descended from the last finalised checkpoint are ever considered in the fork choice, and the finalised checkpoint only increases in height. So it is safe for client implementations to remove from the Store all blocks (and their associated states) belonging to branches not descending from the last finalised checkpoint.
  • block_states: For every block in the Store, we also keep its corresponding (post-)state. These states are mostly used for information about justification and finalisation.
  • checkpoint_states: If there are empty slots immediately before a checkpoint then the checkpoint state will not correspond to a block state, so we store checkpoint states as well, indexed by Checkpoint rather than block root. The state at the last justified checkpoint is used for validator balances, and for validating attestations in the on_attester_slashing() handler.
  • latest_messages: The set of latest head votes from validators. When the on_attestation() handler processes a new head vote for a validator, it gets added to this set and the old vote is discarded.

The following fields were added at various times as new attacks and defences were found.

  • proposer_boost_root was added when proposer boost was implemented as a defence against the LMD balancing attack. It is set to the root of the current block for the duration of a slot, as long as that block arrived within the first third of a slot.
  • The equivocating_indices set was added to defend against the equivocation balancing attack. It contains the indices of any validators reported as having committed an attester slashing violation. These validators must be removed from consideration in the fork choice rule until the last justified checkpoint state catches up with the fact that the validators have been slashed.
  • The unrealized_justified_checkpoint and unrealized_finalized_checkpoint fields were added in the Capella update. They are used to avoid certain problems with unrealised justification that the old version of filter_block_tree() suffered.
  • Also added in the Capella update was unrealized_justifications, which is a map of block roots to unrealised justification checkpoints. It is maintained by compute_pulled_up_tip(). For every block, it stores the justified checkpoint that results from running process_justification_and_finalization() on the block's post-state. In the beacon state, that calculation is done only on epoch boundaries, so, within the fork choice, we call the result "unrealised".

For non-Pythonistas, Set and Dict are Python generic types. A Set is an unordered collection of objects; a Dict provides key–value look-up.

is_previous_epoch_justified

def is_previous_epoch_justified(store: Store) -> bool:
    current_slot = get_current_slot(store)
    current_epoch = compute_epoch_at_slot(current_slot)
    return store.justified_checkpoint.epoch + 1 == current_epoch

Based on the current time in the Store, this function returns True if the checkpoint at the start of the previous epoch has been justified - that is, has received a super-majority Casper FFG vote.

Used by filter_block_tree()

get_forkchoice_store

The provided anchor-state will be regarded as a trusted state, to not roll back beyond. This should be the genesis state for a full client.

Note: With regards to fork choice, block headers are interchangeable with blocks. The spec is likely to move to headers for reduced overhead in test vectors and better encapsulation. Full implementations store blocks as part of their database and will often use full blocks when dealing with production fork choice.

def get_forkchoice_store(anchor_state: BeaconState, anchor_block: BeaconBlock) -> Store:
    assert anchor_block.state_root == hash_tree_root(anchor_state)
    anchor_root = hash_tree_root(anchor_block)
    anchor_epoch = get_current_epoch(anchor_state)
    justified_checkpoint = Checkpoint(epoch=anchor_epoch, root=anchor_root)
    finalized_checkpoint = Checkpoint(epoch=anchor_epoch, root=anchor_root)
    proposer_boost_root = Root()
    return Store(
        time=uint64(anchor_state.genesis_time + SECONDS_PER_SLOT * anchor_state.slot),
        genesis_time=anchor_state.genesis_time,
        justified_checkpoint=justified_checkpoint,
        finalized_checkpoint=finalized_checkpoint,
        unrealized_justified_checkpoint=justified_checkpoint,
        unrealized_finalized_checkpoint=finalized_checkpoint,
        proposer_boost_root=proposer_boost_root,
        equivocating_indices=set(),
        blocks={anchor_root: copy(anchor_block)},
        block_states={anchor_root: copy(anchor_state)},
        checkpoint_states={justified_checkpoint: copy(anchor_state)},
        unrealized_justifications={anchor_root: justified_checkpoint}
    )

get_forkchoice_store() initialises the fork choice Store object from an anchor state and its corresponding block (header). As noted, the anchor state could be the genesis state. Equally, when using a checkpoint sync, the anchor state will be the finalised checkpoint state provided by the node operator, which will be treated as if it is a genesis state. In either case, the latest_messages store will be empty to begin with.
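
As a hedged sketch (the variable names here are illustrative, not from the spec), initialising the fork choice after a checkpoint sync might look like this:

    # finalized_state and finalized_block are assumed to have been fetched from a
    # trusted checkpoint provider, and to satisfy
    # finalized_block.state_root == hash_tree_root(finalized_state).
    store = get_forkchoice_store(finalized_state, finalized_block)

    # The anchor is simply trusted: the Store starts with no votes, and its
    # justified and finalised checkpoints both point at the anchor.
    assert len(store.latest_messages) == 0
    assert store.justified_checkpoint == store.finalized_checkpoint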

get_slots_since_genesis

def get_slots_since_genesis(store: Store) -> int:
    return (store.time - store.genesis_time) // SECONDS_PER_SLOT

Self explanatory. This is one of only two places that store.time is used, the other being in the proposer boost logic in the on_block() handler.

Used by get_current_slot()

get_current_slot

def get_current_slot(store: Store) -> Slot:
    return Slot(GENESIS_SLOT + get_slots_since_genesis(store))

Self explanatory. GENESIS_SLOT is usually zero.

Used by get_voting_source(), filter_block_tree(), compute_pulled_up_tip(), on_tick_per_slot(), validate_target_epoch_against_current_time(), validate_on_attestation(), on_tick(), on_block()
Uses get_slots_since_genesis()

compute_slots_since_epoch_start

def compute_slots_since_epoch_start(slot: Slot) -> int:
    return slot - compute_start_slot_at_epoch(compute_epoch_at_slot(slot))

Self explanatory.

Used by on_tick_per_slot()
Uses compute_epoch_at_slot(), compute_start_slot_at_epoch()

get_ancestor

def get_ancestor(store: Store, root: Root, slot: Slot) -> Root:
    block = store.blocks[root]
    if block.slot > slot:
        return get_ancestor(store, block.parent_root, slot)
    return root

Given a block root root, get_ancestor() returns the root of the ancestor block (on the same branch) that was published at slot slot. If there was no block published at slot, then the root of the ancestor most recently published before slot is returned.

This function is sometimes used just to confirm that the block with root root is descended from a particular block at slot slot, and sometimes used actually to retrieve that ancestor block's root.
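
For example (a non-spec sketch; the names other than the spec functions and Store fields are hypothetical), the two usage patterns look like this:

    # 1. Retrieve the root of the head block's ancestor at a particular slot.
    ancestor_root = get_ancestor(store, head_root, target_slot)

    # 2. Confirm descent: a block descends from the finalised checkpoint if and
    #    only if walking back to the checkpoint's slot lands on the checkpoint's
    #    root. This is the check made in filter_block_tree(), covered below.
    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    is_descendant = get_ancestor(store, block_root, finalized_slot) == store.finalized_checkpoint.root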

Uses get_ancestor() (recursively)
Used by get_weight(), filter_block_tree(), validate_on_attestation(), on_block(), get_ancestor() (recursively)

get_weight

def get_weight(store: Store, root: Root) -> Gwei:
    state = store.checkpoint_states[store.justified_checkpoint]
    unslashed_and_active_indices = [
        i for i in get_active_validator_indices(state, get_current_epoch(state))
        if not state.validators[i].slashed
    ]
    attestation_score = Gwei(sum(
        state.validators[i].effective_balance for i in unslashed_and_active_indices
        if (i in store.latest_messages
            and i not in store.equivocating_indices
            and get_ancestor(store, store.latest_messages[i].root, store.blocks[root].slot) == root)
    ))
    if store.proposer_boost_root == Root():
        # Return only attestation score if ``proposer_boost_root`` is not set
        return attestation_score

    # Calculate proposer score if ``proposer_boost_root`` is set
    proposer_score = Gwei(0)
    # Boost is applied if ``root`` is an ancestor of ``proposer_boost_root``
    if get_ancestor(store, store.proposer_boost_root, store.blocks[root].slot) == root:
        committee_weight = get_total_active_balance(state) // SLOTS_PER_EPOCH
        proposer_score = (committee_weight * PROPOSER_SCORE_BOOST) // 100
    return attestation_score + proposer_score

Here we find the essence of the GHOST24 protocol: the weight of a block is the sum of the votes for that block, plus the votes for all of its descendant blocks. We include votes for descendants when calculating a block's weight because a vote for a block is an implicit vote for all of that block's ancestors as well - if a particular block gets included on chain, all its ancestors must also be included. To put it another way, we treat validators as voting for entire branches rather than just their leaves.

Ignoring the proposer boost part for the time being, the main calculation being performed is as follows.

    state = store.checkpoint_states[store.justified_checkpoint]
    unslashed_and_active_indices = [
        i for i in get_active_validator_indices(state, get_current_epoch(state))
        if not state.validators[i].slashed
    ]
    attestation_score = Gwei(sum(
        state.validators[i].effective_balance for i in unslashed_and_active_indices
        if (i in store.latest_messages
            and i not in store.equivocating_indices
            and get_ancestor(store, store.latest_messages[i].root, store.blocks[root].slot) == root)
    ))

We only consider the votes of active and unslashed validators. (Slashed validators might still be in the exit queue and are technically "active", at least according to is_active_validator().) The exclusion of validators that have been slashed in-protocol at the last justified checkpoint was added in the Capella specification to complement the on_attester_slashing() handler. It will additionally exclude validators slashed via proposer slashings, and validators slashed long ago (when the exit queue is long) and for which we have discarded the attester slashing from the Store.

Given a block root, root, this adds up all the votes for blocks that are descended from that block. More precisely, it calculates the sum of the effective balances of all validators whose latest head vote was for a descendant of root or for root itself. It's the fact that we're basing our weight calculations only on each validator's latest vote that makes this "LMD" (latest message driven) GHOST.

Diagram of a block tree with weights and latest attesting balances shown for each block.

$B_N$ is the sum of the effective balances of the validators whose most recent head vote was for block $N$, and $W_N$ is the weight of the branch starting at block $N$.

Some obvious relationships apply between the weights, $W_x$, of blocks, and $B_x$, the latest attesting balances of blocks.

  • For a leaf block $N$ (a block with no children), $W_N = B_N$.
  • The weight of a block is its own latest attesting balance plus the sum of the weights of its direct children. So, in the diagram, $W_1 = B_1 + W_2 + W_3$.

These relationships can be used to avoid repeating lots of work by memoising the results.
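
As a minimal sketch of that idea (not part of the spec), a client could compute every block's weight in a single bottom-up pass over the tree, rather than re-walking the votes for each block as the literal get_weight() does. The Block class and the input maps here are illustrative stand-ins.

    from dataclasses import dataclass
    from typing import Dict

    @dataclass
    class Block:                 # minimal stand-in for BeaconBlock
        slot: int
        parent_root: bytes

    def compute_all_weights(blocks: Dict[bytes, Block],
                            latest_attesting_balance: Dict[bytes, int]) -> Dict[bytes, int]:
        """Return W_N for every block, where W_N = B_N plus the weights of N's children."""
        weights = {root: latest_attesting_balance.get(root, 0) for root in blocks}
        # A child's slot is always greater than its parent's, so visiting blocks in
        # descending slot order folds each completed subtree weight into its parent.
        for root in sorted(blocks, key=lambda r: blocks[r].slot, reverse=True):
            parent = blocks[root].parent_root
            if parent in weights:
                weights[parent] += weights[root]
        return weights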

Proposer boost

In September 2020, shortly before mainnet genesis, a theoretical "balancing attack" on the LMD GHOST consensus mechanism was published, with an accompanying Ethresear.ch post.

The balancing attack allows a very small number of validators controlled by an adversary to perpetually maintain a forked network, with half of all validators following one fork and half the other. This would delay finalisation indefinitely, which is a kind of liveness failure. Since the attack relies on some unrealistic assumptions about the power an adversary has over the network – namely, fine-grained control over who can see what and when – we felt that the potential attack was not a significant threat to the launch of the beacon chain. Later refinements to the attack appear to have made it more practical to execute, however.

A modification to the fork choice to mitigate the balancing attack was first suggested by Vitalik. This became known as proposer boost, and a version of it was adopted into the consensus layer specification in late 2021 with the various client teams releasing versions with mainnet support for proposer boost in April and May 2022.

Changes to the fork choice can be made outside major protocol upgrades; it is not strictly necessary for all client implementations to make the change simultaneously, as they must for hard-fork upgrades. Given this, mainnet client releases supporting proposer boost were made at various times in April and May 2022, and users were not forced to upgrade on a fixed schedule. Unfortunately, having a mix of nodes on the network, around half applying proposer boost and half not, led to a seven block reorganisation of the beacon chain on May 25, 2022. As a result, subsequent updates to the fork choice have tended to be more tightly coordinated between client teams.

Proposer boost details

Proposer boost modifies our nice, intuitive calculation of a branch's weight, based only on latest votes, by adding additional weight to a block that was received on time in the current slot. In this way, it introduces a kind of synchrony weighting. Vitalik calls this "an explicit 'synchronization bottleneck' gadget". In short, it treats a timely block as being a vote with a massive weight that is temporarily added to the branch that it is extending.

The simple intuition behind proposer boost is summarised by Barnabé Monnot as, "a block that is timely shouldn’t expect to be re-orged". In respect of the balancing attack, proposer boost is designed to overwhelm the votes from validators controlled by the adversary and instead allow the proposer of the timely block to choose the fork that will win. Quoting Francesco D'Amato, "the general strategy is to empower honest proposers to impose their view of the fork-choice, but without giving them too much power and making committees irrelevant".

The default setting for store.proposer_boost_root is Root(). That is, the "empty" or "null" default SSZ root value, with all bytes set to zero. Whenever a block is received during the first 1 / INTERVALS_PER_SLOT portion of a slot – that is, when the block is timely – store.proposer_boost_root is set to the hash tree root of that block by the on_block() handler. At the end of each slot it is reset to Root() by the on_tick() handler. Thus, proposer boost has an effect on the fork choice calculation from the point at which a timely block is received until the end of that slot, where "timely" on Ethereum's beacon chain means "within the first four seconds".
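
In code terms, the timeliness test looks roughly like the following (a sketch of the check made in the on_block() handler, which is covered later; this is not a verbatim quote of the spec):

    # A block is "timely" if it arrives in the first SECONDS_PER_SLOT //
    # INTERVALS_PER_SLOT seconds of its own slot (12 // 3 = 4 seconds on mainnet).
    time_into_slot = (store.time - store.genesis_time) % SECONDS_PER_SLOT
    is_timely = (
        get_current_slot(store) == block.slot
        and time_into_slot < SECONDS_PER_SLOT // INTERVALS_PER_SLOT
    )
    if is_timely:
        store.proposer_boost_root = hash_tree_root(block)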

Proposer boost causes entire branches to be favoured when the block at their tip is timely. When proposer boost is in effect, and the timely block in the current slot (which has root, store.proposer_boost_root) is descended from the block we are calculating the weight for, then that block's weight is also increased, since the calculation includes the weights of all its descendants. In this way, proposer boost weighting propagates to the boosted block's ancestors in the same way as vote weights do.

The weight that proposer boost adds to the block's branch is a percentage PROPOSER_SCORE_BOOST of the total effective balance due to attest at that slot. More precisely, it is an approximation to that slot's committee weight, derived by dividing the total effective balance of all active validators by the number of slots per epoch.

The value of PROPOSER_SCORE_BOOST has changed over time before settling at its current 40%. See the description there for the history, and links to how the current value was calculated.

Proposer boost and late blocks

A side-effect of proposer boost is that it enables clients to reliably re-org out (orphan) blocks that were published late. Instead of building on a late block, the proposer can choose to build on the late block's parent.

A block proposer is supposed to publish its block at the start of the slot, so that it has time to be received and attested to by the whole committee within the first four seconds. However, post-merge, it can be profitable to delay block proposals by several seconds in order to collect more transaction income and better extractable value opportunities. Although blocks published five or six seconds into a slot will not gain many votes, they are still likely to remain canonical under the basic consensus spec. As long as the next block proposer receives the late block by the end of the slot, it will usually build on it as the best available head.25 This is undesirable as it punishes the vast majority of honest validators, who (correctly) voted for an empty slot, by depriving them of their reward for correct head votes, and possibly even penalising them for incorrect target votes at the start of an epoch.

Without proposer boost, it is a losing strategy for the next proposer not to build on a block that it received late. Although the late block may have few votes, it initially has more votes than the proposer's alternative block, so validators will still attest to the late block as the head of the chain, keeping it canonical and orphaning the alternative block built on its parent.

With proposer boost, as long as the late block has fewer votes than the proposer boost percentage, the honest proposer can be confident that its alternative block will win the fork choice for long enough that the next proposer will build on that rather than on the late block it skipped.

Diagram showing a proposer choosing whether to build on a late block or its parent.

Block $B$ was published late, well after the 4 second attestation cut-off time. However, it still managed to acquire a few attestations (say, 10% of the committee) due to dishonest or misconfigured validators. Should the next proposer build $C_1$ on top of the late block, or $C_2$ on top of its parent?

Diagram showing that without proposer score boosting a proposer should build on the late block.

Without proposer boost, it only makes sense to build $C_1$, on top of the late block $B$. Since $B$ has some weight, albeit small, the top branch will win the fork choice (if the network is behaving synchronously at the time). Block $C_2$ would be orphaned.

Diagram showing that with proposer score boosting a proposer may build on the late block's parent.

With proposer boost, the proposer of $C$ can safely publish either $C_1$ or $C_2$. Due to the proposer score boost of 40%, it is safe to publish block $C_2$ that orphans $B$, since the lower branch will have greater weight during the slot.
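
To put rough numbers on the scenario in the diagrams (purely illustrative, assuming each slot's committee carries 1/32 of the total effective balance, normalised here to 100 units):

    committee_weight = 100                             # one slot's attesting weight (normalised)
    late_block_votes = committee_weight * 10 // 100    # B attracted 10% of its committee
    proposer_boost = committee_weight * 40 // 100      # PROPOSER_SCORE_BOOST = 40%

    # During C's slot, before any new attestations are counted, the branch through
    # C_2 (which omits B) carries the boost and outweighs the branch through B.
    assert proposer_boost > late_block_votes           # 40 > 10: safe to orphan B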

An implementation of this strategy in the Lighthouse client seems to have been effective in reducing the number of late blocks on the network. Publishing of late blocks is strongly disincentivised when they are likely to be orphaned. It may be adopted as standard behaviour in the consensus specs at some point, but remains optional for the time-being. Several safe-guards are present in order to avoid liveness failures.

Note that proposer boost does not in general allow validators to re-org out timely blocks (that is, an ex-post reorg). A timely block ought to gain enough votes from the committees that it will always remain canonical.

Alternatives to proposer boost

Proposer boost is not a perfect solution to balancing attacks or ex-ante reorgs. It makes ex-post reorgs easier to accomplish; it does not scale with participation, meaning that if only 40% of validators are online, then proposers can reorg at will; it can fail when an attacker controls several consecutive slots over which to store up votes.

Some changes to, or replacements for, LMD GHOST have been suggested that do not require proposer score boosting.

View-merge26 is a mechanism in which attesters freeze their fork choice some time $\Delta$ before the end of a slot. The next proposer does not freeze its fork choice, however. The assumed maximum network delay is $\Delta$, so the proposer will see all votes in time, and it will circulate a summary of them to all validators, contained within its block. This allows the whole network to synchronise on a common view. Balancing attacks rely on giving two halves of the network different views, and would be prevented by view-merge.

The Goldfish protocol, described in the paper No More Attacks on Proof-of-Stake Ethereum?, builds on view-merge (called "message buffering" there) and adds vote expiry so that head block votes expire almost immediately (hence the name - rightly or wrongly, goldfish are famed for their short memories). The resulting protocol is provably reorg resilient and supports fast confirmations.

Both view-merge and Goldfish come with nice proofs of their properties under synchronous conditions, which improve on Gasper under the same conditions. However, they may not fare so well under more realistic asynchronous conditions. The original view-merge article says of latency greater than 2 seconds, "This is bad". One of the authors of the Goldfish paper has said that Goldfish "is extremely brittle to asynchrony, allowing for catastrophic failures such as arbitrarily long reorgs"27, and elsewhere, "even a single slot of asynchrony can lead to a catastrophic failure, jeopardizing the safety of any previously confirmed block". At least with proposer boost, we know that it only degrades to normal Gasper under conditions of high latency.

Francesco D'Amato argues in Reorg resilience and security in post-SSF LMD-GHOST that the real origin of the reorg issues with LMD GHOST is our current committee-based voting: "The crux of the issue is that honest majority of the committee of a slot does not equal a majority of the eligible fork-choice weight", since an adversary is able to influence the fork choice with votes from other slots. The ultimate cure for this would be single slot finality (SSF), in which all validators vote at every slot. SSF is a long way from being practical today, but a candidate for its fork choice is RLMD-GHOST (Recent Latest Message Driven GHOST), which expires votes after a configurable time period.

Used by get_head()
Uses get_active_validator_indices(), get_ancestor(), get_total_active_balance()
See also on_tick(), on_block(), PROPOSER_SCORE_BOOST

get_voting_source

def get_voting_source(store: Store, block_root: Root) -> Checkpoint:
    """
    Compute the voting source checkpoint in event that block with root ``block_root`` is the head block
    """
    block = store.blocks[block_root]
    current_epoch = compute_epoch_at_slot(get_current_slot(store))
    block_epoch = compute_epoch_at_slot(block.slot)
    if current_epoch > block_epoch:
        # The block is from a prior epoch, the voting source will be pulled-up
        return store.unrealized_justifications[block_root]
    else:
        # The block is not from a prior epoch, therefore the voting source is not pulled up
        head_state = store.block_states[block_root]
        return head_state.current_justified_checkpoint

If the given block (which is a leaf block in the Store's block tree) is from a prior epoch, then return its unrealised justification. Otherwise return the justified checkpoint from its post-state (its realised justification).

Returning the unrealised justification is called "pulling up" the block (or "pulling the tip of a branch") as it is equivalent to running the end-of-epoch state transition accounting on the block's post-state: the block is notionally pulled up from its actual slot to the first slot of the next epoch.

The Casper FFG source vote is the checkpoint that a validator believes is the highest justified at the time of the vote. As such, this function returns the source checkpoint that validators with this block as head will use when casting a Casper FFG vote in the current epoch. This has an important role in filter_block_tree() and is used in the formal proof of non-self-slashability.

Used by filter_block_tree()
Uses compute_epoch_at_slot()

filter_block_tree

Note: External calls to filter_block_tree (i.e., any calls that are not made by the recursive logic in this function) MUST set block_root to store.justified_checkpoint.

The only external call to filter_block_tree() comes from get_filtered_block_tree(), which uses store.justified_checkpoint.root. So we're all good. This is a requirement of Hybrid LMD GHOST - it enforces Casper FFG's fork choice rule.

def filter_block_tree(store: Store, block_root: Root, blocks: Dict[Root, BeaconBlock]) -> bool:
    block = store.blocks[block_root]
    children = [
        root for root in store.blocks.keys()
        if store.blocks[root].parent_root == block_root
    ]

    # If any children branches contain expected finalized/justified checkpoints,
    # add to filtered block-tree and signal viability to parent.
    if any(children):
        filter_block_tree_result = [filter_block_tree(store, child, blocks) for child in children]
        if any(filter_block_tree_result):
            blocks[block_root] = block
            return True
        return False

    current_epoch = compute_epoch_at_slot(get_current_slot(store))
    voting_source = get_voting_source(store, block_root)

    # The voting source should be at the same height as the store's justified checkpoint
    correct_justified = (
        store.justified_checkpoint.epoch == GENESIS_EPOCH
        or voting_source.epoch == store.justified_checkpoint.epoch
    )

    # If the previous epoch is justified, the block should be pulled-up. In this case, check that unrealized
    # justification is higher than the store and that the voting source is not more than two epochs ago
    if not correct_justified and is_previous_epoch_justified(store):
        correct_justified = (
            store.unrealized_justifications[block_root].epoch >= store.justified_checkpoint.epoch and
            voting_source.epoch + 2 >= current_epoch
        )

    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    correct_finalized = (
        store.finalized_checkpoint.epoch == GENESIS_EPOCH
        or store.finalized_checkpoint.root == get_ancestor(store, block_root, finalized_slot)
    )
    # If expected finalized/justified, add to viable block-tree and signal viability to parent.
    if correct_justified and correct_finalized:
        blocks[block_root] = block
        return True

    # Otherwise, branch not viable
    return False

The filter_block_tree() function is at the heart of how LMD GHOST and Casper FFG are bolted together.

The basic structure is fairly simple. Given a block, filter_block_tree() recursively walks the Store's block tree visiting the block's descendants in depth-first fashion. When it arrives at a leaf block (the tip of a branch), if the leaf block is "viable" as head then it and all its ancestors (the whole branch) will be added to the blocks list, otherwise the branch will be ignored.

In other words, the algorithm prunes out branches that terminate in an unviable head block, and keeps branches that terminate in a viable head block.

A diagram showing the pruning of nonviable branches.

Block $J$ is the Store's justified checkpoint. There are four candidate head blocks descended from it. Two are viable ($V$), and two are nonviable ($NV$). Blocks in branches terminating at viable heads are returned by the filter; blocks in branches terminating at nonviable heads are filtered out.

Viability

What dictates whether a leaf block is viable or not?

Pre-Capella, there was a fairly straightforward requirement for a leaf block to be a viable head block: viable head blocks had a post-state that agreed with the Store about the justified and finalised checkpoints. This was encapsulated in the following code from the Bellatrix spec,

    correct_justified = (
        store.justified_checkpoint.epoch == GENESIS_EPOCH
        or head_state.current_justified_checkpoint == store.justified_checkpoint
    )
    correct_finalized = (
        store.finalized_checkpoint.epoch == GENESIS_EPOCH
        or head_state.finalized_checkpoint == store.finalized_checkpoint
    )
    # If expected finalized/justified, add to viable block-tree and signal viability to parent.
    if correct_justified and correct_finalized:
        blocks[block_root] = block
        return True

The code we have in the Capella update is considerably more complex and less intuitive. But before we get to that we need to take a step back and discuss why we should filter the block tree at all.

Why prune unviable branches?

Filtering the block tree like this ensures that the Casper FFG fork choice rule, "follow the chain containing the justified checkpoint of the greatest height", is applied to the block tree before the LMD GHOST fork choice is evaluated.

Very early versions of the spec considered the tip of any branch descended from the Store's justified checkpoint as a potential head block. However, a scenario was identified in which this could result in a deadlock, in which finality would not be able to advance without validators getting themselves slashed - a kind of liveness failure28.

The filter_block_tree() function was added as a fix for this issue. Given a Store and a block root, filter_block_tree() returns the list of all the blocks that we know about in the tree descending from the given block, having pruned out any branches that terminate in a leaf block that is not viable in some sense.

To illustrate the problem, consider the situation shown in the following diagrams, based on the original description of the issue. The context is that there is an adversary controlling 18% of validators that takes advantage of (or causes) a temporary network partition. We will illustrate the issue mostly in terms of checkpoints, and omit the intermediate blocks that carry the attestations - you can mentally insert these as necessary.

We begin with a justified checkpoint $A$ that all nodes agree on.

Due to the network partition, only 49% of validators, plus the adversary's 18%, see checkpoint $B$. They all make Casper FFG votes $[A \rightarrow B]$, thereby justifying $B$. A further checkpoint $C_1$ is produced on this branch, and the 49% that are honest validators dutifully make the Casper FFG vote $[B \rightarrow C_1]$, but the adversary does not, meaning that $C_1$ is not justified. Validators on this branch see $h_1$ as the head block, and have a highest justified checkpoint of $B$.

A diagram illustrating the first step in a liveness attack on the unfiltered chain, making the first branch.

The large blocks represent checkpoints. After checkpoint $A$ there is a network partition: 49% of validators plus the adversary see checkpoints $B$ and $C_1$. Casper votes are shown by the dashed arrows. The adversary votes for $B$, but not for $C_1$.

The remaining 33% of validators do not see checkpoint $B$, but see $C_2$ instead and make Casper FFG votes $[A \rightarrow C_2]$ for it. But this is not enough votes to justify $C_2$. Checkpoint $D_2$ is produced on top of $C_2$, and a further block $h_2$. On this branch, $h_2$ is the head of the chain according to LMD GHOST, and $A$ remains the highest justified checkpoint.

A diagram illustrating the second step in a liveness attack on the unfiltered chain, making the second branch.

Meanwhile, the remaining 33% of validators do not see the branch starting at $B$, but start a new branch containing $C_2$ and its descendants. They do not have enough collective weight to justify any of the checkpoints.

Now for the cunning part. The adversary switches its LMD GHOST vote (and implicitly its Casper FFG vote, although that does not matter for this exercise) from the first branch to the second branch, and lets the validators in the first branch see the blocks and votes on the second branch.

Block $h_2$ now has votes from the majority of validators – 33% plus the adversary's 18% – so all honest validators should make it their head block.

However, the justified checkpoint on the $h_2$ branch remains at $A$. This means that the 49% of validators who made Casper FFG vote $[B \rightarrow C_1]$ cannot switch their chain head from $h_1$ to $h_2$ without committing a Casper FFG surround vote, and thereby getting slashed. Switching branch would cause their highest justified checkpoint to go backwards. Since they have previously voted $[B \rightarrow C_1]$, they cannot now vote $[A \rightarrow X]$ where $X$ has a height greater than $C_1$, which they must do if they were to switch to the $h_2$ branch.

A diagram illustrating the third step in a liveness attack on the unfiltered chain, changing the chain head.

The adversary switches to the second branch, giving $h_2$ the majority LMD GHOST vote. This deadlocks finalisation: the 49% who made Casper FFG vote $[B \rightarrow C_1]$ cannot switch to $h_2$ without being slashed.

In conclusion, the chain can no longer finalise (by creating higher justified checkpoints) without a substantial proportion of validators (at least 16%) being willing to get themselves slashed.

It should never be possible for the chain to get into a situation in which honest validators, following the rules of the protocol, end up in danger of being slashed. The situation here arises due to a conflict between the Casper FFG fork choice (follow the chain containing the justified checkpoint of the greatest height) and the LMD GHOST fork choice (which, in this instance, ignores that rule). It is a symptom of the clunky way in which the two have been bolted together.

The chosen fix for all this is to filter the block tree before applying the LMD GHOST fork choice, so as to remove all "unviable" branches from consideration. That is, all branches whose head block's state does not agree with me about the current state of justification and finalisation.

A diagram showing that filter block tree prunes out the conflicting branch for validators following the first branch.

When validators that followed branch 1 apply filter_block_tree(), branch 2 is pruned out (as indicated by the dashed lines). This is because their Store has $B$ as the best justified checkpoint, while branch 2's leaf block has a state with $A$ as the justified checkpoint. For these validators $h_2$ is no longer a candidate head block.

With this fix, the chain will recover the ability to finalise when the validators on the second branch eventually become aware of the first branch. On seeing $h_1$ and its ancestors, they will update their Stores' justified checkpoints to $B$ and mark the $h_2$ branch unviable.

Unrealised justification

A major feature of the Capella update to the fork choice specification is the logic for handling "unrealised justification" when filtering the block tree.

Several issues had arisen in the former fork choice spec. First, an unrealised justification reorg attack that allowed the proposer of the first block of an epoch to easily fork out up to nine blocks from the end of the previous epoch. A variant of that attack was also found to be able to cause validators to make slashable attestations - the very issue the filter is intended to prevent. Second, a justification withholding attack that an adversary could use to reorg arbitrary numbers of blocks at the start of an epoch.

The root issue is that, within the consensus layer's state transition, the calculations that update justification and finality are done only at epoch boundaries. An adversary had a couple of ways they could use this to filter out competing branches within filter_block_tree(). Essentially, in not accounting for unrealised justifications, the filtering was being applied too aggressively.

To be clear, both of the attacks described here apply to the old version of filter_block_tree() and have been remedied in the current release. This is the old, much simpler, code for evaluating correct_justified and correct_finalized,

    correct_justified = (
        store.justified_checkpoint.epoch == GENESIS_EPOCH
        or head_state.current_justified_checkpoint == store.justified_checkpoint
    )
    correct_finalized = (
        store.finalized_checkpoint.epoch == GENESIS_EPOCH
        or head_state.finalized_checkpoint == store.finalized_checkpoint
    )

This meant that the tip of a branch was included for consideration if (a) the justified checkpoint in its post-state matched that in the store, and (b) the finalised checkpoint in its post-state matched that in the store. These nice simple criteria have been changed to the mess we have today, which we'll look at in a moment. But first, let's see what was wrong with the old criteria.

Unrealised justification reorg

The unrealised justification reorg allowed an adversary assigned to propose a block in the first slot of an epoch to reorg out a chain of up to nine blocks at the end of the previous epoch.

The key to this is the idea of unrealised justification. Towards the end of an epoch (within the last third of an epoch, that is, the last nine slots), the beacon chain might have gathered enough Casper FFG votes to justify the checkpoint at the start of that epoch. However, justification and finalisation calculations take place only at epoch boundaries, so the achieved justification is "unrealised": until the end of the epoch, all the blocks will continue to have a post-state justification that points to an earlier checkpoint.

A diagram showing the setup for an unrealised justification reorg scenario.

The solid vertical lines are epoch boundaries, and the squares $C_1$ and $C_2$ are their checkpoints. A block's $J$ value shows the justified checkpoint in its post-state. Its $U$ value is the hypothetical unrealised justification. During an epoch, the chain may gather enough Casper FFG votes to justify a new checkpoint, but justification in the beacon state happens only at epoch boundaries, so it is unrealised in the interim. Block $Y$ is clearly the head block.

When the adversary is the proposer in the first slot of an epoch, it could have used the unrealised justification in the previous epoch to fork out the last blocks of that epoch - up to around nine of them, depending on the FFG votes the adversary's block contains. By building a competing head block the adversary could trick filter_block_tree() into filtering out the previous head branch from consideration.

A diagram showing how the adversary executes the unrealised justification reorg.

The adversary adds a block $Z$ in the first slot of the next epoch. It builds on $W$, which has unrealised justification. At the epoch boundary, the state's justified checkpoint is calculated, so $W$'s post-state has $C_2$. In the former fork choice, only branches with tips that agreed with the Store about the justified checkpoint could be considered. On that basis, the branch ending in $Y$ would have been excluded by the filter, making $Z$ the head, even though it might have zero LMD GHOST support. Blocks $X$ and $Y$ would be orphaned (reorged out).

Unrealised justification deadlock

It also became apparent that the pre-Capella formulation of filter_block_tree() did not fully prevent the possibility of deadlock - the very thing that filtering the block tree is intended to prevent. Deadlock is a situation in which honest validators are forced to choose between making a slashable attestation or not voting at all.

The setup and attack is described in Aditya Asgaonkar's document, and is a variant of the reorg above. The original deadlock attack relied on the network being partitioned, so that validators had split views. This newer deadlock attack did not need a network partition, but under some fairly specific unrealised justification conditions, the adversary could make the justified checkpoint go backwards. Doing this after some honest validators had already used the higher checkpoint as their Casper FFG source vote forced them subsequently to either make a surround vote, or not to vote.

Justification withholding attack

The justification withholding attack was similar to the unrealised justification reorg in that it involved using unrealised justification to get filter_block_tree() to exclude other branches besides the adversary's.

In this attack, the adversary needs to have several proposals in a row at the end of an epoch - enough that, if the adversary does not publish the blocks, the epoch does not get justified by the time it finishes. That is, without the adversary's blocks there is no unrealised justification; with the adversary's blocks there would be.

A diagram showing how an adversary sets up a justification withholding attack.

The adversary has a string of proposals at the end of an epoch. These blocks contain enough FFG votes to justify the epoch's checkpoint $C_2$, but the adversary withholds them for now.

The remainder of the chain is unaware of the adversary's blocks, so continues to build as if they were skipped slots.

A diagram showing the chain progressing after the setup of the justification withholding attack, but before its execution.

The remainder of the validators continue to build blocks $A$ and $B$ at the start of the next epoch. Without the adversary's blocks, checkpoint $C_2$ is not justified, so $A$ and $B$ have $C_1$ as justified in their post-states (their unrealised justification is irrelevant here). Block $B$ is the chain's head.

The adversary has a block proposal at some point in epoch 3 - it does not matter when.

A diagram showing the execution of the justification withholding attack.

When the adversary publishes block $Z$ in epoch 3, it releases its withheld blocks at the same time. Block $Z$ has a post-state justified checkpoint of $C_2$ (updated at the epoch boundary). Under the old filter_block_tree() that would have excluded $B$ from consideration as head, and the adversary's block $Z$ would have become head, even with no LMD support.

Viable and unviable branches

The block tree filtering is done by checking that blocks at the tips of branches have the "correct" justification and finalisation in some sense. Both the correct_justified and correct_finalized flags must be true for the branch to be considered viable.

correct_justified

The newer, more complex, correct_justified evaluation from the Capella update is as follows.

    current_epoch = compute_epoch_at_slot(get_current_slot(store))
    voting_source = get_voting_source(store, block_root)

    # The voting source should be at the same height as the store's justified checkpoint
    correct_justified = (
        store.justified_checkpoint.epoch == GENESIS_EPOCH
        or voting_source.epoch == store.justified_checkpoint.epoch # A
    )

    # If the previous epoch is justified, the block should be pulled-up. In this case, check that unrealized
    # justification is higher than the store and that the voting source is not more than two epochs ago
    if not correct_justified and is_previous_epoch_justified(store): # B
        correct_justified = (
            store.unrealized_justifications[block_root].epoch >= store.justified_checkpoint.epoch and # C
            voting_source.epoch + 2 >= current_epoch # D
        )

I'm not going to pretend that I understand this fully - it seems very far from intuitive, and its correctness far from obvious29. For now I will quote some explanation that Aditya shared with me directly.

The correct_justified condition ensures that: (a) we pick a head block that has a "good" justification in its chain, and (b) that validators can vote for the chosen head block without the risk of creating slashable messages.

I describe (a) here informally, but you can think of the fork choice as a heuristic to choose where to vote so that we advance our finalized checkpoint as fast as possible. Generally, this means we vote on the highest justified checkpoint we know of, but the "that we know of" part is tricky because of the nuances of unrealized justifications and the associated reorg attacks.

Now, if you ignore unrealized justification and the reorg attacks, this first appearance of correct_justified [Line A] is sufficient to address (a) & (b). Then, to fix the reorg attacks, we add an additional version of correct_justified for pulled-up blocks, where the first line [line C] addresses (a) and the second line [line D] addresses (b).

Recall that the chief goal of filtering the block tree is to avoid honest validators being forced to make surround votes in Casper FFG, as these are slashable. However, earlier remedies for this were too eager in filtering out candidate heads, and an adversary could use unrealised justification to force reorgs, and even the kind of self-slashing we want to avoid.

The current fork choice apparatus preserves two key properties30.

  1. The voting_source.epoch of a candidate head block is always less than or equal to the store.justified_checkpoint.epoch.
    • This is because on_block() always calls update_checkpoints(), and because of the way that the get_voting_source() function is constructed.
    • The voting_source.epoch is the Casper FFG source vote that validators with that block as head will use when making a Casper FFG vote in the current epoch.
  2. store.justified_checkpoint.epoch never decreases (it is monotonically increasing).
    • It is only ever written to in update_checkpoints(), and it is easy to see from there that this is true.

With respect to the line I've marked A in the source code, if the voting source matches the Store's justified checkpoint, then all is good, we have no reason not to consider the block as head, and we can short-circuit the rest of the logic. By property 2, the Store's justified checkpoint never decreases, so a validator's Casper FFG vote based on this cannot ever be a surround vote. (Since the target vote strictly increases, it cannot be a surrounded vote either.)

A diagram showing that it is always safe to vote when the voting source is the same as the store's justified checkpoint.

When the voting source is the same as the store's justified checkpoint, it is always safe to vote. The store's justified checkpoint never decreases, so we cannot commit a surround vote.

If the block is not a viable head by the first criterion, it might still be a viable head by the second (line D). Recall that the reorg attacks above rely on the adversary taking advantage of unrealised justification to update the Store's justified checkpoint, leaving the previous head of the chain "stale" with respect to its realised justification, although based on its unrealised justification it would still be viable. To avoid this, we wish to safely include as many viable heads as possible.

We know that any Casper FFG vote we make at this point will have voting_source.epoch strictly less than store.justified_checkpoint.epoch, by property 1, and since we already dealt with the equality case.

Line D in the source code says that it is safe to allow votes where voting_source.epoch == current_epoch - 2 or voting_source.epoch == current_epoch - 1. Any honest vote's target epoch will be current_epoch, so the gap between source and target is at most two epochs, which is not enough to surround a previous vote. That is, a vote $[s_2 \rightarrow t_2]$ that surrounds a previous vote $[s_1 \rightarrow t_1]$ requires that $s_2 < s_1 < t_1 < t_2$. This is not possible if $t_2 - s_2 \le 2$. This exception is required for the analysis of the safe block confirmation rule, and is discussed in section 3.3 of the Confirmation Rule paper.

A diagram showing that it is safe to vote from two epochs ago.

When the voting source is within two epochs of the current epoch then it is safe to vote as a surround vote must encompass at least two intervening checkpoints.
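
As a sanity check on that argument (a toy calculation, not spec code), we can enumerate the votes that could possibly be surrounded when the source is at most two epochs behind the target:

    def surrounds(s2: int, t2: int, s1: int, t1: int) -> bool:
        """True if a vote with source s2 and target t2 surrounds one with source s1 and target t1."""
        return s2 < s1 and t1 < t2

    current_epoch = 100
    for source in (current_epoch - 1, current_epoch - 2):   # the sources permitted by line D
        target = current_epoch                               # an honest target is the current epoch
        # A surrounded vote would need source < s1 < t1 < target, that is, at least
        # two epochs strictly between source and target; with a gap of at most two
        # there is no room, so no such pair (s1, t1) exists.
        candidates = [(s1, t1) for s1 in range(source + 1, target)
                      for t1 in range(s1 + 1, target)]
        assert candidates == []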

Both line B (the condition on justification of the previous epoch) and line C seem to be unnecessary, and may be removed in future. See sections 3.3 and 3.4 of the (draft) A Confirmation Rule for the Ethereum Consensus Protocol paper for discussion of this. Note, though, that the proof of the honest viable head property described below relies on the weaker condition that store.justified_checkpoint.epoch >= current_epoch - 2 (that either the previous or previous but one epoch is justified).

correct_finalized

The Capella update's change to correct_finalized is more limited, and more intuitive.

    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    correct_finalized = (
        store.finalized_checkpoint.epoch == GENESIS_EPOCH
        or store.finalized_checkpoint.root == get_ancestor(store, block_root, finalized_slot)
    )

This simply ensures that a viable block is descended from the Store's finalised checkpoint.

The previous version made use of the post-state of the block being considered for viability, which caused it to suffer from similar complications to correct_justified with respect to unrealised finality. We don't actually care about the block's post-state view of finalisation, since finality is a global property: as long as the block is descended from the finalised checkpoint it should be eligible for consideration to become viable.

The correct_finalized check might appear redundant at first sight since we are always filtering based on a tree rooted at the last justified checkpoint, which (in the absence of a mass slashing) must be descended from the last finalised checkpoint. The check, however, guarantees the further condition that a node's own view of the finalised checkpoint is irreversible, even if there were to be a mass slashing - this is the irreversible local finality formal property in the next section. Maintaining this invariant is a huge help in building efficient clients - we are free to forget about everything prior to the last finalised checkpoint.

Formal proofs

As referenced in the Ethereum Foundation's disclosure on the Capella fork choice spec updates, some properties of the new fork choice have been formally verified by Roberto Saltini and his team at Consensys.

This formal verification process involves selecting some properties that we wish to be true for the fork choice, and manually constructing proofs that they are always preserved by the specification. It is a much more robust and rigorous process than manual testing, fuzz testing, or general hand-waving that have hitherto been the main approaches.

Four properties were proved in the documents.

  • Non-Self-Slashability
  • Honest Viable Head
    • Assuming a synchronous network, current_epoch - 2 being justified, and honest nodes having received enough attestations to justify the highest possible justified checkpoint, any block proposed by an honest node during the current epoch that is a descendant of the highest possible justified checkpoint is included in the output of get_filtered_block_tree().
  • Deadlock Freedom
    • A distributed system running the Ethereum protocol can never end up in a state in which it is impossible to finalise a new epoch.
  • Irreversible Local Finality
    • A block, once finalized in the local view of an honest validator, is never reverted from the canonical chain in the local view.

Given the complexity of reasoning about the fork choice, and its rather chequered history, it is hugely reassuring now to have these proofs of correctness31.

Conclusion

This has been a long section on a short function. As I said at the start of the section, filter_block_tree() is at the heart of how LMD GHOST and Casper FFG are bolted together, and I think it has surprised everybody how many complexities lurk here.

As an exercise for the reader, we can imagine life without having to filter the block tree. Potuz has documented some thoughts on this in, Fork choice without on-state FFG filtering.

Ultimately, as with proposer boost, the complexities around the Gasper fork choice largely arise from our slot-based voting, with votes accumulated gradually through an epoch. This results in unrealised justification and the rest of it. The long-term fix is also probably the same: moving to single slot finality.

Used by get_filtered_block_tree(), filter_block_tree() (recursively)
Uses filter_block_tree() (recursively), compute_epoch_at_slot(), get_voting_source(), is_previous_epoch_justified(), compute_start_slot_at_epoch(), get_ancestor()

get_filtered_block_tree

def get_filtered_block_tree(store: Store) -> Dict[Root, BeaconBlock]:
    """
    Retrieve a filtered block tree from ``store``, only returning branches
    whose leaf state's justified/finalized info agrees with that in ``store``.
    """
    base = store.justified_checkpoint.root
    blocks: Dict[Root, BeaconBlock] = {}
    filter_block_tree(store, base, blocks)
    return blocks

A convenience wrapper that passes the Store's justified checkpoint to filter_block_tree(). On returning, the blocks dictionary structure will contain the blocks from all viable branches rooted at that checkpoint, and nothing that does not descend from that checkpoint. For the meaning of "viable", see above.

Used by get_head()
Uses filter_block_tree()

get_head

def get_head(store: Store) -> Root:
    # Get filtered block tree that only includes viable branches
    blocks = get_filtered_block_tree(store)
    # Execute the LMD-GHOST fork choice
    head = store.justified_checkpoint.root
    while True:
        children = [
            root for root in blocks.keys()
            if blocks[root].parent_root == head
        ]
        if len(children) == 0:
            return head
        # Sort by latest attesting balance with ties broken lexicographically
        # Ties broken by favoring block with lexicographically higher root
        head = max(children, key=lambda root: (get_weight(store, root), root))

get_head() encapsulates the fork choice rule: given a Store it returns a head block.

The fork choice rule is objective in that, given the same Store, it will always return the same head block. But the overall process is subjective in that each node on the network will tend to have a different view, that is, a different Store, due to delays in receiving attestations or blocks, or having seen different sets of attestations or blocks because of network asynchrony or an attack.

Looking first at the while True loop, this implements LMD GHOST in its purest form. Starting from a given block (which would be the genesis block in unmodified LMD GHOST), we find the weights of the children of that block. We choose the child block with the largest weight and repeat the process until we end up at a leaf block (the tip of a branch). That is, we Greedily take the Heaviest Observed Sub-Tree, GHOST. Any tie between two child blocks with the same weight is broken by comparing their block hashes, so we end up at a unique leaf block - the head that we return.
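
As a small illustration of the tie-break (toy values, not spec code), Python compares the (weight, root) tuples element by element, so when weights are equal the lexicographically higher root wins:

    weights = {b"\x0a" * 32: 64, b"\x0b" * 32: 64}   # root -> weight, both equal
    head = max(weights, key=lambda root: (weights[root], root))
    assert head == b"\x0b" * 32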

Diagram of a block tree showing the weight of each block.

get_head() starts from the root block, $A$, of a block tree. The numbers show each block's weight, which is its latest attesting balance - the sum of the effective balances of the validators that cast their latest vote for that block. Proposer boost can temporarily increase the latest block's score (not shown).

Diagram of a block tree showing the weight of each block and the weight of each subtree.

The get_weight() function when applied to a block returns the total weight of the subtree of the block and all its descendants. These weights are shown on the lines between child and parent blocks.

Diagram of a block tree showing the branch chosen by the GHOST rule.

Given a block, the loop in get_head() considers its children and selects the one that roots the subtree with the highest weight. It repeats the process with the heaviest child block32 until it reaches a block with no children. In this example, it would select the branch $A \leftarrow C \leftarrow E$, returning $E$ as the head block.

Hybrid LMD GHOST

What we've just described is the pure LMD GHOST algorithm. Starting from the genesis block, it walks the entire block tree, taking the heaviest branch at each fork until it reaches a leaf block.

What is implemented in get_head() however, is a modified form of this that the Gasper paper33 refers to as "hybrid LMD GHOST" (HLMD GHOST). It is not pure LMD GHOST, but LMD GHOST modified by the Casper FFG consensus.

    # Get filtered block tree that only includes viable branches
    blocks = get_filtered_block_tree(store)
    # Execute the LMD-GHOST fork choice
    head = store.justified_checkpoint.root

Specifically, rather than starting to walk the tree from the genesis block, we start from the last justified checkpoint, and rather than considering all blocks that the Store knows about, we first filter out "unviable" branches with get_filtered_block_tree().

This is the point at which the Casper FFG fork choice rule, "follow the chain containing the justified checkpoint of the greatest height", meets the LMD GHOST fork choice rule. The former modifies the latter to give us the HLMD GHOST fork choice rule.

Uses get_filtered_block_tree(), get_weight()

update_checkpoints

def update_checkpoints(store: Store, justified_checkpoint: Checkpoint, finalized_checkpoint: Checkpoint) -> None:
    """
    Update checkpoints in store if necessary
    """
    # Update justified checkpoint
    if justified_checkpoint.epoch > store.justified_checkpoint.epoch:
        store.justified_checkpoint = justified_checkpoint

    # Update finalized checkpoint
    if finalized_checkpoint.epoch > store.finalized_checkpoint.epoch:
        store.finalized_checkpoint = finalized_checkpoint

Update the checkpoints in the store if either of the given justified or finalised checkpoints is newer.

Justification and finalisation are supposed to be "global" properties of the chain, not specific to any one branch, so we keep our Store up to date with the highest checkpoints we've seen.

Note that, by construction, the Store's justified and finalised checkpoints can only increase monotonically. The former is important for the formal proof of non-self-slashability.

Used by compute_pulled_up_tip(), on_tick_per_slot(), on_block()

update_unrealized_checkpoints

def update_unrealized_checkpoints(store: Store, unrealized_justified_checkpoint: Checkpoint,
                                  unrealized_finalized_checkpoint: Checkpoint) -> None:
    """
    Update unrealized checkpoints in store if necessary
    """
    # Update unrealized justified checkpoint
    if unrealized_justified_checkpoint.epoch > store.unrealized_justified_checkpoint.epoch:
        store.unrealized_justified_checkpoint = unrealized_justified_checkpoint

    # Update unrealized finalized checkpoint
    if unrealized_finalized_checkpoint.epoch > store.unrealized_finalized_checkpoint.epoch:
        store.unrealized_finalized_checkpoint = unrealized_finalized_checkpoint

The counterpart to update_checkpoints() for unrealised justified and finalised checkpoints.

Used by compute_pulled_up_tip()

Pull-up tip helpers

compute_pulled_up_tip
def compute_pulled_up_tip(store: Store, block_root: Root) -> None:
    state = store.block_states[block_root].copy()
    # Pull up the post-state of the block to the next epoch boundary
    process_justification_and_finalization(state)

    store.unrealized_justifications[block_root] = state.current_justified_checkpoint
    update_unrealized_checkpoints(store, state.current_justified_checkpoint, state.finalized_checkpoint)

    # If the block is from a prior epoch, apply the realized values
    block_epoch = compute_epoch_at_slot(store.blocks[block_root].slot)
    current_epoch = compute_epoch_at_slot(get_current_slot(store))
    if block_epoch < current_epoch:
        update_checkpoints(store, state.current_justified_checkpoint, state.finalized_checkpoint)

compute_pulled_up_tip() is called for every block processed in order to maintain the Store's unrealized_justifications map (and, via update_unrealized_checkpoints(), to keep the Store's unrealised checkpoints current). It was added in the Capella spec update.

The major work in this routine is in the call to process_justification_and_finalization(). In the state transition, this is called once per epoch. Now we are calling it once per block, which would add significant load if implemented naively.

Since the state transition only calls process_justification_and_finalization() at epoch boundaries, the beacon state's justification and finalisation information cannot change mid-epoch. However, Casper FFG votes accumulate throughout the progress of the epoch, and at some point prior to the end of the epoch, enough votes will usually have been included on chain to justify a new checkpoint. When this occurs, we call it "unrealised justification" since it is not yet reflected on chain (in the beacon state). Unrealised justification reflects what the beacon state would be if the end of epoch accounting were to be run immediately on the block - hence the "pulled up tip" naming.

We simulate "pulling up" the block to the next epoch boundary to find out what the justification and finalisation status would be. When the next epoch begins, the unrealised values will become realised.

A diagram showing how unrealised justification becomes realised when a block is "pulled up" to the next epoch.

Each block has a $J$ value, which is the justified checkpoint its post-state knows about. $J$ is updated only at epoch boundaries. We notionally add a $U$ value, which is the justified checkpoint that would be in the block's post-state were it to be "pulled up" to the next epoch boundary, where the beacon state's justification and finalisation calculations are done (shown as $W'$).

As illustrated, the unrealised justification, U, can differ from the realised justification, J, due to unprocessed Casper FFG votes accumulating during an epoch. However, an epoch's own checkpoint cannot gain unrealised justification until at least 2/3 of the way through an epoch (23 slots, since attestations are included one slot after the slot they attest to). That is, the earliest that block W could occur is in slot 22 of epoch 2 (slot counting is zero-based).
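As a back-of-the-envelope check, here is a sketch of that arithmetic, assuming full participation spread evenly over the epoch's 32 slots (the variable names are mine, not the spec's).

from math import ceil

SLOTS_PER_EPOCH = 32

# Justification needs attestations representing at least 2/3 of the total
# active balance. With even participation, that means attestations from at
# least ceil(2/3 * 32) = 22 slots of the epoch (slots 0 to 21).
slots_of_attestations_needed = ceil(2 * SLOTS_PER_EPOCH / 3)
assert slots_of_attestations_needed == 22

# Attestations are included on chain no earlier than the slot after the one
# they attest to, so the earliest block whose pulled-up state can justify the
# epoch's own checkpoint is at slot 22 (zero-based), the epoch's 23rd slot.
earliest_pull_up_block_slot = slots_of_attestations_needed
assert earliest_pull_up_block_slot == 22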

After adding the block and its unrealised justification checkpoint to the store.unrealized_justifications map, the Store's unrealized_justified_checkpoint and unrealized_finalized_checkpoint are updated if the block's values are newer.

If the block is from a previous epoch, then its justification and finalisation are no longer unrealised, since the beacon state has gone through an actual epoch transition since then, so we can update the Store's justified_checkpoint and finalized_checkpoint if the block has newer ones.

Used by on_block()
Uses process_justification_and_finalization(), update_unrealized_checkpoints(), compute_epoch_at_slot(), update_checkpoints()

on_tick helpers

on_tick_per_slot
def on_tick_per_slot(store: Store, time: uint64) -> None:
    previous_slot = get_current_slot(store)

    # Update store time
    store.time = time

    current_slot = get_current_slot(store)

    # If this is a new slot, reset store.proposer_boost_root
    if current_slot > previous_slot:
        store.proposer_boost_root = Root()

    # If a new epoch, pull-up justification and finalization from previous epoch
    if current_slot > previous_slot and compute_slots_since_epoch_start(current_slot) == 0:
        update_checkpoints(store, store.unrealized_justified_checkpoint, store.unrealized_finalized_checkpoint)

The on_tick_per_slot() helper is called at least once every slot. If a tick hasn't been processed for multiple slots, then the on_tick() handler calls it repeatedly to process (synthetic) ticks for those slots. This ensures that update_checkpoints() is called even if no tick was processed during the first slot of an epoch.

The on_tick_per_slot() helper has three duties,

  • updating the time,
  • resetting proposer boost, and
  • updating checkpoints on epoch boundaries.
Updating the time
    # update store time
    store.time = time

The store has a notion of the current time that is used when calculating the current slot and when applying proposer boost. The time parameter does not need to be very granular. If it weren't for proposer boost, it would be fine to measure time in whole slots, at least within the fork choice34.
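For reference, here is a sketch of the time-to-slot arithmetic, assuming the mainnet value SECONDS_PER_SLOT = 12 (the spec's own helpers, get_slots_since_genesis() and get_current_slot(), are defined earlier in the fork choice spec; this standalone version is just for illustration).

SECONDS_PER_SLOT = 12

def current_slot(store_time: int, genesis_time: int) -> int:
    # Whole-slot granularity: every time within the same 12 second window
    # maps to the same slot.
    return (store_time - genesis_time) // SECONDS_PER_SLOT

# With genesis at t = 0, any time from 384 up to (but not including) 396
# maps to slot 32. Sub-slot precision matters only for proposer boost.
assert current_slot(384, 0) == 32
assert current_slot(395, 0) == 32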

Resetting proposer boost
    # Reset store.proposer_boost_root if this is a new slot
    if current_slot > previous_slot:
        store.proposer_boost_root = Root()

Proposer boost is a defence against balancing attacks on LMD GHOST. It rewards timely blocks with extra weight in the fork choice, making it unlikely that an honest proposer's block will become orphaned.

The Store's proposer_boost_root field is set in the on_block() handler when a block is received and processed in a timely manner (within the first four seconds of its slot). For the remainder of the slot this allows extra weight to be added to the block in get_weight().

The logic here resets proposer_boost_root to a default value at the start of the next slot, thereby removing the extra proposer boost weight until the next timely block is processed.

Updating checkpoints
    # If a new epoch, pull-up justification and finalization from previous epoch
    if current_slot > previous_slot and compute_slots_since_epoch_start(current_slot) == 0:
        update_checkpoints(store, store.unrealized_justified_checkpoint, store.unrealized_finalized_checkpoint)

If it's the first slot of an epoch, then we have gone through an epoch boundary since the last tick, and our unrealised justification and finalisation have become realised. They should now be in-sync with the justified and finalised checkpoints in the beacon state.

Used by on_tick()
Uses get_current_slot(), compute_slots_since_epoch_start(), update_checkpoints()

on_attestation helpers

validate_target_epoch_against_current_time
def validate_target_epoch_against_current_time(store: Store, attestation: Attestation) -> None:
    target = attestation.data.target

    # Attestations must be from the current or previous epoch
    current_epoch = compute_epoch_at_slot(get_current_slot(store))
    # Use GENESIS_EPOCH for previous when genesis to avoid underflow
    previous_epoch = current_epoch - 1 if current_epoch > GENESIS_EPOCH else GENESIS_EPOCH
    # If attestation target is from a future epoch, delay consideration until the epoch arrives
    assert target.epoch in [current_epoch, previous_epoch]

This function simply checks that an attestation came from the current or previous epoch, based on its target checkpoint vote. The Store has a notion of the current time, maintained by the on_tick() handler, so it's a straightforward calculation. The timeliness check was introduced to defend against the "decoy flip-flop" attack described below.

Note that there is a small inconsistency here. Attestations may be included in blocks only for 32 slots after the slot in which they were published. However, they are valid for consideration in the fork choice for two epochs, which is up to 64 slots.
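To make the two windows concrete, here is a small illustrative calculation using the mainnet value SLOTS_PER_EPOCH = 32 (the function names are mine, not the spec's).

SLOTS_PER_EPOCH = 32

def last_block_inclusion_slot(attestation_slot: int) -> int:
    # An attestation may be included in blocks at slots
    # attestation_slot + 1 up to attestation_slot + 32 (pre-Deneb rules).
    return attestation_slot + SLOTS_PER_EPOCH

def last_fresh_slot_for_fork_choice(attestation_slot: int) -> int:
    # The freshness check accepts a newly received attestation while the
    # current epoch is its target epoch or the one after, i.e. until the
    # end of the next epoch.
    target_epoch = attestation_slot // SLOTS_PER_EPOCH
    return (target_epoch + 2) * SLOTS_PER_EPOCH - 1

# An attestation from the first slot of epoch 2 (slot 64):
assert last_block_inclusion_slot(64) == 96        # includable for 32 more slots
assert last_fresh_slot_for_fork_choice(64) == 127 # fresh for the rest of epochs 2 and 3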

Used by validate_on_attestation()
Uses get_current_slot(), compute_epoch_at_slot()
validate_on_attestation
def validate_on_attestation(store: Store, attestation: Attestation, is_from_block: bool) -> None:
    target = attestation.data.target

    # If the given attestation is not from a beacon block message, we have to check the target epoch scope.
    if not is_from_block:
        validate_target_epoch_against_current_time(store, attestation)

    # Check that the epoch number and slot number are matching
    assert target.epoch == compute_epoch_at_slot(attestation.data.slot)

    # Attestation target must be for a known block. If target block is unknown, delay consideration until block is found
    assert target.root in store.blocks

    # Attestations must be for a known block. If block is unknown, delay consideration until the block is found
    assert attestation.data.beacon_block_root in store.blocks
    # Attestations must not be for blocks in the future. If not, the attestation should not be considered
    assert store.blocks[attestation.data.beacon_block_root].slot <= attestation.data.slot

    # LMD vote must be consistent with FFG vote target
    target_slot = compute_start_slot_at_epoch(target.epoch)
    assert target.root == get_ancestor(store, attestation.data.beacon_block_root, target_slot)

    # Attestations can only affect the fork choice of subsequent slots.
    # Delay consideration in the fork choice until their slot is in the past.
    assert get_current_slot(store) >= attestation.data.slot + 1

This is a utility function for the on_attestation() handler that collects together the various validity checks we want to perform on an attestation before we make any changes to the Store. Recall that a failed assertion means that the handler will exit and any changes made to the Store must be rolled back.
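The spec doesn't prescribe how clients implement the rollback. The following is a minimal sketch of one (naive) approach, wrapping the handler so that a failed assertion leaves the Store untouched.

from copy import deepcopy

def apply_attestation_atomically(store, attestation, is_from_block=False):
    # Naive approach: work on a copy and commit it only if every assertion
    # passes. Real clients avoid deep-copying the Store and instead validate
    # before mutating, or stage their updates.
    candidate = deepcopy(store)
    try:
        on_attestation(candidate, attestation, is_from_block)
    except AssertionError:
        return store, False   # invalid (for now): Store unchanged
    return candidate, True    # valid: commit the updated Store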

Attestation timeliness
    # If the given attestation is not from a beacon block message, we have to check the target epoch scope.
    if not is_from_block:
        validate_target_epoch_against_current_time(store, attestation)

First, we check the attestation's timeliness. Newly received attestations are considered for insertion into the Store only if they came from the current or previous epoch at the time when we heard about them.

This check was introduced to defend against a "decoy flip-flop attack" on LMD GHOST. The attack depends on two competing branches having emerged due to some network failure. An adversary with some fraction of the stake (but less than 33%) can store up votes from earlier epochs and release them at carefully timed moments to switch the winning branch (according to the LMD GHOST fork choice) so that neither branch can gain the necessary 2/3 weight for finalisation. The attack can continue until the adversary runs out of stored votes.

Allowing only attestations from the current and previous epoch to be valid for updates to the Store seems to be an effective defence as it prevents the attacker from storing up attestations from previous epochs. The PR implementing this describes it as "FMD GHOST" (fresh message driven GHOST). However, the fork choice still relies on the latest message ("LMD") from each validator in the Store, no matter how old it is. We seem to have ended up with a kind of hybrid FMD/LMD GHOST in practice35.

As for the if not is_from_block test, this allows the processing of old attestations by the on_attestation handler if they were received in a block. It seems to have been introduced to help with test generation rather than being anything required in normal operation. Here's a comment from the PR that introduced it.

Also good to move ahead with processing old attestations from blocks for now - that's the only way to make atomic updates to the store work in our current testing setup. If that changes in the future, this logic should go through security analysis (esp. for flip-flop attacks).

Attestations are valid for inclusion in a block only if they are less than 32 slots old. These will be a subset of the "fresh" votes made at the time (the "current plus previous epoch" criterion for freshness could encompass as many as 64 slots).

Matching epoch and slot
    # Check that the epoch number and slot number are matching
    assert target.epoch == compute_epoch_at_slot(attestation.data.slot)

This check addresses an edge case in which validators could fabricate votes for a prior or subsequent epoch. It's probably not a big issue for the fork choice, more for the beacon chain state transition accounting. Nevertheless, the check was implemented in both places.

No attestations for unknown blocks
    # Attestation target must be for a known block. If target block is unknown, delay consideration until block is found
    assert target.root in store.blocks
    # Attestations must be for a known block. If block is unknown, delay consideration until the block is found
    assert attestation.data.beacon_block_root in store.blocks

This seems like a natural check - if we don't know about a block (either a target checkpoint or the head block), there's no point processing any votes for it. These conditions were added to the spec without further rationale. As noted in the comments, such attestations may become valid in future and should be reconsidered then. When they receive attestations for blocks that they don't yet know about, clients will typically ask their peers to send the block to them directly.

No attestations for future blocks
    # Attestations must not be for blocks in the future. If not, the attestation should not be considered
    assert store.blocks[attestation.data.beacon_block_root].slot <= attestation.data.slot

This check was introduced alongside the above checks for unknown blocks. Allowing votes for blocks that were published later than the attestation's assigned slot increases the feasibility of the decoy flip-flop attack by removing the need to have had a period of network asynchrony to set it up.

LMD and FFG vote consistency
    # LMD vote must be consistent with FFG vote target
    target_slot = compute_start_slot_at_epoch(target.epoch)
    assert target.root == get_ancestor(store, attestation.data.beacon_block_root, target_slot)

This check ensures that the block in the attestation's head vote descends from the block in its target vote.

The check was introduced to fix three issues that had come to light.

  1. Inconsistencies between the fork choice's validation of attestations and the state transition's validation of attestations. The issue is that, if some attestations are valid with respect to the fork choice but invalid for inclusion in blocks, it is a potential source of differing network views between validators, and could impede fork choice convergence. Validators receive attestations both via attestation gossip and via blocks. Ideally, each of these channels will contain more or less the same information.36

  2. Attestations from incompatible forks. Since committee shufflings are decided only at the start of the previous epoch, it can lead to implementation challenges when processing attestations where the target block is from a different fork. After a while, forks end up with different shufflings. Clients often cache shufflings and it can be a source of bugs having to handle these edge cases. This check removes any ambiguity over the state to be used when validating attestations. It also prevents validators exploiting the ability to influence their own committee assignments in the event of multiple forks.

  3. Faulty or malicious validators shouldn't be able to influence the fork choice by exploiting this inconsistency. An attestation that fails this test would not have been produced by a correctly operating, honest validator. Therefore it is safest to ignore it.

Only future slots
    # Attestations can only affect the fork choice of subsequent slots.
    # Delay consideration in the fork choice until their slot is in the past.
    assert get_current_slot(store) >= attestation.data.slot + 1

This criterion is discussed in section 8.4 of the Gasper paper: only attestations from slots up to and including slot N-1 may appear in the Store at slot N.

If attestations were included in the Store as soon as being received, an adversary with a number of dishonest validators could use that to probabilistically split the votes of honest validators. The dishonest validators would attest early in the slot, dividing their votes between competing head blocks. Due to network delays, when honest validators run their own fork choices prior to attesting at the proper time, they are likely to see different weights for each of the candidates, based on the subset of dishonest attestations they have received by then. In which case the vote of the honest validators could end up being split. This might keep the chain from converging on a single head block.

Introducing this one slot lag for considering attestations makes it much more likely that honest validators will all vote for the same head block in slot N, as they will have all seen a similar set of attestations up to slot N-1, and cannot be influenced by an adversary's early attestations in the current slot.

Used by on_attestation()
Uses validate_target_epoch_against_current_time(), compute_epoch_at_slot(), compute_start_slot_at_epoch(), get_ancestor(), get_current_slot()
store_target_checkpoint_state
def store_target_checkpoint_state(store: Store, target: Checkpoint) -> None:
    # Store target checkpoint state if not yet seen
    if target not in store.checkpoint_states:
        base_state = copy(store.block_states[target.root])
        if base_state.slot < compute_start_slot_at_epoch(target.epoch):
            process_slots(base_state, compute_start_slot_at_epoch(target.epoch))
        store.checkpoint_states[target] = base_state

We need checkpoint states both to provide validator balances (used for weighting votes in the fork choice) and for the validator shufflings (used when validating attestations).

A Checkpoint is a reference to the first slot of an epoch and is what the Casper FFG votes in attestations point to. When an attestation targets a checkpoint that has empty slots preceding it, the checkpoint's state will not match the state of the block that it points to. Therefore, we take that block's state (base_state) and run the simple process_slots() state transition for empty slots on it to bring the state up to date with the checkpoint.

A diagram showing the state being updated to the checkpoint by process_slots.

Consider a checkpoint that points to [N, B], where N is the checkpoint height (epoch number) and B is the block root of the most recent block. The shapes with dotted outlines indicate skipped slots. The process_slots() function takes the state S associated with the block and updates it to the slot of the checkpoint by playing empty slots onto it, resulting in state S'.
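As a small worked example, using the mainnet value SLOTS_PER_EPOCH = 32, suppose the checkpoint is for epoch 10 but the last three slots of epoch 9 were empty.

SLOTS_PER_EPOCH = 32

def compute_start_slot_at_epoch(epoch: int) -> int:
    # As defined in the beacon chain spec
    return epoch * SLOTS_PER_EPOCH

block_slot = 317                                   # most recent block, B
checkpoint_slot = compute_start_slot_at_epoch(10)  # 320
empty_slots_to_process = checkpoint_slot - block_slot

# store_target_checkpoint_state() plays three empty slots onto B's
# post-state to obtain the checkpoint state S'.
assert empty_slots_to_process == 3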

Used by on_attestation()
Uses compute_start_slot_at_epoch(), process_slots()
update_latest_messages
def update_latest_messages(store: Store, attesting_indices: Sequence[ValidatorIndex], attestation: Attestation) -> None:
    target = attestation.data.target
    beacon_block_root = attestation.data.beacon_block_root
    non_equivocating_attesting_indices = [i for i in attesting_indices if i not in store.equivocating_indices]
    for i in non_equivocating_attesting_indices:
        if i not in store.latest_messages or target.epoch > store.latest_messages[i].epoch:
            store.latest_messages[i] = LatestMessage(epoch=target.epoch, root=beacon_block_root)

A message comprises a timestamp and a block root (head) vote. These are extracted from the containing attestation in the form of the epoch number of the target checkpoint of the attestation, and the LMD GHOST head block vote respectively. By the time we get here, validate_on_attestation() has already checked that the slot for which the head vote was made belongs to the epoch corresponding to the target vote. Validators vote exactly once per epoch, so the epoch number is granular enough for tracking their latest votes.

All the validators in attesting_indices made this same attestation. The attestation will have travelled the world as a single aggregate attestation, but it has been unpacked in on_attestation() before being passed to this function. Validators on our naughty list of equivocators are filtered out, and any that remain are considered for updates.

If the validator index is not yet in the store.latest_messages map then its vote is inserted; if the vote that we have is newer than the vote already stored then it is updated. Each validator has at most one entry in the latest_messages map.

Used by on_attestation()
See also Attestation, LatestMessage

Handlers

The four handlers below – on_tick(), on_block(), on_attestation(), and on_attester_slashing() – are the fork choice rule's four senses. These are the means by which the fork choice gains its knowledge of the outside world, and the only means by which the Store gets updated.

None of the handlers is explicitly called by any code that appears anywhere in the spec. It is expected that client implementations will call each handler as and when required. As per the introductory material at the top of the fork choice spec, they should be called as follows; a minimal driver loop is sketched after the list.

  • on_tick(store, time) whenever time > store.time where time is the current Unix time.
  • on_block(store, block) whenever a block block is received.
  • on_attestation(store, attestation) whenever an attestation attestation is received.
  • on_attester_slashing(store, attester_slashing) whenever an attester slashing attester_slashing is received.
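Here is that driver loop, purely for illustration. It is not part of the spec: queue management, error handling, and the rollback of the Store on failed assertions are all omitted, and the function and parameter names are mine.

import time as wall_clock

def run_fork_choice(store, block_queue, attestation_queue, slashing_queue):
    while True:
        now = int(wall_clock.time())
        if now > store.time:
            on_tick(store, now)
        while block_queue:
            on_block(store, block_queue.pop(0))
        while attestation_queue:
            on_attestation(store, attestation_queue.pop(0))
        while slashing_queue:
            on_attester_slashing(store, slashing_queue.pop(0))
        wall_clock.sleep(0.5)   # e.g. tick twice per second, as Teku does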

on_tick

def on_tick(store: Store, time: uint64) -> None:
    # If the ``store.time`` falls behind, while loop catches up slot by slot
    # to ensure that every previous slot is processed with ``on_tick_per_slot``
    tick_slot = (time - store.genesis_time) // SECONDS_PER_SLOT
    while get_current_slot(store) < tick_slot:
        previous_time = store.genesis_time + (get_current_slot(store) + 1) * SECONDS_PER_SLOT
        on_tick_per_slot(store, previous_time)
    on_tick_per_slot(store, time)

A "tick" is not defined in the specification. Notionally, ticks are used to continually keep the fork choice's internal clock (store.time) updated. In practice, calling on_tick() is only really required at the start of a slot, at SECONDS_PER_SLOT / INTERVALS_PER_SLOT into a slot, and before proposing a block. However, on_tick() processing is light and it can be convenient to call it more often.

The Teku client calls on_tick() regularly, twice per second, since it is used internally to drive other things besides the fork choice. In addition, Teku uses units of milliseconds rather than seconds for its tick interval, which is strictly speaking off-spec, but is necessary for supporting other chains such as the Gnosis Beacon Chain, for which SECONDS_PER_SLOT is not a multiple of INTERVALS_PER_SLOT.

The while loop was introduced in the Capella specification. It ensures that the processing in on_tick_per_slot() is done every slot. When multiple slots have passed since the last tick was processed, the loop calls on_tick_per_slot() for each of them so as to catch up. The only thing this makes a difference to is updating the checkpoints at epoch boundaries. Previously, if a tick was not processed during the first slot of an epoch, then the checkpoint update could be incorrectly skipped. Note that on_tick_per_slot() updates store.time, which in turn updates the output of get_current_slot(), so the loop will terminate.
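To see the catch-up behaviour concretely, here is an illustrative simulation of the loop, assuming genesis_time = 0 and SECONDS_PER_SLOT = 12 (the function is mine, not the spec's).

SECONDS_PER_SLOT = 12

def simulated_tick_times(last_tick_time: int, new_tick_time: int, genesis_time: int = 0):
    # Return the sequence of times passed to on_tick_per_slot() by on_tick()
    calls = []
    store_time = last_tick_time
    tick_slot = (new_tick_time - genesis_time) // SECONDS_PER_SLOT
    while (store_time - genesis_time) // SECONDS_PER_SLOT < tick_slot:
        current_slot = (store_time - genesis_time) // SECONDS_PER_SLOT
        store_time = genesis_time + (current_slot + 1) * SECONDS_PER_SLOT
        calls.append(store_time)
    calls.append(new_tick_time)
    return calls

# Last tick processed at t=100 (slot 8), next real tick at t=150 (slot 12):
# one synthetic tick at each intervening slot boundary, then the real one.
assert simulated_tick_times(100, 150) == [108, 120, 132, 144, 150]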

I expect that the reason time is provided as a parameter rather than looked up via the machine's clock is that it simplifies testing.

Uses get_current_slot(), on_tick_per_slot()
See also SECONDS_PER_SLOT

on_block

def on_block(store: Store, signed_block: SignedBeaconBlock) -> None:
    block = signed_block.message
    # Parent block must be known
    assert block.parent_root in store.block_states
    # Make a copy of the state to avoid mutability issues
    pre_state = copy(store.block_states[block.parent_root])
    # Blocks cannot be in the future. If they are, their consideration must be delayed until they are in the past.
    assert get_current_slot(store) >= block.slot

    # Check that block is later than the finalized epoch slot (optimization to reduce calls to get_ancestor)
    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    assert block.slot > finalized_slot
    # Check block is a descendant of the finalized block at the checkpoint finalized slot
    assert get_ancestor(store, block.parent_root, finalized_slot) == store.finalized_checkpoint.root

    # Check the block is valid and compute the post-state
    state = pre_state.copy()
    block_root = hash_tree_root(block)
    state_transition(state, signed_block, True)
    # Add new block to the store
    store.blocks[block_root] = block
    # Add new state for this block to the store
    store.block_states[block_root] = state

    # Add proposer score boost if the block is timely
    time_into_slot = (store.time - store.genesis_time) % SECONDS_PER_SLOT
    is_before_attesting_interval = time_into_slot < SECONDS_PER_SLOT // INTERVALS_PER_SLOT
    if get_current_slot(store) == block.slot and is_before_attesting_interval:
        store.proposer_boost_root = hash_tree_root(block)

    # Update checkpoints in store if necessary
    update_checkpoints(store, state.current_justified_checkpoint, state.finalized_checkpoint)

    # Eagerly compute unrealized justification and finality
    compute_pulled_up_tip(store, block_root)

The on_block() handler should be called whenever a new signed beacon block is received. It does the following.

  • Perform some validity checks.
  • Update the store with the block and its associated beacon state.
  • Handle proposer boost (block timeliness).
  • Update the Store's justified and finalised checkpoints if permitted and required.

The on_block() handler does not call the on_attestation() handler for the attestations it contains, so clients need to do that separately for each attestation.

Validity checks
    # Parent block must be known
    assert block.parent_root in store.block_states
    # Make a copy of the state to avoid mutability issues
    pre_state = copy(store.block_states[block.parent_root])
    # Blocks cannot be in the future. If they are, their consideration must be delayed until they are in the past.
    assert get_current_slot(store) >= block.slot

    # Check that block is later than the finalized epoch slot (optimization to reduce calls to get_ancestor)
    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    assert block.slot > finalized_slot
    # Check block is a descendant of the finalized block at the checkpoint finalized slot
    assert get_ancestor(store, block.parent_root, finalized_slot) == store.finalized_checkpoint.root

    # Check the block is valid and compute the post-state
    state = pre_state.copy()
    block_root = hash_tree_root(block)
    state_transition(state, signed_block, True)

First we do some fairly self-explanatory checks. In order to be considered in the fork choice, the block must be joined to the block tree that we already have (that is, its parent must be in the Store), it must not be from a future slot according to our Store's clock, and it must be from a branch that descends from our finalised checkpoint. By the definition of finalised, all prior branches from the canonical chain are pruned away.

The final check is to run a full state transition on the block. This has two purposes, (1) it checks that the block is valid with respect to the consensus rules, and (2) it gives us the block's post-state which we need to add to the Store. We got the block's pre-state from its parent, which we know is already in the store. The True parameter to state_transition() ensures that the block's signature is checked, and that the result of applying the block to the state results in the same state root that the block claims it does (the "post-states" must match). Clients will be running this operation elsewhere when performing the state transition, so it is likely that the result of the state_transition() call will be cached somewhere in an optimal implementation.
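As a sketch of that kind of optimisation (illustrative only; the cache and helper names are not from the spec):

from typing import Dict

post_state_cache: Dict[Root, BeaconState] = {}

def get_post_state(pre_state: BeaconState, signed_block: SignedBeaconBlock) -> BeaconState:
    # Compute (and validate) the block's post-state at most once per block root
    block_root = hash_tree_root(signed_block.message)
    if block_root not in post_state_cache:
        state = pre_state.copy()
        state_transition(state, signed_block, True)   # raises if the block is invalid
        post_state_cache[block_root] = state
    return post_state_cache[block_root]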

Update the Store
    # Add new block to the store
    store.blocks[block_root] = block
    # Add new state for this block to the store
    store.block_states[block_root] = state

Once the block has passed the validity checks, it and its post-state can be added to the Store.

Handle proposer boost
    # Add proposer score boost if the block is timely
    time_into_slot = (store.time - store.genesis_time) % SECONDS_PER_SLOT
    is_before_attesting_interval = time_into_slot < SECONDS_PER_SLOT // INTERVALS_PER_SLOT
    if get_current_slot(store) == block.slot and is_before_attesting_interval:
        store.proposer_boost_root = hash_tree_root(block)

Proposer boost is a defence against balancing attacks on LMD GHOST. It rewards timely blocks with extra weight in the fork choice, making it unlikely that an honest proposer's block will become orphaned.

Here, in the on_block() handler, is where the block's timeliness is assessed and recorded. If the Store's time (as set by the on_tick() handler) is within the first third of the slot (1 / INTERVALS_PER_SLOT, that is, 4 seconds) when the block is processed, then we set store.proposer_boost_root to the block's root.

The store.proposer_boost_root field can only be set during the first four seconds of a slot, and it is cleared at the start of the next slot by the on_tick() handler. It is used in the get_weight() function to determine whether to add the extra proposer boost weight or not.
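With the mainnet preset values, the cut-off works out as follows.

SECONDS_PER_SLOT = 12
INTERVALS_PER_SLOT = 3

attesting_interval = SECONDS_PER_SLOT // INTERVALS_PER_SLOT   # 4 seconds

# A block processed 3 seconds into its slot earns proposer boost...
assert 3 < attesting_interval
# ...but one processed at 4 seconds (or later) does not.
assert not (4 < attesting_interval)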

Note that, if there is a proposer equivocation in the slot, this code will apply proposer boost to the second block received rather than to the first block received. This becomes important for the security of third-party block production with MEV-Boost - it can allow a proposer to "steal" the transactions in a block builder's block (at the cost of getting slashed), which is deemed to be a Bad Thing. It would be better to apply the proposer boost only to the first block received, and a small patch to on_block() has been proposed to implement this.

Update justified and finalised
    # Update checkpoints in store if necessary
    update_checkpoints(store, state.current_justified_checkpoint, state.finalized_checkpoint)

    # Eagerly compute unrealized justification and finality
    compute_pulled_up_tip(store, block_root)

update_checkpoints() simply updates the Store's justified and finalised checkpoints if those in the block's post-state are better (that is, higher, more recent). The Store always tracks the best known justified and finalised checkpoints that it is able to validate.

compute_pulled_up_tip() runs the epoch transition Casper FFG accounting on the block – notionally "pulling it up" from its current slot to the first slot of the next epoch – to see if it has achieved unrealised justification. The block's unrealised justification will be stored for later use by filter_block_tree(), and the Store's unrealised justification and unrealised finalisation trackers may be updated. If the block is from a previous epoch, then the unrealised checkpoints become realised, and update_checkpoints() will be called again, potentially over-writing the update we just made in the line above.

Uses get_current_slot(), compute_start_slot_at_epoch(), get_ancestor(), hash_tree_root(), state_transition(), update_checkpoints(), compute_pulled_up_tip()
See also INTERVALS_PER_SLOT

on_attestation

def on_attestation(store: Store, attestation: Attestation, is_from_block: bool=False) -> None:
    """
    Run ``on_attestation`` upon receiving a new ``attestation`` from either within a block or directly on the wire.

    An ``attestation`` that is asserted as invalid may be valid at a later time,
    consider scheduling it for later processing in such case.
    """
    validate_on_attestation(store, attestation, is_from_block)

    store_target_checkpoint_state(store, attestation.data.target)

    # Get state at the `target` to fully validate attestation
    target_state = store.checkpoint_states[attestation.data.target]
    indexed_attestation = get_indexed_attestation(target_state, attestation)
    assert is_valid_indexed_attestation(target_state, indexed_attestation)

    # Update latest messages for attesting indices
    update_latest_messages(store, indexed_attestation.attesting_indices, attestation)

Attestations may be useful no matter how we heard about them: they might have been contained in a block, or been received individually via gossip, or via a carrier pigeon37.

If the attestation was unpacked from a block then the flag is_from_block should be set to True. This causes the timeliness check in validate_on_attestation() to be skipped: attestations not from blocks must be received in the epoch they were produced in, or the next epoch, in order to influence the fork choice. (So, a carrier pigeon would need to be fairly swift.)

The validate_on_attestation() function performs a comprehensive set of validity checks on the attestation to defend against various attacks.

Assuming that the attestation passes the checks, we add its target checkpoint state to the Store for later use, as well as using it immediately. The store_target_checkpoint_state() function is idempotent, so nothing happens if the state is already present.

Having the target checkpoint state, we can use it to look up the correct shuffling for the validators. With the shuffling in hand, calling get_indexed_attestation() turns the Attestation object (containing a bitlist) into an IndexedAttestation object (containing a list of validator indices).

Finally, we can validate the indexed attestation with is_valid_indexed_attestation(), which amounts to checking its aggregate BLS signature against the public keys of the indexed validators. Checking the signature is relatively expensive compared with the other checks, which is one reason for deferring it until last (we also don't want to be checking it against an inconsistent target).

If, and only if, everything has succeeded, we call update_latest_messages() to refresh the Store's list of latest messages for the validators that participated in this vote.

Uses validate_on_attestation(), store_target_checkpoint_state(), get_indexed_attestation(), is_valid_indexed_attestation(), update_latest_messages()

on_attester_slashing

Note: on_attester_slashing should be called while syncing and a client MUST maintain the equivocation set of AttesterSlashings from at least the latest finalized checkpoint.

def on_attester_slashing(store: Store, attester_slashing: AttesterSlashing) -> None:
    """
    Run ``on_attester_slashing`` immediately upon receiving a new ``AttesterSlashing``
    from either within a block or directly on the wire.
    """
    attestation_1 = attester_slashing.attestation_1
    attestation_2 = attester_slashing.attestation_2
    assert is_slashable_attestation_data(attestation_1.data, attestation_2.data)
    state = store.block_states[store.justified_checkpoint.root]
    assert is_valid_indexed_attestation(state, attestation_1)
    assert is_valid_indexed_attestation(state, attestation_2)

    indices = set(attestation_1.attesting_indices).intersection(attestation_2.attesting_indices)
    for index in indices:
        store.equivocating_indices.add(index)

The on_attester_slashing() handler was added to defend against the equivocation balancing attack (described more formally in Two Attacks On Proof-of-Stake GHOST/Ethereum). The attack relies on the adversary's validators equivocating about their attestations – that is, publishing multiple different attestations per epoch – and is not solved by proposer score boosting.

Of course, the equivocating attestations are slashable under the Casper FFG commandments. When the attack finally ends, those validators will be punished and ejected from the validator set. Meanwhile, however, since the fork choice calculations are based on the validator set at the last justified epoch, the adversary's validators could keep the attack going indefinitely.

Rather than add a lot of apparatus within the fork choice to track and detect conflicting attestations, the mechanism relies on third-party slashing claims received via blocks or directly from peers as attester slashing messages. The validity checks are identical to those in the state transition's process_attester_slashing() method, including the use of is_slashable_attestation_data(). This is broader than we need for our purposes here, as it will exclude validators that make surround votes as well as validators that equivocate. But excluding all misbehaving validators is probably a good idea.

Any validators proven to have made conflicting attestations are added to the store.equivocating_indices set38. They will no longer be involved in calculating the weight of branches, and their future attestations will be ignored in the fork choice. We are permitted to clear any equivocating attestation information from before the last finalised checkpoint, but those validators would have been slashed by the state transition by then, so this ban is permanent.

Uses is_slashable_attestation_data(), is_valid_indexed_attestation()
See also AttesterSlashing, process_attester_slashing()

Bellatrix Fork Choice

Introduction

This section covers the additional Bellatrix fork choice document, v1.3.0. For a complementary take, see Vitalik's annotated Bellatrix fork choice (based on a slightly older version).

As usual, text with a side bar is quoted directly from the specification.

This is the modification of the fork choice according to the executable beacon chain proposal.

Note: It introduces the process of transition from the last PoW block to the first PoS block.

The "executable beacon chain proposal"39 is what became known as The Merge, and is specified by EIP-3675 together with the Bellatrix upgrade on the beacon chain.

Upgrades to Ethereum's protocol are normally planned to take place at pre-determined block heights. For security reasons, the Merge upgrade used a different trigger, specifically a terminal total difficulty of proof of work mining. The first proof of work block to reach that amount of accumulated difficulty became the last proof of work block: all subsequent execution blocks are now merged into the proof of stake beacon chain as execution payloads.

The only functional change to the fork choice that the Bellatrix upgrade introduced was about ensuring that a valid terminal proof of work block was picked up by the beacon chain at the point of the Merge. As such, this section is largely of only historical interest now.

The remainder of the material in this section (mostly Engine API related) isn't really relevant to the fork choice rule at all. It mainly describes one-way communication of fork choice decisions to the execution layer. Altogether, it's a bit of a weird collection of stuff, for want of a better place to put it I suppose.

Custom types

Name        SSZ equivalent   Description
PayloadId   Bytes8           Identifier of a payload building process

PayloadId is used to keep track of stateful requests from the consensus client to the execution client. Specifically, the consensus client can ask the execution client to start creating a new execution payload via the notify_forkchoice_updated() command (which maps to the engine_forkchoiceUpdatedV1 RPC method in the Engine API docs). The execution client will return a PayloadId reference and continue to build the payload asynchronously. Later, the consensus client can obtain the payload with a call to the engine API's engine_getPayloadV1 method by passing it the same PayloadId.
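A sketch of the resulting two-step flow, for illustration only: notify_forkchoice_updated() is defined below, while get_payload() is the corresponding retrieval method from the honest validator guide, and its exact shape here is an assumption.

def start_payload_build(engine, head_hash, safe_hash, finalized_hash, attrs):
    # Maps to engine_forkchoiceUpdatedV1 with payload attributes supplied;
    # returns a PayloadId while the payload is built asynchronously.
    return engine.notify_forkchoice_updated(head_hash, safe_hash, finalized_hash, attrs)

def fetch_built_payload(engine, payload_id):
    # Maps to engine_getPayloadV1; returns the execution payload built above.
    return engine.get_payload(payload_id)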

Protocols

ExecutionEngine

Note: The notify_forkchoice_updated function is added to the ExecutionEngine protocol to signal the fork choice updates.

The body of this function is implementation dependent. The Engine API may be used to implement it with an external execution engine.

Post-Merge, every consensus client (beacon chain client) must be paired up with an execution client (ExecutionEngine; formerly, Eth1 client). The execution client has several roles.

  1. It validates execution payloads.
  2. It executes execution payloads in order to maintain Ethereum's state (accounts, contracts, balances, receipts, etc.).
  3. It provides data to applications via its RPC API.
  4. It maintains a mempool of transactions from which it builds execution payloads and provides them to the consensus layer for distribution.

The first and the last of these are the ones that interest us on the consensus side. The first role is important because beacon blocks are valid only if they contain valid execution payloads. The last is important because the consensus side does not directly handle ordinary Ethereum transactions and cannot build its own execution payloads.

The interface between the two sides is called the Engine API. The Engine API is the RPC (remote procedure call) interface that the execution client provides to its companion consensus client. It is one-way in the sense that the consensus client can call methods on the Engine API, but the execution client does not call any methods on the consensus client.

The most interesting methods that the Engine API provides are these three.

  • engine_newPayloadV1
    • When the consensus client receives a new beacon block, it extracts the block's execution payload and uses this method to send it to the execution client. The execution client will validate the payload and execute the transactions it contains. The method's return value indicates whether the payload was valid or not.
  • engine_forkchoiceUpdatedV1
    • The function below, notify_forkchoice_updated(), uses this method for two purposes. First, it is used routinely to update the execution client with the latest consensus information: head block, safe head block, and finalised block. Second, it can be used to prompt the execution client to begin building an execution payload from its mempool. The consensus client will do this when it is about to propose a beacon block.
  • engine_getPayloadV1
    • This is used to retrieve an execution payload previously requested via engine_forkchoiceUpdatedV1, using a PayloadId as a reference.
notify_forkchoice_updated

This function performs three actions atomically:

  • Re-organizes the execution payload chain and corresponding state to make head_block_hash the head.
  • Updates safe block hash with the value provided by safe_block_hash parameter.
  • Applies finality to the execution state: it irreversibly persists the chain of all execution payloads and corresponding state, up to and including finalized_block_hash.

Additionally, if payload_attributes is provided, this function sets in motion a payload build process on top of head_block_hash and returns an identifier of initiated process.

def notify_forkchoice_updated(self: ExecutionEngine,
                              head_block_hash: Hash32,
                              safe_block_hash: Hash32,
                              finalized_block_hash: Hash32,
                              payload_attributes: Optional[PayloadAttributes]) -> Optional[PayloadId]:
    ...

This is a wrapper around the Engine API's engine_forkchoiceUpdatedV1 RPC method as described above. We use it to keep the execution client up to date with the latest fork choice information, and (optionally) from time to time to request it to build a new execution payload for us.

Note: The (head_block_hash, finalized_block_hash) values of the notify_forkchoice_updated function call maps on the POS_FORKCHOICE_UPDATED event defined in the EIP-3675. As per EIP-3675, before a post-transition block is finalized, notify_forkchoice_updated MUST be called with finalized_block_hash = Hash32().

EIP-3675 is the specification of the Merge on the execution layer side (Eth1 side) of things. The POS_FORKCHOICE_UPDATED event described there is triggered by the consensus layer calling the Engine API's engine_forkchoiceUpdatedV1 method, which is in turn triggered by the consensus client calling notify_forkchoice_updated(). The consensus client will do this periodically, in particular whenever a reorg occurs on the beacon chain so that applications built on the execution layer can know which state is current.

Between the Merge and the first finalised epoch after the Merge there was no guarantee of finality on the execution chain, therefore we could not send it a finalised block hash and had to use the placeholder default value instead.

Note: Client software MUST NOT call this function until the transition conditions are met on the PoW network, i.e. there exists a block for which is_valid_terminal_pow_block function returns True.

The proof of work chain was not interested in the proof of stake chain's view of the world until after the Merge.

Note: Client software MUST call this function to initiate the payload build process to produce the merge transition block; the head_block_hash parameter MUST be set to the hash of a terminal PoW block in this case.

The first beacon chain proposer after the terminal proof of work block had been detected would call notify_forkchoice_updated() with the payload_attributes parameter in order to request an execution payload to be built for the first merged block.

If there had been multiple candidate terminal PoW blocks (as there were for the Goerli testnet Merge), the beacon block proposer would have been free to choose which of them to ask its execution client to build on.

safe_block_hash

The safe_block_hash parameter MUST be set to return value of get_safe_execution_payload_hash(store: Store) function.

The "safe block" feature is a way for the consensus protocol to signal to the execution layer that a block is very unlikely ever to be reverted. Application developers could use the safe block information to provide better user experience to their users in the form of a pseudo fast-finality. See the later Safe Block section for more on this.

Helpers

PayloadAttributes

Used to signal to initiate the payload build process via notify_forkchoice_updated.

@dataclass
class PayloadAttributes(object):
    timestamp: uint64
    prev_randao: Bytes32
    suggested_fee_recipient: ExecutionAddress
    withdrawals: Sequence[Withdrawal]  # [New in Capella]

This class maps onto the Engine API's PayloadAttributesV2 class and is used when asking the execution client to start building an execution payload.

The prev_randao field is the beacon state's current RANDAO value, having been updated by the RANDAO reveal in the previous beacon block. It is made available to execution layer applications via the EVM's new PREVRANDAO opcode.

suggested_fee_recipient is the Ethereum account that any fee income from transaction tips should be sent to when the payload is executed (formerly known as the COINBASE). The execution client may override this if it has its own setting for fee recipient, hence "suggested". But allowing it to be set via the Engine API makes it possible for a beacon node hosting multiple validators to use a different fee recipient address for each validator, whereas setting it on the execution side would force them all to use the same fee recipient address.

The withdrawals field was added in the Capella upgrade. It allows the consensus layer to pass a list of withdrawals to the execution layer to include in an execution payload. There will be at most MAX_WITHDRAWALS_PER_PAYLOAD of them. When the block containing the payload is processed, for each withdrawal, the amount will be deducted from the validator's balance on the beacon chain, and will be added to the balance of the Ethereum account in the Withdrawal object's ExecutionAddress field. The ExecutionAddress is derived from the validator's withdrawal credentials.

PowBlock

class PowBlock(Container):
    block_hash: Hash32
    parent_hash: Hash32
    total_difficulty: uint256

This class is just a succinct way to wrap up the information we need for checking proof of work blocks around the Merge. It is returned by get_pow_block() and consumed by is_valid_terminal_pow_block().

get_pow_block

Let get_pow_block(block_hash: Hash32) -> Optional[PowBlock] be the function that given the hash of the PoW block returns its data. It may result in None if the requested block is not yet available.

Note: The eth_getBlockByHash JSON-RPC method may be used to pull this information from an execution client.

As noted, get_pow_block() is a wrapper around Ethereum's eth_getBlockByHash JSON-RPC method. Given a block hash (not its hash tree root! - Eth1 blocks are encoded with RLP rather than SSZ), it returns the information in the PowBlock structure.

eth_getBlockByHash is a standard Eth1 client RPC method rather than a specific Engine API method. For convenience, execution clients often provide access to this method via the Engine API port in addition to the standard RPC API port so that consensus clients can be configured to connect to only one port on the execution client.

is_valid_terminal_pow_block

Used by fork-choice handler, on_block.

def is_valid_terminal_pow_block(block: PowBlock, parent: PowBlock) -> bool:
    is_total_difficulty_reached = block.total_difficulty >= TERMINAL_TOTAL_DIFFICULTY
    is_parent_total_difficulty_valid = parent.total_difficulty < TERMINAL_TOTAL_DIFFICULTY
    return is_total_difficulty_reached and is_parent_total_difficulty_valid

Given two PowBlock objects (corresponding to a proof of work block and its parent proof of work block respectively), this function checks whether the block meets the criteria for being the terminal proof of work block. That is, that its total difficulty reaches (or exceeds) the terminal total difficulty while its parent's total difficulty does not.
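As an illustration, mainnet's TERMINAL_TOTAL_DIFFICULTY was 58750000000000000000000; the block hashes and the exact difficulty values below are made up, and only the total difficulties matter to the check.

TERMINAL_TOTAL_DIFFICULTY = 58750000000000000000000

parent = PowBlock(block_hash=Hash32(), parent_hash=Hash32(),
                  total_difficulty=TERMINAL_TOTAL_DIFFICULTY - 1)
block = PowBlock(block_hash=Hash32(), parent_hash=Hash32(),
                 total_difficulty=TERMINAL_TOTAL_DIFFICULTY + 1000)

assert is_valid_terminal_pow_block(block, parent)       # crosses the threshold
assert not is_valid_terminal_pow_block(parent, parent)  # the parent itself does not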

validate_merge_block

def validate_merge_block(block: BeaconBlock) -> None:
    """
    Check the parent PoW block of execution payload is a valid terminal PoW block.

    Note: Unavailable PoW block(s) may later become available,
    and a client software MAY delay a call to ``validate_merge_block``
    until the PoW block(s) become available.
    """
    if TERMINAL_BLOCK_HASH != Hash32():
        # If `TERMINAL_BLOCK_HASH` is used as an override, the activation epoch must be reached.
        assert compute_epoch_at_slot(block.slot) >= TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH
        assert block.body.execution_payload.parent_hash == TERMINAL_BLOCK_HASH
        return

    pow_block = get_pow_block(block.body.execution_payload.parent_hash)
    # Check if `pow_block` is available
    assert pow_block is not None
    pow_parent = get_pow_block(pow_block.parent_hash)
    # Check if `pow_parent` is available
    assert pow_parent is not None
    # Check if `pow_block` is a valid terminal PoW block
    assert is_valid_terminal_pow_block(pow_block, pow_parent)

This is used by the Bellatrix on_block() handler. The block parameter is a beacon block that claims to be the first merged block. That is, it is the first beacon block (on the current branch) to contain a non-default ExecutionPayload.

The TERMINAL_BLOCK_HASH is a parameter that client operators could have agreed to use to override the terminal total difficulty mechanism if necessary. For example, if the Merge had resulted in beacon chain forks they could have been resolved by manually agreeing an Eth1 Merge block and setting TERMINAL_BLOCK_HASH to its value via client command line parameters. In the event, this was not needed and TERMINAL_BLOCK_HASH remains at its default value of Hash32().

The remainder of the function checks, (a) that the PoW block that's the parent of the execution payload exists, and has total difficulty greater than or equal to the TERMINAL_TOTAL_DIFFICULTY, and (b) that the parent of that block exists and has a total difficulty less than the TERMINAL_TOTAL_DIFFICULTY. (The difficulty checks are performed in is_valid_terminal_pow_block().)

A diagram showing the relationship between the merge block and the terminal proof of work block.

The first beacon chain merged block contains the execution payload whose parent PoW block was the terminal PoW block. The terminal PoW block is the first PoW block whose total difficulty reaches the TERMINAL_TOTAL_DIFFICULTY.

The parent and grandparent PoW blocks are retrieved via the get_pow_block() function, which in practice involves making RPC calls to the attached Eth1/execution client. If either of these calls fails, an assert will be triggered, and the on_block() handler will bail out without making any changes.

Updated fork-choice handlers

on_block

Note: The only modification is the addition of the verification of transition block conditions.

def on_block(store: Store, signed_block: SignedBeaconBlock) -> None:
    """
    Run ``on_block`` upon receiving a new block.

    A block that is asserted as invalid due to unavailable PoW block may be valid at a later time,
    consider scheduling it for later processing in such case.
    """
    block = signed_block.message
    # Parent block must be known
    assert block.parent_root in store.block_states
    # Make a copy of the state to avoid mutability issues
    pre_state = copy(store.block_states[block.parent_root])
    # Blocks cannot be in the future. If they are, their consideration must be delayed until they are in the past.
    assert get_current_slot(store) >= block.slot

    # Check that block is later than the finalized epoch slot (optimization to reduce calls to get_ancestor)
    finalized_slot = compute_start_slot_at_epoch(store.finalized_checkpoint.epoch)
    assert block.slot > finalized_slot
    # Check block is a descendant of the finalized block at the checkpoint finalized slot
    assert get_ancestor(store, block.parent_root, finalized_slot) == store.finalized_checkpoint.root

    # Check the block is valid and compute the post-state
    state = pre_state.copy()
    block_root = hash_tree_root(block)
    state_transition(state, signed_block, True)

    # [New in Bellatrix]
    if is_merge_transition_block(pre_state, block.body):
        validate_merge_block(block)

    # Add new block to the store
    store.blocks[block_root] = block
    # Add new state for this block to the store
    store.block_states[block_root] = state

    # Add proposer score boost if the block is timely
    time_into_slot = (store.time - store.genesis_time) % SECONDS_PER_SLOT
    is_before_attesting_interval = time_into_slot < SECONDS_PER_SLOT // INTERVALS_PER_SLOT
    if get_current_slot(store) == block.slot and is_before_attesting_interval:
        store.proposer_boost_root = hash_tree_root(block)

    # Update checkpoints in store if necessary
    update_checkpoints(store, state.current_justified_checkpoint, state.finalized_checkpoint)

    # Eagerly compute unrealized justification and finality.
    compute_pulled_up_tip(store, block_root)

As noted, the only addition here to the normal on_block() handler is the lines,

    # [New in Bellatrix]
    if is_merge_transition_block(pre_state, block.body):
        validate_merge_block(block)

The is_merge_transition_block() function will return True when the given block is the first beacon block that contains an execution payload, and False otherwise.

To ensure consistency between the execution chain and the beacon chain at the Merge, this first merged beacon block requires some extra processing. We must check that the PoW block its execution payload is derived from has indeed met the criteria for the Merge. Essentially, its total difficulty must meet or exceed the terminal total difficulty and its parent's total difficulty must not. If this test fails then something has gone wrong and the beacon block must be excluded from the fork choice.

There might be several candidate execution blocks that meet this criterion in the event of PoW forks at the point of the Merge – this occurred when merging one of the testnets40 – but that's fine. The proposer of the first merged beacon block41 that becomes canonical gets to decide which terminal execution block wins.

Safe Block

Introduction

The Fork Choice Safe Block spec is not really part of the beacon chain's fork choice and is located in a different document in the consensus repo. It is an heuristic for using the fork choice's Store data to identify a block that will not be reverted, under some reasonable assumptions. It could be used, for example, by applications to implement a settlement period for transactions. There is an analogy with the assumption that, under proof of work, in the absence of a 51% attack, a block becomes safe from reorgs after a certain number of blocks (say, fifteen) have been built on top of it.

Under honest majority and certain network synchronicity assumptions there exist a block that is safe from re-orgs. Normally this block is pretty close to the head of canonical chain which makes it valuable to expose a safe block to users.

This section describes an algorithm to find a safe block.

Of course, the ultimate safe block is the last finalised checkpoint. But that could be several minutes in the past, even under ideal network conditions. If we assume (a) that there is an honest majority of validators, and (b) that their messages are received in a timely fashion, then we can in principle identify a more recent block that will not be at risk of reversion.

get_safe_beacon_block_root

def get_safe_beacon_block_root(store: Store) -> Root:
    # Use most recent justified block as a stopgap
    return store.justified_checkpoint.root

Note: Currently safe block algorithm simply returns store.justified_checkpoint.root and is meant to be improved in the future.

Returning the justified checkpoint is certainly safe under the assumptions above, but we can almost certainly do better. Substantial progress has been made recently towards providing a more useful safe block. There's more on this in the Confirmation rule section of the Consensus chapter.

get_safe_execution_payload_hash

def get_safe_execution_payload_hash(store: Store) -> Hash32:
    safe_block_root = get_safe_beacon_block_root(store)
    safe_block = store.blocks[safe_block_root]

    # Return Hash32() if no payload is yet justified
    if compute_epoch_at_slot(safe_block.slot) >= BELLATRIX_FORK_EPOCH:
        return safe_block.body.execution_payload.block_hash
    else:
        return Hash32()

Note: This helper uses beacon block container extended in Bellatrix.

Bellatrix was the pre-Merge upgrade that added the execution payload hash to beacon blocks in readiness for the Merge itself. Applications on Ethereum are largely unaware of the beacon chain and will use the execution payload hash rather than the beacon block root as their reference point in the Eth1 blockchain.


  1. A process called "Justinification". Iykyk ;-)

  2. You can still find old versions, as described in the Preface.

  3. Ethereum 1.0 introduced a fork identifier as defined in EIP-2124 which is similar to Version, but the Eth1 fork ID is not part of the consensus protocol and is used only in the networking protocol.

  4. See Issue 2390 for discussion and a rationale for the current categorisation into constants, presets, and configuration variables.

  5. Fun fact: at the point of the Capella upgrade, out of 567,144 total validators, 322,491 (56.9%) had 0x00 BLS withdrawal credentials and 244,653 (43.1%) had 0x01 Eth1 withdrawal credentials.

  6. It's a blockchain, yo!

  7. This is intended to be a permalink to the Yellow Paper's "Berlin" edition, a pre-Merge version of the YP. At the time of writing, the YP has not been updated for the "London" upgrade and is therefore missing the EIP-1559 field base_fee_per_gas.

  8. This document does not have the full force of an IETF standard. For one thing, it remains a draft (that is now expired); for another, it is an IRTF document, meaning that it comes from a research group rather than being on the IETF standards track. Some context from Brian Carpenter, former IETF chair,

    I gather that you are referring to an issue in draft-irtf-cfrg-bls-signature-04. That is not even an IETF draft; it's an IRTF draft, apparently being discussed in an IRTF Research Group. So it is not even remotely under consideration to become an IETF standard...

  9. I'd have preferred not adding the one there, and using <, here. But it is what it is.

  10. Also not immediately obvious is that there is a subtle issue with committee sizes that was discovered by formal verification, although, given the max supply of ETH it will never be triggered.

  11. There is some discussion around changing this to make voluntary exit messages fork-agnostic in future, but that has not yet been implemented.

  12. From a conversation on the Ethereum Research Discord server.

  13. Unfortunately, the original page, https://hackingresear.ch/discouragement-attacks/, seems to be unavailable now. The link in the text is to archive.org, but their version is a bit broken.

  14. Worth a visit if only to have a chuckle at Jacek's description of uints as "ugly integers".

  15. The use of zip() here is quite Pythonic, but just means that with two lists of equal length we take their elements pairwise in turn.

  16. This is due to change in EIP-7045, scheduled for inclusion in the Deneb upgrade. The change will allow attestations to be included from the whole of current and previous epochs.

  17. EIP-6110 is a potential future upgrade that would allow deposits to be processed more or less instantly, rather than having to go through the Eth1 follow distance and Eth1 voting period as they do now.

  18. I'm simplifying here. LMD GHOST can only consider descendants of the last justified checkpoint at any one time. But the last justified checkpoint can change. LMD GHOST will never consider branches from before the last finalised checkpoint. More on this later.

  19. Appendix C.1 of the Goldfish paper, "No More Attacks on Proof-of-Stake Ethereum?", is a useful overview of known weaknesses of Gasper consensus.

  20. If I've understood correctly. Traces of IMD GHOST are difficult to find these days, and that's probably for the better.

  21. When issues are found with the fork choice it is common for them to be "silently" fixed in client releases before being made public. Paradoxically, changes to the fork choice rule are not consensus breaking and do not usually require simultaneous activation in clients. Hard fork upgrades such as Capella ensure that all node operators have, of necessity, upgraded to the latest software versions, at which point it is safe to publish details of the problems and fixes. This is one good reason for keeping your software up to date, even between mandatory upgrades.

  22. "Ex-post" reorgs occur when a proposer orphans the block in the previous slot by building on an ancestor. "Ex-ante" reorgs occur when a proposer arranges to orphan the next block by submitting its own proposal late. Caspar Schwarz-Schilling made a nice Twitter thread explainer.

  23. It's interesting to see some explanation finding its way back into the spec documents now, after it was all diligently stripped out a while ago. Turns out that people appreciate explanations, I guess.

  24. "Greedy Heaviest-Observed Sub-Tree", named by Sompolinsky and Zohar.

  25. For example, blocks in slot 4939809 and slot 4939815 had almost no votes and yet became canonical. They were almost certainly published late – apparently by the same operator, Legend – but in time for the next proposer to build on them. The late publishing may have been due to a simple clock misconfiguration, or it may have been a deliberate strategy to gain more transaction income post-merge. In either case, it is undesirable.

  26. View-merge, though not by that name, was first proposed for Ethereum in October 2021 in the Ethresear.ch post, Change fork choice rule to mitigate balancing and reorging attacks. See also this Twitter thread for more explanation of view-merge.

  27. To find the section 6.3 that this quote refers to, you need to see the original v1 version of the Goldfish paper. That section is omitted from the later version of the paper.

  28. This scenario doesn't strictly break Casper FFG's "plausible liveness" property as, in principle, voters can safely ignore the LMD GHOST fork choice and switch back to the original chain in order to advance finality. But it does create a conflict between the LMD GHOST fork choice rule and advancing finality.

  29. Some evidence for how challenging the fork choice is to reason about is provided by the formal proofs of correctness created by Roberto Saltini and team at Consensys. One of the proofs alone is 28 pages long when printed.

  30. I am grateful to Mikhail Kalinin for helping me with his very lucid and careful explanation. My explanation is based on his; any errors and over-simplifications are all my own work.

  31. However, there remain some assumptions in the proofs that are over-simplified, such as, "No bound on the amount of attestations that can be included in a block", and, "Honest nodes do not discard any attestation that they receive regardless of how old it is". There is some history of failed proofs around the fork choice that were based on assumptions that were too broad. Hopefully these will stand; I am not equipped to judge.

  32. The algorithm is recursive, although it is not written recursively here.

  33. See section 4.6 of that paper.

  34. Changing time from seconds to slots in the fork choice has been suggested, but never adopted.

  35. FMD vs LMD GHOST is discussed further in the Ethresear.ch article, Saving strategy and FMD GHOST. Later work, such as the Goldfish protocol and RLMD GHOST, takes vote-expiry further.

  36. One such inconsistency remains: attestations are valid in gossip for up to two epochs, but for only 32 slots in blocks.

  37. This would change were we to adopt view-merge. Only attestations that had been processed by specifically designated aggregators would be considered in the fork choice.

  38. store.equivocating_indices is a Python Set. Adding an existing element again is a no-op, so it cannot grow without bounds.

  39. This name comes from Mikhail Kalinin's original article on Ethresear.ch.

  40. And triggered an issue with some client implementations.

  41. For the record, the first merged beacon block on mainnet was at slot 4700013.

Created by Ben Edgington. Licensed under CC BY-SA 4.0. Published 2023-09-29 14:16 UTC. Commit ebfcf50.