Part 3: Annotated Specification

Containers

Misc dependencies

`Fork`

class Fork(Container):
    previous_version: Version
    current_version: Version
    epoch: Epoch  # Epoch of latest fork

Fork data is stored in the BeaconState to indicate the current and previous fork versions. The fork version gets incorporated into the cryptographic domain in order to invalidate messages from validators on other forks. The previous fork version and the epoch of the change are stored so that pre-fork messages can still be validated (at least until the next fork). This ensures continuity of attestations across fork boundaries.

Note that this is all about planned protocol forks (upgrades), and nothing to do with the fork-choice rule, or inadvertent forks due to errors in the state transition.

`ForkData`

class ForkData(Container):
    current_version: Version
    genesis_validators_root: Root

ForkData is used only in compute_fork_data_root(). This is used when distinguishing between chains for the purpose of peer-to-peer gossip, and for domain separation. By including both the current fork version and the genesis validators root, we can cleanly distinguish between, say, mainnet and a testnet. Even if they have the same fork history, the genesis validators roots will differ.

Version is the datatype for a fork version number.

`Checkpoint`

class Checkpoint(Container):
    epoch: Epoch
    root: Root

Checkpoints are the points of justification and finalisation used by the Casper FFG protocol. Validators use them to create AttestationData votes, and the status of recent checkpoints is recorded in BeaconState.

As per the Casper paper, checkpoints contain a height, and a block root. In this implementation of Casper FFG, checkpoints occur whenever the slot number is a multiple of SLOTS_PER_EPOCH, thus they correspond to epoch numbers. In particular, checkpoint $N$ is the first slot of epoch $N$ . The genesis block is checkpoint 0, and starts off both justified and finalised.

Thus, the root element here is the block root of the first block in the epoch. (This might be the block root of an earlier block if some slots have been skipped, that is, if there are no blocks for those slots.).

It is very common to talk about justifying and finalising epochs. This is not strictly correct: checkpoints are justified and finalised.

Once a checkpoint has been finalised, the slot it points to and all prior slots will never be reverted.

`Validator`

class Validator(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32  # Commitment to pubkey for withdrawals
    effective_balance: Gwei  # Balance at stake
    slashed: boolean
    # Status epochs
    activation_eligibility_epoch: Epoch  # When criteria for activation were met
    activation_epoch: Epoch
    exit_epoch: Epoch
    withdrawable_epoch: Epoch  # When validator can withdraw funds

This is the data structure that stores most of the information about an individual validator, with only validators' balances and inactivity scores stored elsewhere.

Validators' actual balances are stored separately in the BeaconState structure, and only the slowly changing "effective balance" is stored here. This is because actual balances are liable to change quite frequently (every epoch): the Merkleization process used to calculate state roots means that only the parts that change need to be recalculated; the roots of unchanged parts can be cached. Separating out the validator balances potentially means that only 1/15th (8/121) as much data needs to be rehashed every epoch compared to storing them here, which is an important optimisation.

For similar reasons, validators' inactivity scores are stored outside validator records as well, as they are also updated every epoch.

A validator's record is created when its deposit is first processed. Sending multiple deposits does not create multiple validator records: deposits with the same public key are aggregated in one record. Validator records never expire; they are stored permanently, even after the validator has exited the system. Thus there is a 1:1 mapping between a validator's index in the list and the identity of the validator (validator records are only ever appended to the list).

Also stored in Validator:

pubkey serves as both the unique identity of the validator and the means of cryptographically verifying messages purporting to have been signed by it. The public key is stored raw, unlike in Eth1, where it is hashed to form the account address. This allows public keys to be aggregated for verifying aggregated attestations.
Validators actually have two private/public key pairs, the one above used for signing protocol messages, and a separate "withdrawal key". withdrawal_credentials is a commitment generated from the validator's withdrawal key so that, at some time in the future, a validator can prove it owns the funds and will be able to withdraw them. There are two types of withdrawal credential currently defined, one corresponding to BLS keys, and one corresponding to standard Ethereum ECDSA keys.
effective_balance is a topic of its own that we've touched upon already, and will discuss more fully when we look at effective balances updates.
slashed indicates that a validator has been slashed, that is, punished for violating the slashing conditions. A validator can be slashed only once.
The remaining values are the epochs in which the validator changed, or is due to change state.

A detailed explanation of the stages in a validator's lifecycle is here, and we'll be covering it in detail as we work through the beacon chain logic. But, in simplified form, progress is as follows:

A 32 ETH deposit has been made on the Ethereum 1 chain. No validator record exists yet.
The deposit is processed by the beacon chain at some slot. A validator record is created with all epoch fields set to FAR_FUTURE_EPOCH.
At the end of the current epoch, the activation_eligibility_epoch is set to the next epoch.
After the epoch activation_eligibility_epoch has been finalised, the validator is added to the activation queue by setting its activation_epoch appropriately, taking into account the per-epoch churn limit and MAX_SEED_LOOKAHEAD.
On reaching activation_epoch the validator becomes active, and should carry out its duties.
At any time after SHARD_COMMITTEE_PERIOD epochs have passed, a validator may request a voluntary exit. exit_epoch is set according to the validator's position in the exit queue and MAX_SEED_LOOKAHEAD, and withdrawable_epoch is set MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after that.
From exit_epoch onward the validator is no longer active. There is no mechanism for exited validators to rejoin: exiting is permanent.
After withdrawable_epoch, the validator's balance can in principle be withdrawn, although there is no mechanism for doing this for the time being.

The above does not account for slashing or forced exits due to low balance.

`AttestationData`

class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint

The beacon chain relies on a combination of two different consensus mechanisms: LMD GHOST keeps the chain moving, and Casper FFG brings finalisation. These are documented in the Gasper paper. Attestations from (committees of) validators are used to provide votes simultaneously for each of these consensus mechanisms.

This container is the fundamental unit of attestation data. It provides the following elements.

slot: each active validator should be making exactly one attestation per epoch. Validators have an assigned slot for their attestation, and it is recorded here for validation purposes.
index: there can be several committees active in a single slot. This is the number of the committee that the validator belongs to in that slot. It can be used to reconstruct the committee to check that the attesting validator is a member. Ideally, all (or the majority at least) of the attestations in a slot from a single committee will be identical, and can therefore be aggregated into a single aggregate attestation.
beacon_block_root is the validator's vote on the head block for that slot after locally running the LMD GHOST fork-choice rule. It may be the root of a block from a previous slot if the validator believes that the current slot is empty.
source is the validator's opinion of the best currently justified checkpoint for the Casper FFG finalisation process.
target is the validator's opinion of the block at the start of the current epoch, also for Casper FFG finalisation.

This AttestationData structure gets wrapped up into several other similar but distinct structures:

Attestation is the form in which attestations normally make their way around the network. It is signed and aggregatable, and the list of validators making this attestation is compressed into a bitlist.
IndexedAttestation is used primarily for attester slashing. It is signed and aggregated, with the list of attesting validators being an uncompressed list of indices.
PendingAttestation. In Phase 0, after having their validity checked during block processing, PendingAttestations were stored in the beacon state pending processing at the end of the epoch. This was reworked in the Altair upgrade and PendingAttestations are no longer used.

`IndexedAttestation`

class IndexedAttestation(Container):
    attesting_indices: List[ValidatorIndex, MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is one of the forms in which aggregated attestations – combined identical attestations from multiple validators in the same committee – are handled.

Attestations and IndexedAttestations contain essentially the same information. The difference being that the list of attesting validators is stored uncompressed in IndexedAttestations. That is, each attesting validator is referenced by its global validator index, and non-attesting validators are not included. To be valid, the validator indices must be unique and sorted, and the signature must be an aggregate signature from exactly the listed set of validators.

IndexedAttestations are primarily used when reporting attester slashing. An Attestation can be converted to an IndexedAttestation using get_indexed_attestation().

`PendingAttestation`

class PendingAttestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    inclusion_delay: Slot
    proposer_index: ValidatorIndex

PendingAttestations were removed in the Altair upgrade and now appear only in the process for upgrading the state during the fork. The following is provided for historical reference.

Prior to Altair, Attestations received in blocks were verified then temporarily stored in beacon state in the form of PendingAttestations, pending further processing at the end of the epoch.

A PendingAttestation is an Attestation minus the signature, plus a couple of fields related to reward calculation:

inclusion_delay is the number of slots between the attestation having been made and it being included in a beacon block by the block proposer. Validators are rewarded for getting their attestations included in blocks, but the reward used to decline in inverse proportion to the inclusion delay. This incentivised swift attesting and communicating by validators.
proposer_index is the block proposer that included the attestation. The block proposer gets a micro reward for every validator's attestation it includes, not just for the aggregate attestation as a whole. This incentivises efficient finding and packing of aggregations, since the number of aggregate attestations per block is capped.

Taken together, these rewards are designed to incentivise the whole network to collaborate to do efficient attestation aggregation (proposers want to include only well-aggregated attestations; validators want to get their attestations included, so will ensure that they get well aggregated).

With Altair, this whole mechanism has been replaced by ParticipationFlags.

`Eth1Data`

class Eth1Data(Container):
    deposit_root: Root
    deposit_count: uint64
    block_hash: Hash32

Proposers include their view of the Ethereum 1 chain in blocks, and this is how they do it. The beacon chain stores these votes up in the beacon state until there is a simple majority consensus, then the winner is committed to beacon state. This is to allow the processing of Eth1 deposits, and creates a simple "honest-majority" one-way bridge from Eth1 to Eth2. The 1/2 majority assumption for this (rather than 2/3 for committees) is considered safe as the number of validators voting each time is large: EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH = 64 * 32 = 2048.

deposit_root is the result of the get_deposit_root() method of the Eth1 deposit contract after executing the Eth1 block being voted on - it's the root of the (sparse) Merkle tree of deposits.
deposit_count is the number of deposits in the deposit contract at that point, the result of the get_deposit_count method on the contract. This will be equal to or greater than (if there are pending unprocessed deposits) the value of state.eth1_deposit_index.
block_hash is the hash of the Eth1 block being voted for. This doesn't have any current use within the Eth2 protocol, but is "too potentially useful to not throw in there", to quote Danny Ryan.

`HistoricalBatch`

class HistoricalBatch(Container):
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]

This is used to implement part of the double batched accumulator for the past history of the chain. Once SLOTS_PER_HISTORICAL_ROOT block roots and the same number of state roots have been accumulated in the beacon state, they are put in a HistoricalBatch object and the hash tree root of that is appended to the historical_roots list in beacon state. The corresponding block and state root lists in the beacon state are circular and just get overwritten in the next period. See process_historical_roots_update().

`DepositMessage`

class DepositMessage(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei

The basic information necessary to either add a validator to the registry, or to top up an existing validator's stake.

pubkey is the unique public key of the validator. If it is already present in the registry (the list of validators in beacon state) then amount is added to its balance. Otherwise a new Validator entry is appended to the list and credited with amount.

See the Validator container for more on withdrawal_credentials.

There are two protections that DepositMessages get at different points.

DepositData is included in beacon blocks as a Deposit, which adds a Merkle proof that the data has been registered with the Eth1 deposit contract.
When the containing beacon block is processed, deposit messages are stored, pending processing at the end of the epoch, in the beacon state as DepositData. This includes the pending validator's BLS signature so that the authenticity of the DepositMessage can be verified before a validator is added.

`DepositData`

class DepositData(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei
    signature: BLSSignature  # Signing over DepositMessage

A signed DepositMessage. The comment says that the signing is done over DepositMessage. What actually happens is that a DepositMessage is constructed from the first three fields; the root of that is combined with DOMAIN_DEPOSIT in a SigningData object; finally the root of this is signed and included in DepositData.

`BeaconBlockHeader`

class BeaconBlockHeader(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body_root: Root

A standalone version of a beacon block header: BeaconBlocks contain their own header. It is identical to BeaconBlock, except that body is replaced by body_root. It is BeaconBlock-lite.

BeaconBlockHeader is stored in beacon state to record the last processed block header. This is used to ensure that we proceed along a continuous chain of blocks that always point to their predecessor¹. See process_block_header().

The signed version is used in proposer slashings.

`SyncCommittee`

class SyncCommittee(Container):
    pubkeys: Vector[BLSPubkey, SYNC_COMMITTEE_SIZE]
    aggregate_pubkey: BLSPubkey

Sync committees were introduced in the Altair upgrade to support light clients to the beacon chain protocol. The list of committee members for each of the current and next sync committees is stored in the beacon state. Members are updated every EPOCHS_PER_SYNC_COMMITTEE_PERIOD epochs by get_next_sync_committee().

Including the aggregate_pubkey of the sync committee is an optimisation intended to save light clients some work when verifying the sync committee's signature. All the public keys of the committee members (including any duplicates) are aggregated into this single public key. If any signatures are missing from the SyncAggregate, the light client can "de-aggregate" them by performing elliptic curve subtraction. As long as more than half of the committee contributed to the signature, then this will be faster than constructing the aggregate of participating members from scratch. If less than half contributed to the signature, the light client can start instead with the identity public key and use elliptic curve addition to aggregate those public keys that are present.

`SigningData`

class SigningData(Container):
    object_root: Root
    domain: Domain

This is just a convenience container, used only in compute_signing_root() to calculate the hash tree root of an object along with a domain. The resulting root is the message data that gets signed with a BLS signature. The SigningData object itself is never stored or transmitted.

It's a blockchain, yo!↩