Custom Types
The specification defines the following Python custom types "for type hinting and readability". The data types defined here appear frequently throughout the spec; they are the building blocks for everything else.
Each type has a name, an "SSZ equivalent", and a description. SSZ is the encoding method used to pass data between clients, among other things. Here it can be thought of as just a primitive data type.
Throughout the spec, (almost) all integers are unsigned 64-bit numbers, uint64, but this hasn't always been the case.
Regarding "unsigned", there was much discussion around whether Eth2 should use signed or unsigned integers, and eventually unsigned was chosen. As a result, it is critical to preserve the order of operations in some places to avoid inadvertently causing underflows since negative numbers are forbidden.
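To make the order-of-operations point concrete, here is a minimal sketch. The Uint64 class is hypothetical, written only for this illustration (it is not the spec's own integer type); it assumes that any out-of-range intermediate result is treated as an error.

```python
class Uint64(int):
    """Hypothetical checked unsigned 64-bit integer (illustration only)."""

    MAX = 2**64 - 1

    def __new__(cls, value: int) -> "Uint64":
        # Reject anything that does not fit in an unsigned 64-bit integer.
        if not 0 <= value <= cls.MAX:
            raise ValueError(f"value {value} out of range for uint64")
        return super().__new__(cls, value)

    def __add__(self, other: int) -> "Uint64":
        return Uint64(int(self) + int(other))

    def __sub__(self, other: int) -> "Uint64":
        return Uint64(int(self) - int(other))


balance = Uint64(5)
penalty = Uint64(8)
reward = Uint64(10)

# Adding first keeps every intermediate value non-negative: 5 + 10 - 8.
safe = balance + reward - penalty

# Subtracting first passes through 5 - 8, which an unsigned type cannot hold.
try:
    unsafe = balance - penalty + reward
except ValueError as err:
    unsafe = None
    print("underflow:", err)
```

Both expressions are mathematically equal, but only the first is safe when every intermediate value must be a valid unsigned integer.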
And regarding "64-bit", early versions of the spec used other bit lengths than 64 (a "premature optimisation"), but arithmetic integers are now standardised at 64 bits throughout the spec, the only exception being ParticipationFlags, introduced in the Altair upgrade, which has type uint8, and is really a byte type.
Name | SSZ equivalent | Description |
---|---|---|
Slot | uint64 | a slot number |
Epoch | uint64 | an epoch number |
CommitteeIndex | uint64 | a committee index at a slot |
ValidatorIndex | uint64 | a validator registry index |
Gwei | uint64 | an amount in Gwei |
Root | Bytes32 | a Merkle root |
Hash32 | Bytes32 | a 256-bit hash |
Version | Bytes4 | a fork version number |
DomainType | Bytes4 | a domain type |
ForkDigest | Bytes4 | a digest of the current fork data |
Domain | Bytes32 | a signature domain |
BLSPubkey | Bytes48 | a BLS12-381 public key |
BLSSignature | Bytes96 | a BLS12-381 signature |
ParticipationFlags | uint8 | a succinct representation of 8 boolean participation flags |
Transaction | ByteList[MAX_BYTES_PER_TRANSACTION] | either a typed transaction envelope or a legacy transaction |
ExecutionAddress | Bytes20 | Address of account on the execution layer |
Slot
Time is divided into fixed length slots. Within each slot, exactly one validator is randomly selected to propose a beacon chain block. The progress of slots is the fundamental heartbeat of the beacon chain.
Epoch
Sequences of slots are combined into fixed-length epochs.
Epoch boundaries are the points at which the chain can be justified and finalised (by the Casper FFG mechanism). They are also the points at which validator balances are updated, validator committees get shuffled, and validator exits, entries, and slashings are processed. That is, the main state-transition work is performed per epoch, not per slot.
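The relationship between slots and epochs is plain integer arithmetic. The sketch below assumes 32 slots per epoch (the mainnet value, which this section does not state). The first two function names mirror helpers in the spec, but these are simplified versions using bare Python integers; is_first_slot_of_epoch is a hypothetical helper added for illustration.

```python
SLOTS_PER_EPOCH = 32  # assumption: mainnet value, not given in this section


def compute_epoch_at_slot(slot: int) -> int:
    """Return the epoch that contains the given slot."""
    return slot // SLOTS_PER_EPOCH


def compute_start_slot_at_epoch(epoch: int) -> int:
    """Return the first slot of the given epoch."""
    return epoch * SLOTS_PER_EPOCH


def is_first_slot_of_epoch(slot: int) -> bool:
    """Hypothetical helper: True when the slot begins a new epoch."""
    return slot % SLOTS_PER_EPOCH == 0


for slot in (0, 31, 32, 100):
    print(slot, compute_epoch_at_slot(slot), is_first_slot_of_epoch(slot))
```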
Epochs have always felt like a slightly uncomfortable overlay on top of the slot-by-slot progress of the beacon chain, but they are necessitated by Casper FFG finality. There have been proposals to move away from epochs, and there are possible future developments that could allow us to do away with epochs entirely. But, for the time being, they remain.
Fun fact: Epochs were originally called Cycles.
CommitteeIndex
Validators are organised into committees that collectively vote (make attestations) on blocks. Each committee is active at exactly one slot per epoch, but several committees are active at each slot. The CommitteeIndex type is an index into the list of committees active at a slot.
The beacon chain's committee-based design is a large part of what makes it practical to implement while maintaining security. If all validators were active all the time, there would be an overwhelming number of messages to deal with. The random shuffling of committees makes them very hard for an attacker to subvert without a supermajority of stake.
ValidatorIndex
Each validator making a successful deposit is consecutively assigned a unique validator index number that is permanent, remaining even after the validator exits. It is permanent because the validator's balance is associated with its index, so the data needs to be preserved when the validator exits, at least until the balance is withdrawn at an unknown future time.
Gwei
All Ether amounts on the consensus layer are specified in units of Gwei ($10^9$ Wei, $10^{-9}$ Ether). This is basically a hack to avoid having to use integers wider than 64 bits to store validator balances and while doing calculations, since $2^{64}$ Wei is only 18 Ether. Even so, in some places care needs to be taken to avoid arithmetic overflow when dealing with Ether calculations.
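The arithmetic behind this choice is easy to check. The following sketch uses plain Python integers (which do not overflow) to show how much Ether a uint64 can represent in each unit. The constant names and the example balances are illustrative, not taken from the spec.

```python
UINT64_MAX = 2**64 - 1
GWEI_PER_ETHER = 10**9
WEI_PER_ETHER = 10**18

# Largest whole number of Ether representable if a uint64 counts Wei.
max_ether_in_wei_units = UINT64_MAX // WEI_PER_ETHER

# Largest whole number of Ether representable if a uint64 counts Gwei.
max_ether_in_gwei_units = UINT64_MAX // GWEI_PER_ETHER

print(max_ether_in_wei_units)   # 18
print(max_ether_in_gwei_units)  # 18446744073

# Even in Gwei, products of two balances can exceed 64 bits.
# Illustrative values: a 32 Ether balance and a 10 million Ether total.
effective_balance = 32 * GWEI_PER_ETHER
total_balance = 10_000_000 * GWEI_PER_ETHER
product_fits = effective_balance * total_balance <= UINT64_MAX
print(product_fits)  # False
```

This is why intermediate products and the order of multiplications and divisions need attention when working with Gwei values.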
Root
Merkle roots are ubiquitous in the Eth2 protocol. They are a very succinct and tamper-proof way of representing a lot of data, an example of a cryptographic accumulator. Blocks are summarised by their Merkle roots; state is summarised by its Merkle root; the list of Eth1 deposits is summarised by its Merkle root; the digital signature of a message is calculated from the Merkle root of the data structure contained within the message.
Hash32
Merkle roots are constructed with cryptographic hash functions. In the spec, a Hash32 type is used to represent Eth1 block roots (which are also Merkle roots).
I don't know why only the Eth1 block hash has been awarded the Hash32 type: other hashes in the spec remain Bytes32. In early versions of the spec Hash32 was used for all cryptographic hash quantities, but this was changed to Bytes32.
Anyway, it's worth taking a moment in appreciation of the humble cryptographic hash function. The hash function is arguably the single most important algorithmic innovation underpinning blockchain technology, and in fact most of our online lives. Easily taken for granted, but utterly critical in enabling our modern world.
Version
Unlike Ethereum 1 [1], the beacon chain has an in-protocol concept of a version number. It is expected that the protocol will be updated/upgraded from time to time, a process commonly known as a "hard-fork". For example, the upgrade from Phase 0 to Altair took place on the 27th of October 2021, and was assigned its own fork version. Similarly, the upgrade from Altair to Bellatrix was assigned a different fork version.
Version is used when computing the ForkDigest.
DomainType
DomainType is just a cryptographic nicety: messages intended for different purposes are tagged with different domains before being hashed and possibly signed. It's a kind of name-spacing to avoid clashes; probably unnecessary, but considered a best-practice. Ten domain types are defined in Bellatrix.
ForkDigest
ForkDigest is the unique chain identifier, generated by combining information gathered at genesis with the current chain Version identifier.
The ForkDigest serves two purposes.
- Within the consensus protocol to prevent, for example, attestations from validators on one fork (that maybe haven't upgraded yet) being counted on a different fork.
- Within the networking protocol to help to distinguish between useful peers that are on the same chain, and useless peers that are on a different chain. This usage is described in the Ethereum 2.0 networking specification, where ForkDigest appears frequently.
Specifically, ForkDigest is the first four bytes of the hash tree root of the ForkData object containing the current chain Version and the genesis_validators_root which was created at beacon chain initialisation. It is computed in compute_fork_digest().
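As a rough sketch of the calculation just described: for a container with two fields that each fit into a single 32-byte chunk, the SSZ hash tree root should reduce to one SHA-256 over the two chunks, with the four-byte version right-padded with zeros. The code below relies on that simplification instead of a full SSZ library, so treat it as an illustration and not as the spec's implementation. The zeroed genesis_validators_root and the version values are placeholders.

```python
import hashlib


def compute_fork_data_root(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Hand-rolled hash tree root of a two-field ForkData container (sketch)."""
    assert len(current_version) == 4
    assert len(genesis_validators_root) == 32
    # A Bytes4 value occupies one 32-byte chunk, right-padded with zeros.
    version_chunk = current_version + b"\x00" * 28
    # Two chunks make a Merkle tree with a single hashing step.
    return hashlib.sha256(version_chunk + genesis_validators_root).digest()


def compute_fork_digest(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """The fork digest is the first four bytes of the fork data root."""
    return compute_fork_data_root(current_version, genesis_validators_root)[:4]


placeholder_root = bytes(32)  # placeholder, not a real genesis_validators_root
digest_a = compute_fork_digest(bytes.fromhex("00000000"), placeholder_root)
digest_b = compute_fork_digest(bytes.fromhex("01000000"), placeholder_root)
print(digest_a.hex(), digest_b.hex())
```

Changing either the version or the genesis validators root yields a different digest, which is what lets nodes on different forks or different networks tell each other apart.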
Domain
Domain is used when verifying protocol messages signed by validators. To be valid, a message must have been combined with both the correct domain and the correct fork version. It is calculated as the concatenation of the four-byte DomainType and the first 28 bytes of the fork data root.
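The concatenation can be sketched as follows. The fork data root is computed here with a hand-rolled simplification (a single SHA-256 over two 32-byte chunks) in place of a real SSZ library, which is an assumption of this sketch. The domain type value 0x01000000 and the zeroed genesis root are illustrative inputs only.

```python
import hashlib


def fork_data_root(fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Simplified hash tree root of (fork_version, genesis_validators_root)."""
    assert len(fork_version) == 4 and len(genesis_validators_root) == 32
    # Pad the four-byte version to a full 32-byte chunk, then hash both chunks.
    return hashlib.sha256(fork_version + b"\x00" * 28 + genesis_validators_root).digest()


def compute_domain(domain_type: bytes, fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Domain = 4-byte domain type followed by the first 28 bytes of the fork data root."""
    assert len(domain_type) == 4
    return domain_type + fork_data_root(fork_version, genesis_validators_root)[:28]


example_domain_type = bytes.fromhex("01000000")  # illustrative value
placeholder_root = bytes(32)                     # placeholder genesis root

domain_v0 = compute_domain(example_domain_type, bytes.fromhex("00000000"), placeholder_root)
domain_v1 = compute_domain(example_domain_type, bytes.fromhex("01000000"), placeholder_root)
print(domain_v0.hex())
print(domain_v1.hex())
```

The two domains share their first four bytes but differ afterwards, so a signature made under one fork version would not verify under the other.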
BLSPubkey
BLS (Boneh-Lynn-Shacham) is the digital signature scheme used by Eth2. It has some very nice properties, in particular the ability to aggregate signatures. This means that many validators can sign the same message (for example, that they support block X), and these signatures can all be efficiently aggregated into a single signature for verification. The ability to do this efficiently makes Eth2 practical as a protocol. Several other protocols have adopted or will adopt BLS, such as Zcash, Chia, Dfinity and Algorand. We are using the BLS signature scheme based on the BLS12-381 (Barreto-Lynn-Scott) elliptic curve.
The BLSPubkey type holds a validator's public key, or the aggregation of several validators' public keys. This is used to verify messages that are claimed to have come from that validator or group of validators.
In Ethereum 2.0, BLS public keys are elliptic curve points from the BLS12-381 $G_1$ group, thus are 48 bytes long when compressed.
See the section on BLS signatures in part 2 for a more in-depth look at these things.
BLSSignature
As above, we are using BLS signatures over the BLS12-381 elliptic curve in order to sign messages between participants. As with all digital signature schemes, this guarantees both the identity of the sender and the integrity of the contents of any message.
In Ethereum 2.0, BLS signatures are elliptic curve points from the BLS12-381 $G_2$ group, thus are 96 bytes long when compressed.
ParticipationFlags
The ParticipationFlags type was introduced in the Altair upgrade as part of the accounting reforms.
Prior to Altair, all attestations seen in blocks were stored in state for two epochs. At the end of an epoch, finality calculations, and reward and penalty calculations for each active validator, would be done by processing all the attestations for the previous epoch as a batch. This created a spike in processing at epoch boundaries, and led to a noticeable increase in late blocks and attestations during the first slots of epochs. With Altair, participation flags are now used to continuously track validators' attestations, reducing the processing load at the end of epochs.
Three of the eight bits are currently used; five are reserved for future use.
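Setting and testing individual flags is ordinary bit manipulation on the uint8 value. The sketch below uses plain Python integers; the flag index constants (0, 1 and 2 for timely source, target and head) and the helper names follow my reading of the Altair spec and should be treated as assumptions here, since this section does not list them.

```python
# Assumed flag positions within the 8-bit field.
TIMELY_SOURCE_FLAG_INDEX = 0
TIMELY_TARGET_FLAG_INDEX = 1
TIMELY_HEAD_FLAG_INDEX = 2


def add_flag(flags: int, flag_index: int) -> int:
    """Return flags with the bit at flag_index set."""
    return flags | (1 << flag_index)


def has_flag(flags: int, flag_index: int) -> bool:
    """Return True if the bit at flag_index is set."""
    flag = 1 << flag_index
    return flags & flag == flag


participation = 0
participation = add_flag(participation, TIMELY_SOURCE_FLAG_INDEX)
participation = add_flag(participation, TIMELY_HEAD_FLAG_INDEX)

print(format(participation, "08b"))  # 00000101
print(has_flag(participation, TIMELY_TARGET_FLAG_INDEX))  # False
```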
As an aside, it might have been more intuitive if ParticipationFlags were a Bytes1 type, rather than introducing a weird uint8 into the spec. After all, it is not used as an arithmetic integer. However, Bytes1 is a composite type in SSZ, really an alias for Vector[uint8, 1], whereas uint8 is a basic type. When computing the hash tree root of a List type, multiple basic types can be packed into a single leaf, while composite types take a leaf each. This would result in 32 times as many hashing operations for a list of Bytes1. For similar reasons the type of ParticipationFlags was changed from bitlist to uint8.
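The factor of 32 comes from how many values fit in one 32-byte Merkle leaf. The following back-of-the-envelope sketch counts leaves only; it ignores the padding up to the list's maximum length and the length mix-in that real SSZ merkleization applies, and the function names are hypothetical.

```python
BYTES_PER_CHUNK = 32  # size of one Merkle leaf in SSZ


def leaves_for_uint8_list(length: int) -> int:
    """Basic types are packed: 32 one-byte values share each leaf."""
    return (length + BYTES_PER_CHUNK - 1) // BYTES_PER_CHUNK


def leaves_for_bytes1_list(length: int) -> int:
    """Composite types are not packed: each element gets its own leaf."""
    return length


validator_count = 1024  # illustrative list length
packed = leaves_for_uint8_list(validator_count)
unpacked = leaves_for_bytes1_list(validator_count)
print(packed, unpacked, unpacked // packed)  # 32 1024 32
```

Since the hashing work in a Merkle tree grows roughly in proportion to the number of leaves, the unpacked representation costs about 32 times as much.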
Transaction
The Transaction type was introduced in the Bellatrix pre-Merge upgrade to allow for Ethereum transactions to be included in beacon blocks. It appears in ExecutionPayload objects.
Transactions are completely opaque to the beacon chain and are exclusively handled in the execution layer. A note reflecting this is included in the Bellatrix specification:
Note: The Transaction type is a stub which is not final.
The maximum size of a transaction is MAX_BYTES_PER_TRANSACTION which looks huge, but since the underlying type is an SSZ ByteList (which is a List), a Transaction object will only occupy as much space as necessary.
ExecutionAddress
The ExecutionAddress type was introduced in the Bellatrix pre-Merge upgrade to represent the fee recipient on the execution chain for beacon blocks that contain transactions. It is a normal, 20-byte, Ethereum address, and is used in the ExecutionPayload class.
References
- A primer on Merkle roots.
- See also Wikipedia on Merkle Trees.
- I have written an intro to the BLS12-381 elliptic curve elsewhere.
[1] Ethereum 1.0 introduced a fork identifier as defined in EIP-2124 which is similar to Version, but the Eth1 fork ID is not part of the consensus protocol and is used only in the networking protocol.