Part 2: Technical Overview

The Building Blocks

Committees

Committees are subsets of the full set of active validators that are used to distribute the overall workload.
Beacon committees manage attestations for the consensus protocol; sync committees are discussed elsewhere.
Having 64 beacon committees at each slot is a relic of previous Eth2 designs.
Nonetheless, multiple committees per slot allow us to parallelise attestation aggregation.
Beacon committee membership is random and transient.
A target minimum committee size of 128 protects them against capture.

Introduction

One of the challenges of building a highly scalable consensus protocol is organising the work involved so as not to overwhelm the network or individual nodes.

A goal of the Ethereum 2 Proof of Stake protocol is to achieve economic finality. In the current design (though see below for discussion of single slot finality) this requires us to gather votes from at least two-thirds of the validator set, and we must do this twice: once to justify an epoch, and once again to finalise it.

If the whole validator set were to attest simultaneously, the number of messages on the network would be immense, and the amount of work required of beacon nodes too much for modest hardware. This is where committees help. The work of attesting is divided among subsets of the validator set (committees) and spread across an epoch (6.4 minutes). Each validator participates in only one of the committees.

The Altair spec has two types of committees, beacon committees and sync committees, each having quite a different function. We will focus on beacon committees in this section, and deal with sync committees in a later section.

The current beacon committee structure was strongly influenced by a previous roadmap that included in-protocol data sharding. That design is now deprecated, yet a remnant of it remains in our 64 beacon committees per slot. These were originally intended to map directly to 64 shards as "crosslink committees" but no longer have that function. Nonetheless, beacon committees still serve a useful purpose in parallelising the aggregation of attestations. Whether 64 remains the right number of committees per slot has not been analysed to my knowledge. The trade-off is that fewer beacon committees would reduce the amount of block space needed for aggregate attestations, but would increase the time needed for aggregators to do their work.

In any case, logically, the 64 committees in a slot now act as a single large committee, all voting on the same information.

Committee assignments

Beacon committees are convened to vote exactly once and then disbanded immediately - they are completely transient. By contrast, a sync committee lasts for 256 epochs (a little over 27 hours), and votes 8192 times during that period.

During an epoch, every active validator is a member of exactly one beacon committee, so the committees are all disjoint. At the start of the next epoch, all the existing committees are disbanded and the active validator set is divided into a fresh set of committees.

The composition of the committees for an epoch is fully determined at the start of an epoch by (1) the active validator set for that epoch, and (2) the RANDAO seed value at the start of the previous epoch.

Here we have divided thirty circles and fifteen triangles into five committees at random. The attacking triangles do not have a majority in any committee.

We assign validators to committees randomly in order to defend against a minority attacker being able to capture any single committee. If committee assignments were not random, or were calculable long in advance, then it might be possible for an attacker with a minority of validators to organise them so that they became a supermajority in some committees. They might do this by manipulating the entries and exits of their validators, for example.

It would be improbable for the triangles to gain a 2/3 supermajority in a committee purely by chance. But if the attacker could manipulate the assignments then they might gain a supermajority in some committees, such as the first two here.

The committee sizes used in the Eth2 protocol were chosen to make the takeover of a committee by a minority attacker extremely unlikely. See target committee size, below, for further analysis of this.

The number of committees

The protocol adjusts the total number of committees in each epoch according to the number of active validators. The goals are,

to have the same number of committees per slot throughout the epoch (so the number of committees in an epoch is always a multiple of SLOTS_PER_EPOCH),
to have the largest number of committees that ensures that each committee has at least TARGET_COMMITTEE_SIZE members, and
to have a maximum of MAX_COMMITTEES_PER_SLOT committees per slot.

Clearly, the first goal is not achievable if there are fewer than SLOTS_PER_EPOCH validators – is a committee a committee if nobody is in it? – and the second goal is not achievable if there are fewer than SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE (4096) validators. The protocol could hardly be considered secure with fewer than 4096 validators, so this is not a significant issue in practice.

Every slot in an epoch has the same number of committees, $N$ , up to a maximum of MAX_COMMITTEES_PER_SLOT. Every active validator in the epoch appears in exactly one committee, thus the committees are all disjoint.

The number of committees per slot is calculated by the spec function get_committee_count_per_slot(). This can be simplified for illustrative purposes, given the number $n$ of active validators in the epoch, as

MAX_COMMITTEES_PER_SLOT = 64
SLOTS_PER_EPOCH = 32
TARGET_COMMITTEE_SIZE = 128
def committees_per_slot(n):
    return max(1, min(MAX_COMMITTEES_PER_SLOT, n // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE))

This generates a committee structure that evolves as per the following table as the number of validators grows or shrinks.

$n$ min	$n$ max	Committees / slot	Members per committee	Min	Max
$0$	$31$	$1$	Some committees have zero members	0	1
$32$	$4095$	$1$	${\lceil n / 32 \rceil}$ or ${\lfloor n / 32 \rfloor}$ , which is below `TARGET_COMMITTEE_SIZE`	1	128
$4096$	$262\,143$	${N = \lfloor n / 4096 \rfloor}$	${\lceil n / (32N) \rceil}$ or ${\lfloor n / (32N) \rfloor}$	128	256
$262\,144$	$4\,194\,304$	64	${\lceil n / 2048 \rceil}$ or ${\lfloor n / 2048 \rfloor}$	128	2048
$4\,194\,305$	-	64	Things break Note that this can never happen in practice.	-	-

The numbers at the various thresholds in this table are calculated from the spec constants:

32 is SLOTS_PER_EPOCH.
4096 is SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE. This is the point at which all the committees achieve their target minimum size.
262,144 is SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE * MAX_COMMITTEES_PER_SLOT. We have reached the maximum number of committees per slot (64). We no longer add new committees as the validator set grows, we just make the committees larger.
4,194,304 is SLOTS_PER_EPOCH * MAX_VALIDATORS_PER_COMMITTEE * MAX_COMMITTEES_PER_SLOT. There is not enough Ether in existence to allow us to reach this number of active validators. The limit exists in protocol to enable us to specify a maximum size for the aggregation_bits SSZ Bitlist type in attestations.

Committee index

Each of the $N$ committees within a slot has a committee index from $0$ to $N-1$ . I will call this $i$ in what follows and refer to it as the slot-based index. This slot-based index is included in committees' attestations via the AttestationData object,

class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint

The slot and the committee index within that slot together uniquely identify a committee, and together with the RANDAO value, its membership.

Since all committees in a slot are voting on exactly the same information (source, target, and head block), the index is the only thing that varies between the aggregate attestations produced by the slot's committees (assuming that most of the validators have the same view of the network). This prevents the attestations from the slot's committees being aggregated further, so we will generally end up with $N$ aggregate attestations per slot that we must store in a beacon block.

If it were not for the index then all these $N$ aggregate attestations could be further aggregated into a single aggregate attestation, combining the votes from all the validators voting at that slot.

As a thought experiment we can calculate the potential space savings of doing this. Given a committee size of $k$ and $N$ committees per slot, the current space required for $N$ aggregate Attestation objects is $N * (229 + \lfloor k / 8 \rfloor)$ bytes. If we could remove the committee index from the signed data and combine all of these into a single aggregate Attestation the space required would be $221 + \lfloor kN / 8 \rfloor$ bytes. So we could save $229N - 221$ bytes per block, which is 14.4KB with the maximum 64 committees. This seems nice to have, but would likely make the committee aggregation process more complex.

There is another index that appears when assigning validators to committees in compute_committee(): an epoch-based committee index that I shall call $j$ . The indices $i$ and $j$ are related as $i = \mod(j, N)$ and $j = Ns + i$ where $s$ is the slot number in the epoch.

The size of committees

Validators are divided among the committees in an epoch by the compute_committee() function.

Given the epoch-based index $j$ , compute_committee() returns a slice of the full, shuffled validator set as the committee membership. Within the shuffled list, the index of the first validator in the committee is $\lfloor nj / 32N \rfloor$ , and the index of the last validator in the committee is $\lfloor n(j + 1) / 32N \rfloor - 1$ . So the size of each committee is either $\lfloor n / 32N \rfloor$ or $\lceil n / 32N \rceil$ . In any case, committee sizes differ by at most one.

In simplified form the compute_committee() calculation looks like this. N is the number of committees per slot, n is the total number of active validators, and j is the epoch-based committee index,

def compute_committee_size(n, j, N):
    start = n * j // (32 * N)
    end = n * (j + 1) // (32 * N)
    return end - start

The length of the vector returned will be either n // (32 * N) or 1 + n // (32 * N). The function compute_shuffled_index() is described in the previous section.

Conceptually, to calculate the committee assignments for an epoch, the entire active validator set is shuffled into a list of length $n$ , then sliced into $32N$ committees of as close to the same size as possible. $N$ is the number of committees per slot. The epoch-based committee number, $j$ , is shown.

In the caption to the diagram above I said that this is "conceptually" how committee membership is determined. In practice, due to our use of an oblivious shuffle, the membership of an individual committee can be calculated without shuffling the entire validator set; the result will be the same.

Target committee size

To achieve a desirable level of security, committees need to be larger than a certain size. This makes it infeasible for an attacker to randomly end up with a super-majority in a committee even if they control a significant number of validators. The target here is a kind of lower-bound on committee size. If there are not enough validators for all committees to have at least TARGET_COMMITTEE_SIZE (128) members, then, as a first measure, the number of committees per slot is reduced to maintain this minimum. Only if there are fewer than SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE (4096) validators in total will the committee size be reduced below TARGET_COMMITTEE_SIZE. With so few validators the system would be insecure in any case.

Given a proportion of the validator set controlled by an attacker, what is the probability that the attacker ends up controlling a two-thirds majority in a uniformly randomly selected committee drawn from the full set of validators? Vitalik calculated 111 to be the minimum committee size required to maintain a $2^{-40}$ chance (one-in-a-trillion) of an attacker with one third of the validators gaining by chance a two-thirds majority in any one committee. The value 128 was chosen as being the next higher power of two.

If an attacker has a proportion $p$ of the validator set, then the probability of selecting a committee of $n$ validators that has $k$ or more validators belonging to the attacker is,

\sum_{i=k}^{n} p^i{(1-p)}^{n-i}{n\choose i}

Using this we can calculate that, in fact, 109 members is sufficient to give only a $2^{-40}$ chance of an attacker with one third of the validators gaining a two-thirds majority by chance.

Code for calculating the target committee size

The following is Vitalik's Python code for calculating the probabilities.

def fac(n):
    return n * fac(n-1) if n else 1

def choose(n, k):
    return fac(n) / fac(k) / fac(n-k)

def prob(n, k, p):
    return p**k * (1-p)**(n-k) * choose(n,k)

def probge(n, k, p):
    return sum([prob(n,i,p) for i in range(k,n+1)])

Armed with this we find that the minimum committee size to avoid a two-thirds majority with a $2^{-40}$ probability is 109 rather than 111.

>>> probge(108, 72, 1.0 / 3) < 2**-40
False
>>> probge(109, 73, 1.0 / 3) < 2**-40
True

In any case, a committee size of 128 is very safe against an attacker with 1/3 of the stake:

>>> probge(128, 86, 1.0 / 3)
5.551560731791749e-15

Odds of one-in-trillion may sound like over-engineering, but we must also consider that an attacker might gain some power over the RANDAO, so some safety margin is desirable.

Notwithstanding all of this, in the current beacon chain design the minimum target committee size is irrelevant as committees never operate alone. As long as we have at least 8192 active validators, each slot has multiple committees all operating together and it is their aggregate size that confers security, not the size of any individual committee. As previously mentioned, the current committee design is influenced by an old data sharding model that is now superseded. Nonetheless, individual committees might find a role in future versions of the protocol, so the minimum target size is worth preserving.