Part 2: Technical Overview
- LMD GHOST is a fork choice rule used by nodes to determine the best chain.
- It assigns weights to branches based on votes from all active validators.
- LMD GHOST does not provide finality, but does support a confirmation rule.
- Slashing is used to solve the "nothing at stake" problem.
In this section we will consider LMD GHOST in isolation, ignoring completely the Casper FFG finality overlay1. LMD GHOST is the essence of a consensus mechanism in itself – it is a fork choice rule, just as the heaviest chain rule in Nakamoto consensus is – and has its own sets of properties and trade-offs.
For now, we will be considering only the "how it works" part of the story - the happy flow. We will look at the "how it can go wrong" part later, in the Issues and Fixes section.
The name LMD GHOST comprises two acronyms, for "Latest Message Driven", and "Greedy Heaviest-Observed Sub-Tree". We'll unpack GHOST first, then LMD.
The GHOST protocol comes from a 2013 paper by Sompolinsky and Zohar about how to safely improve transaction throughput on Bitcoin. Increasing the block size, or decreasing the interval between blocks, makes the chain more susceptible to forking in a network that has uncontrolled latency (delays), like the Internet. Chains that fork have more reorgs, and reorgs are bad for transaction security. Replacing Bitcoin's longest chain fork choice rule with the GHOST fork choice was shown to be more stable in the presence of latency, allowing block production to be more frequent.
The name GHOST stands for "Greedy Heaviest-Observed Sub-Tree", which describes how the algorithm works. We will expand on that below. In short, GHOST's fork choice doesn't follow the heaviest chain, but the heaviest subtree. It recognises that a vote for a block is not only for that block, but implicitly a vote for each of its ancestors as well, so whole subtrees have an associated weight.
Bitcoin never adopted GHOST, and (despite that paper stating otherwise) neither did Ethereum under proof of work, although it had originally been planned, and the old proof of work "uncle" rewards were related to it.
The GHOST protocol that we are using in Ethereum's proof of stake has been extended to be able to handle attestations. In proof of work, the voters are the block proposers. They vote for a branch by building their own block on top of it. In our proof of stake protocol, all validators are voters, and each casts a vote for its view of the network once every 6.4 minutes on average by publishing an attestation. So, under PoS, we have a lot more information available about participants' views of the network.
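The 6.4-minute figure comes straight from the chain's timing parameters:

```python
# Mainnet timing constants from the consensus spec.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

# Each validator attests exactly once per epoch, so the interval
# between a validator's attestations is one epoch.
seconds_per_epoch = SECONDS_PER_SLOT * SLOTS_PER_EPOCH  # 384 seconds
minutes_per_epoch = seconds_per_epoch / 60              # 6.4 minutes
```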
This is what it means to be "message driven", giving us the MD in LMD. The fork choice is driven not by blocks added by proposers, but by messages (attestations, votes) published by all validators.
The "L" stands for "latest": LMD GHOST takes into account only the latest message from each validator, that is, the most recent attestation that we have received from that validator. All a validator's earlier messages are discarded, but its latest vote is retained and has weight indefinitely.
As a side note, other versions of message-driven GHOST are available. Vitalik initially favoured IMD, "Immediate Message Driven", GHOST. As far as I can tell2, this retains all attestations indefinitely, and the fork choice chooses based on whatever attestation was current at the time. Then there's FMD, "Fresh Message Driven", GHOST, which considers attestations only from the current and previous epochs. And RLMD, "Recent Latest Message Driven", GHOST which remembers validators' latest attestations only for a parameterisable number of epochs.
LMD GHOST, above all, is a fork choice rule. Given a tree of blocks and a collection of votes, LMD GHOST will tell me which block I should consider to be the best head, thereby giving me a linear view of history from that head all the way back to genesis. The decision is based on my local view of the chain – the messages (blocks and attestations) that my node has received. Remember, there is no "God's eye view": my local view is all that I have to work with, and it may well differ from other nodes' local views. The idea is that honest validators will build their blocks on the best head that they see, and will in turn cast their votes according to the best head block they see.
Some things that a good fork choice rule will deliver are as follows.3
- Majority honest progress: if over 50% of nodes build blocks by following the fork choice rule, the chain progresses and is (exponentially) unlikely to revert older blocks.
- Stability: the fork choice is a good prediction of the future fork choice.
- Manipulation resistance: even if an attacker captures a temporary supermajority of some small set of participants, the attacker is unlikely to be able to revert blocks.
All these points are interrelated. Stability, in particular, is important for block proposers. When I propose a block, I want to be as sure as I possibly can be that the block will remain in the chain forever. Equivalently, that it will not be reorged out. Finding the head block means finding the block that most likely will make my new block the head in the views of other nodes when I build on it.
We'll divide our exploration of how LMD GHOST works in two. We'll look first at the LMD part, latest messages, and then at the GHOST part, finding the head.
Messages, in this context, are the head block votes found in attestations.
In an attestation's data, the head vote is the beacon_block_root field.

```python
class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint
```
Every honest validator makes an attestation exactly once per epoch, containing its vote for the best head block in its view at the moment that it attests. Within each epoch, the validator set is split up so that only 1/32 of the validators are attesting at each slot. (The index field in this structure relates to the attesting validators being further divided into up to 64 committees at each slot for operational reasons, but this is not relevant to the mechanics of LMD GHOST and we shall ignore it.)
Nodes receive attestations both directly, via attestation gossip, and indirectly, contained in blocks. In principle, a node could receive attestations through other means – I could type an attestation in at the keyboard if I wished – but in practice votes propagate only via attestation gossip and within blocks.
On receiving an attestation, by whatever means, the node calls the fork choice's on_attestation() handler. Before proceeding, the handler performs some basic validity checks on the attestation:
- Is it too old?
  - It must be from the current or previous epoch. See Attestation consideration delay.
- Is it too new?
  - It must be from no later than the previous slot. See Attestation recency.
- Do we know about the block that it is voting for, the beacon_block_root?
  - We must have received that block already. If not we might try to fetch it from a peer.
- Is its signature correct?
  - Validators sign attestations and are accountable for them.
- Is the attestation slashable?
  - We must ignore attestations that conflict with other attestations. See Attestation equivocation.
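A simplified sketch of these checks might look as follows. The field names and helper structure here are invented for illustration; the real logic lives in the spec's on_attestation() handler, and signature verification is omitted:

```python
from dataclasses import dataclass

@dataclass
class Vote:
    validator_index: int
    slot: int            # the slot the attestation was made in
    epoch: int           # the epoch containing that slot
    block_root: str      # the head block being voted for

def is_considerable(vote: Vote, current_slot: int, current_epoch: int,
                    known_blocks: set, equivocators: set) -> bool:
    """Return True if the vote passes the basic validity checks."""
    # Too old? Must be from the current or previous epoch.
    if vote.epoch < current_epoch - 1:
        return False
    # Too new? Must be from no later than the previous slot.
    if vote.slot >= current_slot:
        return False
    # Do we know the block it votes for?
    if vote.block_root not in known_blocks:
        return False  # (a real node might try to fetch it from a peer)
    # Has the validator equivocated? If so, ignore its votes.
    if vote.validator_index in equivocators:
        return False
    # (Signature verification omitted in this sketch.)
    return True
```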
After passing these and some other checks, the attestation is considered for insertion into the node's Store, which is its repository of information about the state of the chain: its view. This is done in update_latest_messages(). If we don't already have a head block vote for the validator, then this one is stored for it. If we already have a head block vote for the validator, then this one replaces it if it is more recent.
Over time, then, a node's Store builds up a list containing a single latest vote for each validator that it has heard from.
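A minimal sketch of this latest-message bookkeeping, along the lines of the spec's update_latest_messages(); the dict-based Store here is a simplification, not the spec's actual structure:

```python
# Store latest messages as a map: validator index -> (epoch, block root).
def update_latest_messages(latest_messages: dict, validator_index: int,
                           epoch: int, block_root: str) -> None:
    """Record a vote if it is the first, or newer than the stored one."""
    current = latest_messages.get(validator_index)
    if current is None or epoch > current[0]:
        latest_messages[validator_index] = (epoch, block_root)

store = {}
update_latest_messages(store, 7, epoch=10, block_root="0xaa")  # first vote, stored
update_latest_messages(store, 7, epoch=12, block_root="0xbb")  # newer, replaces
update_latest_messages(store, 7, epoch=11, block_root="0xcc")  # older, ignored
```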
Note that a vote can be inserted in the store only if we heard about it in the same epoch or the epoch after it was made. However, once it is in the store it remains there indefinitely, and continues to contribute to the fork choice until it is updated with a more recent vote. This is a key difference between LMD GHOST and, say, the Goldfish protocol, or RLMD GHOST.
In essence, the LMD GHOST fork choice rule is a function from a node's Store to a head block. As we've seen, a node's Store is its view of the world: all the relevant information it has received that could affect the fork choice. For the pure LMD GHOST algorithm we are looking at here, the relevant parts of the Store are the following.
- The block tree, which is just a list of blocks. The blocks' parent links join them logically into a tree.
- The list of latest messages (votes) from validators.
- The validators' effective balances (based on some state) as these provide the weights used in the algorithm.
The goal of the GHOST algorithm is to select a single leaf block from the given block tree, where a leaf is a block without any descendants. This will be our chosen head block.
We are going to assume that all the blocks in our block tree descend from a single root block. In a pure GHOST algorithm, that would be the genesis block: by definition, all blocks must descend from genesis. In our full consensus implementation, that root block will be the last justified checkpoint block. For our present purposes, all we need to know is that the GHOST algorithm starts from a given block, and ignores all blocks not descended from that block.
The first thing we do is calculate the "weight" of each branch in the tree. A branch's weight is its score, in some sense.
The weight of a vote is the effective balance of the validator that made the vote. This will usually be 32 ETH, the maximum effective balance, but could be less. A vote, recall, is the latest message we have from that validator.
The weight of a branch is the weight of the votes for the block that roots it, plus the weights of that block's child branches. By including the weights of child branches, we are acknowledging that a vote for a block is also a vote for each of the ancestors of that block. The weight of a branch consisting of only a leaf block will be just the weight of the votes for that block.
Some obvious relationships apply between the weights, $W$, of branches, and $w$, the weights of the votes for blocks.
- For a branch comprising only a leaf block, $n$, $W_n = w_n$.
- The weight of a branch is the weight of the votes for the block at its root plus the sum of the weights of all branches below it. So, for a block $B$ with child blocks $C_1, \ldots, C_k$, $W_B = w_B + \sum_i W_{C_i}$.
- The weight of a branch is the sum of the weights of all votes for blocks in the subtree that forms the branch.
Since votes always carry a positive weight, no block will have a greater weight than the root block, and the root block's weight is the sum of the weights of all the votes for blocks in the tree. Every validator has at most one latest message – that is, one vote – so that weight is bounded above by the total effective balance of all active validators.4
Once we have the weight of each branch or subtree, the algorithm proceeds recursively. Given a block, we select the heaviest branch descending from it. We then repeat the process with the block at that branch's root. If a block has only one child, then the choice is obvious and there is no work to do. If two or more branches have equal weight, we arbitrarily choose the branch rooted at the child block with the highest block hash value. Eventually we will reach a block with no children. This is a leaf block and will be the output of the algorithm.
Unpacking the GHOST name, we see that the algorithm: is Greedy, meaning that it takes the Heaviest-Observed branch immediately, without looking further; and deals with Sub-Trees, the weight of a branch being the sum of all the weights of votes for blocks in the subtree.
Here's a simple example. In the diagrams, I've distinguished between (1) the weight of the votes for a particular block, which are the numbers attached to each block, and (2) the weights of branches, which I've added to the lines joining the blocks to their parents.
First, from the latest messages in the Store, we calculate the weight of votes for each block in the tree.
Second, from the weights of votes for each block, we can calculate the weight of each branch or subtree.
Third, we move recursively through the blocks at the roots of the subtrees, always choosing the branch or subtree with the highest weight. Eventually, we will reach a leaf, which is our chosen head block.
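The three steps above can be sketched in code. This is a minimal illustration with an invented tree, block names, and vote weights; the spec's get_head() and get_weight() include further details such as proposer boost:

```python
def subtree_weight(block: str, children: dict, direct_votes: dict) -> int:
    """Weight of the branch rooted at `block`: the votes for the block
    itself plus the weights of all its child branches, recursively."""
    return direct_votes.get(block, 0) + sum(
        subtree_weight(child, children, direct_votes)
        for child in children.get(block, []))

def get_head(root: str, children: dict, direct_votes: dict) -> str:
    """Walk from the root, taking the heaviest subtree at each fork.
    Ties are broken by the highest block hash (the name stands in
    for the hash here)."""
    block = root
    while children.get(block):
        block = max(children[block],
                    key=lambda c: (subtree_weight(c, children, direct_votes), c))
    return block

# A small tree:  A -> B -> D
#                  \-> C -> E
children = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
direct_votes = {"B": 3, "C": 2, "D": 1, "E": 5}
# Branch B has weight 3 + 1 = 4; branch C has weight 2 + 5 = 7.
# So we descend into C, then to its only child E: the head is E,
# even though B has more direct votes than C.
head = get_head("A", children, direct_votes)  # "E"
```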
If, say, two subtrees rooted at sibling blocks had equal weight, we would choose according to which of the two blocks had the greater block hash - this is a completely arbitrary tie-break mechanism.
The code that implements all this in the specification is get_head(), which walks up through the tree from the root, taking the heaviest branch at each fork. The code for calculating the weight of a subtree is get_weight(). If you look at get_weight(), you'll find that it is more complex than what we've covered here, due to something called "proposer boost". We shall discuss proposer boost in detail in the Issues and Fixes section.
Having seen how the GHOST protocol works, it's perhaps easier to gain some intuition for why we prefer it to a longest chain rule. The occurrence of forks suggests that block propagation time has become of similar order to, or exceeds, block production intervals (slots). In short, not all validators are seeing all the blocks in time to attest to them or to build on them.5
In these circumstances, we want to take advantage of the maximum amount of information available. Votes for two different children of the same parent block should be taken as confirmation that all those validators favour the parent block's branch, even if there is disagreement about the child blocks. GHOST achieves this simply by allowing a vote for a child block to add weight to all of its ancestors. Thus, when faced with a choice, we favour the branch with the greatest total support from validators. I've illustrated this in the diagram above: one branch is favoured over its sibling, even though the sibling's root block has more direct votes, since more validators overall made latest votes for blocks in the favoured branch.
The longest chain rule discards all this information, and can allow a branch to win even if only a minority of validators has been working on it.
In proof of work, the only data available for input into the fork choice comes from block production, which represents the view of a single miner at the time of publishing the block.
In Ethereum's proof of stake protocol, we have vastly more information available to us, in the form of head block votes from 1/32 of the validator set every 12 seconds, in addition to the block proposer's view.
In principle, all this extra information ought to allow us to be very sure, very quickly, about whether blocks will remain canonical or are in danger of being reverted. Proof of work chains tend to use a heuristic around the number of confirmations that a block has received. That is, blocks are assumed to be exponentially less likely to be reverted the more blocks have been built on top of them. This is generally true, but can fail badly in high-latency environments (such as an attacker making a longer chain in secret).
Proof of stake's ultimate answer to this is finality, which we shall look at in the next section. LMD GHOST on its own does not provide finality. However, it's interesting to consider whether there is some heuristic analogous to proof of work's confirmations rule that we can use, and it turns out that there is. In fact, it improves on the PoW confirmation rule by giving a "yes/no" statement about the safety of a block, rather than a probability.
The details of the confirmation rule are described by Aditya Asgaonkar in a blog post and an accompanying paper. The general idea is that, for a block $b$, we calculate two values, $Q^n_b$ and $Q_{\min}$. When $Q^n_b > Q_{\min}$, and the network remains close to synchronous, then that block is confirmed as "safe". We can be fully confident that it will not be reorged.

The quantity $Q^n_b$ is defined as the weight of the subtree rooted at $b$ at slot $n$ divided by the total weight of votes cast since $b$ was produced. Simply put, if, in the slots since $b$ was proposed, 80% of validators voted for $b$ or a descendant of $b$, then $Q^n_b$ would be $0.8$.

$Q_{\min}$ is defined as $\frac{1}{2} + \beta$, where $\beta$ is the fraction of stake we believe is controlled by an adversary. This fraction is unknown, but is assumed to be less than one third, otherwise we have big problems.

Now, if $Q^n_a > Q_{\min}$ for $b$ and all its (non-finalised) ancestors $a$, then $b$ is considered to be confirmed, or "safe".
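As a toy illustration of these quantities – the numbers are invented, the threshold follows the simple-majority-plus-adversarial-fraction reasoning described in the text, and the full rule in the paper has further subtleties:

```python
def q_value(subtree_weight: float, total_votes_weight: float) -> float:
    """Fraction of the votes cast since block b was produced that
    support b's subtree."""
    return subtree_weight / total_votes_weight

def q_min(beta: float) -> float:
    """Safety threshold: a simple majority plus a margin for
    adversarial validators swapping their votes to another branch."""
    return 0.5 + beta

# 80% of the vote weight since b went to b or a descendant of b,
# and we assume the adversary controls at most 20% of the stake.
q_b = q_value(80.0, 100.0)   # 0.8
threshold = q_min(0.2)       # 0.7
is_safe = q_b > threshold    # confirmed, assuming synchrony holds
```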
The idea is that, once the branch up to block $b$ has accumulated a simple majority of the available voting weight, then all honest validators will continue to vote for that branch, so it will maintain its majority indefinitely. One way this can fail is if dishonest validators swap their votes to another branch, giving that branch a majority instead. That's why we add a fraction $\beta$ to the basic majority safety parameter of $\frac{1}{2}$, so that, even if all the dishonest validators swap branches, our branch maintains a majority. The other way this can fail is if the network begins to suffer delays or partitions (loses synchrony), so that some honest validators cannot see other honest validators' votes, hence the reliance on the network remaining sufficiently synchronous.
While this seems very intuitive, there are some important subtleties and complications related to the integration with Casper FFG that modify the confirmation rule, so anyone dealing with this should consult the full paper (which remains a draft). In addition, proposer boost makes it easier for an adversarial block proposer to reorg a block, so we need to account for that as well.
The table below summarises the differences between the confirmation rule and finality for a block.
| | Confirmation rule | Finality |
|---|---|---|
| Time | One slot in ideal circumstances, less than a minute more generally. | At least two epochs / 13 minutes. |
| Assumptions | Network remains synchronous until finality. | No synchrony assumption. |
| Breakage | A confirmed block can be reorged if the network does not remain synchronous. | A conflicting block can be finalised if more than 1/3 of the validators commit a slashable action. |
The confirmation rule is not yet implemented in client software as I write, but should be available in due course via the safe block specification.
One of the ways cryptoeconomic systems secure themselves is by rewarding good behaviour and penalising bad behaviour. In our implementation of LMD GHOST, both proposers and attesters are rewarded in different ways for accurately finding the head of the chain.
The incentive for a block proposer to build on the best head is clear. If it doesn't, then there's a good chance that its block will not be included in the eventual canonical chain – it will be orphaned – in which case the proposer will not receive any of its block rewards. This is an implicit incentive rather than explicit; miners in proof of work are in a similar situation.
By contrast, validators are directly rewarded for voting accurately. When a validator makes an accurate head vote, and its attestation is included in a block in the very next slot, it receives a micro reward. A perfectly performing validator will gain about 22% of its total protocol rewards from making accurate head votes. Proposers, in turn, are incentivised to include such attestations in blocks as they receive a proportionate micro reward for each one they manage to get in.
A vote is accurate if it matches what ends up in the canonical chain. So if the validator voted for a block at the previous slot, and the canonical chain has a matching block there, then it receives its reward. If it voted for a block from a previous slot, indicating a skipped slot, and the canonical chain has skipped slots between that block and the present, then it will also receive its reward.
There is no penalty if a validator makes an inaccurate head vote, or if its head vote is not included on chain within one slot. Getting the head vote right is difficult when the beacon chain is under stress, and there are late and missing blocks. This is often no fault of the validator itself, and it was felt unfair to penalise validators in such circumstances.6
One of the big breakthroughs in proof of stake design was the adoption of slashing as a way around the "nothing at stake" problem. The problem, in essence, is that under proof of stake it is almost costless for a validator to equivocate by publishing multiple contradictory messages.
The solution turns out to be rather elegant. We detect when a validator has equivocated and punish it. The punishment is called "slashing", and entails removing a proportion of the validator's stake, and ejecting the validator from the protocol. Since validators digitally sign their messages, finding two contradictory signed messages is absolute proof of misbehaviour, so we can slash with confidence.
The name "slashing" comes from Vitalik's Slasher algorithm from early 2014. That was a very early proposal for solving the nothing at stake problem. Our current design doesn't look much like Slasher, but some things have carried over, not least the name.
When it is a validator's turn to produce a block in a particular slot, the validator is supposed to run the fork choice rule in order to decide which existing block it will build its own block on. Its goal is to identify the fork that is most likely, based on the evidence it has, to be the one that eventually becomes canonical. That is, the one that the whole set of correct validators will converge on.
However, why choose? Under proof of stake – unlike under proof of work – it is almost costless for validators to produce blocks. Therefore, a good strategy would seem to be to propose multiple blocks, one built on each possible head: then at least one of my blocks is guaranteed to become part of the eventual canonical chain.
This is undesirable because it prolongs any fork and prevents the network from converging on a linear history. Users of the chain may not be able to work out which fork is correct, and that makes them vulnerable to double spend attacks, the very thing we wish to avoid.
This illustrates the nothing at stake problem. As outlined above, the solution is to detect the two contradictory blocks and slash the validator that proposed them.
Proposer equivocation is not detected in-protocol, but relies on a third party constructing a proof of equivocation in the form of a ProposerSlashing object. The proof comprises just two signed beacon block headers: that's enough to prove that the proposer signed off on two different blocks in the same slot. A subsequent block proposer will include the proof in a block (and be well rewarded for it), and the protocol will slash the offending validator's balance and eject it from the active validator set.
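A sketch of the equivocation check itself. The container here is heavily simplified – the real SignedBeaconBlockHeader carries more fields, and signature verification is omitted:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignedBeaconBlockHeader:
    proposer_index: int
    slot: int
    body_root: str   # stands in for the full header contents
    signature: str

def is_slashable_proposal(h1: SignedBeaconBlockHeader,
                          h2: SignedBeaconBlockHeader) -> bool:
    """Two distinct signed headers by the same proposer for the same
    slot are proof of equivocation."""
    return (h1.proposer_index == h2.proposer_index
            and h1.slot == h2.slot
            and h1 != h2)

a = SignedBeaconBlockHeader(42, 1000, "0x01", "sig1")
b = SignedBeaconBlockHeader(42, 1000, "0x02", "sig2")
c = SignedBeaconBlockHeader(42, 1001, "0x03", "sig3")
# a vs b: same proposer, same slot, different blocks -> slashable
# a vs c: different slots -> not slashable
```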
Similarly, when it is a validator's turn to publish an attestation, it is supposed to run its fork choice rule and vote for the head block that it thinks is best. The issue here is that an attacker could make multiple contradictory attestations in order to provoke or prolong forks and prevent the network from converging on a single chain. Even a mostly-honest validator might be tempted to make two attestations for a borderline late block: one voting for the block, one voting for an empty slot. If it weren't punishable behaviour, this would hedge the chance of voting the wrong way and missing out on a reward.
The remedy is the same: detect and punish contradictory attestations. That is, attestations made by the same validator in the same slot for different head blocks.
The LMD GHOST fork choice in Ethereum has its origins in Vlad Zamfir's work on the Casper CBC protocol (then known as Casper TFG, and not to be confused with Casper FFG, which is something quite different7). The initial announcement of Casper TFG was in 2015, and in his 2017 Casper the Friendly Ghost paper, Zamfir describes combining Sompolinsky and Zohar's GHOST protocol with a "latest message" construction.
In August 2018, Vitalik still favoured a fork choice called IMD GHOST (formerly known as Recursive Proximity to Justification), that was more aware of finalisation and justification than the pure LMD GHOST that we have today. As the Eth2 consensus mini-spec evolved, IMD GHOST was changed to LMD GHOST in November 20188. This was due to concerns about the stability properties of IMD GHOST.
That November 2018 description of LMD GHOST in the mini-spec is essentially what we are using today.
Bear in mind that this section has covered only the pure form of LMD GHOST. In Ethereum's full consensus protocol, LMD GHOST is modified by being integrated with Casper FFG, as we shall see in the Gasper section. It is also modified by proposer boost, which we shall also discuss later.
The fork choice document in the consensus specs repo contains the relevant specifications. They are covered in my annotated specification in the following places.
- get_head() is the entry point to the fork choice when we want to get an opinion on the best current head block.
- get_weight() is where the LMD GHOST algorithm is implemented.
- The on_attestation() handler is where the fork choice learns about LMD GHOST votes.
The introduction to Vitalik's annotated fork choice is an excellent overview (though some of the details in the spec itself are now outdated). The LMD GHOST part of Vitalik's CBC Casper tutorial has a nice explainer. Ignore the later stuff about finality in Casper CBC - that does not concern us.
Some weaknesses of LMD GHOST as it is currently implemented in Ethereum are reviewed in the RLMD GHOST paper. For example, LMD GHOST does not handle dynamic participation securely – that is, when large numbers of validators go offline – among other things. We will consider some of the issues later, but the introduction to that paper is a good read if you want to get a head start.
- The next section covers Casper FFG, and the one after that the combination of the two into Gasper.↩
- I've yet to find a lucid exposition of IMD GHOST. Looking back through the history on the original mini-spec gives some information, but it's hard to understand what was really happening. It was first known as "recursive proximity to justification", since it was bound up with Casper FFG in a way that LMD GHOST is not.↩
- I've adapted this from an old post of Vitalik's, PoS fork choice rule desiderata. I'm postponing his finality point for now. I am not aware of much formal analysis of LMD GHOST with respect to properties like these; in fact, our implementation of LMD GHOST may not do too well in some respects. But these goals are worth bearing in mind as we explore how the mechanism works.↩
- Ignoring proposer boost, which we shall deal with later.↩
- See Vitalik's Toward a 12-second block time for a fascinating analysis of this in a proof of work context. However, not much of it carries over to our PoS implementation, except that GHOST helps to make sense of a forkful network.↩
- The original Phase 0 specification did have a penalty for a missed head vote. This was removed in the Altair upgrade's accounting reforms.↩
- Welcome to the joy of naming things in Ethereum.↩
- You can view the history of changes to the mini-spec by clicking on the pencil icon near the title.↩