Very important for distributed systems. In general, we just need to get several nodes to agree on something. This is important for:
- Replication
- Atomic Commits
- Unanimously deciding a winner during concurrency
- Membership services (which nodes are active members of the cluster)
- Locks and leases
This problem is formalized as:
- One or more nodes propose values and the consensus algorithm decides on one of the values. This must satisfy:
- Uniform agreement
- Integrity: no node decides twice
- Validity: If a node decides value v, then v was proposed by some node
- Termination: Every node that does not crash decides some value eventually. Each node that crashes is assumed to not come back. We need enough nodes to form a Quorum
Popular Algorithms
- VSR
- Paxos
- Raft
- Zab
Implementing one yourself is hard, just use one of these.
These algorithms decide on a sequence of values making them Total Order broadcast algorithms.
Drawbacks
- Require a strict majority
- Slow
- Assume a fixed set of nodes
- Rely on timeouts
- Sensitive to variable network delays and problems