Module 8 · Noise and error correction · Lesson 4

The surface code

The error-correcting code that every major hardware team is actually trying to build. High thresholds, 2D layout, and a path to practical fault tolerance.

8 min read · Lesson 31 of 32

If you follow any of the big quantum computing roadmaps — Google’s, IBM’s, AWS’s, PsiQuantum’s — you’ll notice they all converge on one specific error-correcting code: the surface code. Despite having worse asymptotic overhead than some alternatives, the surface code has won the hardware race for three concrete reasons:

  1. Its error threshold is unusually high — around 1% per gate — which is the easiest target for current physical qubits to beat.
  2. It requires only nearest-neighbor interactions on a 2D lattice, which is natural for chip-based architectures.
  3. Its syndrome measurements are simple enough to implement as repeated cycles without compounding errors too quickly.

If quantum error correction succeeds in bringing a fault-tolerant quantum computer into existence, the surface code is almost certainly how.

The idea, visually

Picture a 2D grid of physical qubits. The qubits sit on the edges of a square lattice. At each vertex of the lattice, you measure the product of X operators on the four edges meeting at that vertex. At each face, you measure the product of Z operators on the four edges surrounding that face.

Those are the syndrome operators. They commute with each other and with the logical operators, so measuring them repeatedly does not disturb the logical qubit.
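That commutation property is easy to verify in the binary (symplectic) representation of Pauli operators: write an n-qubit Pauli as a pair of bit vectors (x | z), and two Paulis commute exactly when their symplectic inner product is even. Here is a toy check (this is an illustrative sketch, not a surface-code library; the edge labels are made up):

```python
# Toy symplectic commutation check: an n-qubit Pauli is a pair of bit
# vectors (x, z); two Paulis commute iff x1·z2 + x2·z1 is even.

def commutes(p1, p2):
    x1, z1 = p1
    x2, z2 = p2
    sym = sum(a & b for a, b in zip(x1, z2)) + sum(a & b for a, b in zip(x2, z1))
    return sym % 2 == 0

n = 7  # label a handful of lattice edges 0..6 (hypothetical layout)

def x_stab(edges):  # product of X on the given edges
    return ([1 if i in edges else 0 for i in range(n)], [0] * n)

def z_stab(edges):  # product of Z on the given edges
    return ([0] * n, [1 if i in edges else 0 for i in range(n)])

# A vertex X-check and a face Z-check always share an even number of
# edges (here: two), so they commute.
vertex = x_stab({0, 1, 2, 3})
face   = z_stab({2, 3, 4, 5})
print(commutes(vertex, face))     # True

# A single-qubit Z error on edge 0 overlaps the vertex check on exactly
# one edge, so it anticommutes -- that's what flips the syndrome bit.
z_error = z_stab({0})
print(commutes(vertex, z_error))  # False
```

The second check is the whole detection mechanism in miniature: errors anticommute with the adjacent stabilizers and nothing else, so they show up as local syndrome flips.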

Errors on a single physical qubit flip the syndrome values at the two adjacent stabilizers. The pattern of flipped syndromes across the whole lattice forms an error graph. A classical decoder (typically a minimum-weight perfect matching algorithm) finds the most-likely chain of errors producing the observed syndromes and outputs a correction.
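The matching step can be sketched with a deliberately tiny brute-force version. This is a 1D simplification with made-up defect positions, not a production decoder (real decoders run minimum-weight perfect matching on a 2D or 3D syndrome graph, efficiently): defects must be paired up, each pairing costs the length of the error chain that would connect it, and the decoder picks the cheapest total pairing.

```python
from itertools import permutations

# Toy MWPM sketch: defects sit at integer positions on a line; pairing
# two defects costs their distance (the length of the connecting error
# chain). Brute-force the minimum-total-weight pairing.

def min_weight_pairing(defects):
    """Return (cost, pairs) for the cheapest pairing of an even number of defects."""
    if not defects:
        return 0, []
    best_cost, best_pairs = float("inf"), None
    for perm in permutations(defects):
        pairs = [(perm[i], perm[i + 1]) for i in range(0, len(perm), 2)]
        cost = sum(abs(a - b) for a, b in pairs)
        if cost < best_cost:
            best_cost, best_pairs = cost, pairs
    return best_cost, best_pairs

# Defects at 1, 2, 7, 9: pairing (1,2) with (7,9) costs 1 + 2 = 3,
# beating the crossed pairing (1,7), (2,9) which costs 6 + 7 = 13.
cost, pairs = min_weight_pairing([1, 2, 7, 9])
print(cost)  # 3
```

Brute force is factorial-time, which is why real decoders use proper matching algorithms (Blossom-based MWPM, or faster approximations such as union-find) instead.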

The key properties to carry forward: a high threshold, purely local stabilizer measurements, and reliability that improves with lattice size. The next section unpacks that last point.

Scaling: how to get a better logical qubit

In the surface code, you make a logical qubit more reliable by making the lattice larger. A d × d lattice (where d is the “code distance”) can correct any set of errors of weight up to ⌊(d − 1)/2⌋. Larger d means a lower logical error rate (exponentially lower, as long as the physical error rate stays below threshold), at the cost of quadratically more physical qubits.

To get a logical error rate of 10⁻¹² per gate (enough to run Shor on a 2048-bit integer), estimates suggest you need d ≈ 25–35, which means around 1,000 physical qubits per logical qubit. And you need several thousand logical qubits for the factoring algorithm to actually run. Total: roughly 20 million physical qubits.

That is, at time of writing, well beyond what any existing hardware can produce — current chips are in the 1000-qubit range, and most of those aren’t of high enough quality. But the roadmap is clear: every major player is targeting million-qubit chips over the next 5–15 years, and the surface code is what they plan to run on them.

Logical operations

The surface code supports logical gates via a combination of lattice surgery (or code deformation/braiding) for Clifford operations, and magic state injection for the non-Clifford T gate.

Magic state distillation is the main bottleneck. Current estimates suggest that 90% or more of the physical qubits in a fault-tolerant machine will be dedicated to producing magic states for T-gate implementation. This is why “low T-count” is a headline metric for fault-tolerant quantum algorithms.

Where we actually are

As of 2023–2024, multiple labs have demonstrated small surface codes (and close relatives) running repeated syndrome-extraction cycles, with logical error rates that drop as the code distance grows: the first experimental signs of below-threshold operation.

None of these is yet a “fault-tolerant quantum computer” in the sense of running Shor on a useful integer. But they are the first steps past the threshold, and they’re being built with the surface code (or close cousins) in mind.

Quick check
What's the error threshold of the surface code, and why is it important?
Quick check
Why is the surface code favored by hardware teams over theoretically better codes?
Quick check
What is 'magic state distillation'?

Module done — and where error correction sits

You now have the full arc: noise is real, single-qubit errors can be digitized into X and Z types, repetition codes handle one type, concatenated codes handle both, and the surface code is the practical workhorse. The quantum threshold theorem says scalable fault-tolerant computing is possible in principle. The engineering challenge is building it.

Module 9 briefly covers how real qubits are actually built — the physical substrates everything in this course is supposed to run on.