The decoding problem for surface codes
In the previous chapters, we observed the effect of different kinds of errors (\(X\), \(Y\), \(Z\)) on the data qubits of a surface code. These errors can occur during idling or when gates are applied. Measurement errors can also occur on the auxiliary qubits during syndrome extraction.
After a round of syndrome extraction, we have a set of stabilizers whose measurement outcomes may have flipped. These flipped outcomes are known as syndrome defects. Recall that a flipped stabilizer only signals that an error has occurred nearby; it does not identify which qubit was affected or which error occurred. Given this information, the primary job of the decoder is to infer which errors could have led to the observed syndrome defects, so that we can apply the appropriate corrections to remove the error without introducing a logical error.
We have encountered a 1D version of the Minimum-Weight Perfect Matching (MWPM) decoder in our analysis of repetition codes. In that setting, the decoding problem was also 1D, as the generated syndromes were laid out in a 1D chain. In realistic experiments, however, we measure the stabilizers repeatedly, and measurement errors between consecutive rounds create syndrome defects in time. The decoding graph for the repetition code therefore becomes 2D: space (1D qubits) + time (1D). Similarly, in a realistic setting, the decoding problem for the surface code becomes 3D: space (2D qubits) + time (1D).
So, the actual decoding problem involves matching across space and time, to correct data-qubit errors and measurement errors, respectively.
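To make the space-time picture concrete, the sketch below uses the open-source Stim library (an assumption: `pip install stim`) to generate repetition code memory circuits with increasing numbers of measurement rounds; each round adds a layer of detector nodes, so the decoding graph grows along the time axis:

```python
# Watching the decoding graph grow in time with repeated syndrome
# measurement, using Stim's built-in repetition code circuits.
import stim

for rounds in (1, 3, 5):
    circuit = stim.Circuit.generated(
        "repetition_code:memory",
        distance=3,
        rounds=rounds,
        before_round_data_depolarization=0.01,
        before_measure_flip_probability=0.01,
    )
    # Each extra round contributes another layer of detectors,
    # extending the (1D space) x (time) decoding graph.
    print(rounds, circuit.num_detectors)
```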
Mathematical description of the general minimum-weight perfect matching problem
Let \(G = (V_{\textrm{G}}, E_{\textrm{G}}, W_{\textrm{G}})\) be a weighted graph, where:
\(V_{\textrm{G}}\) is the set of vertices \(v_{i}\).
\(E_{\textrm{G}}\) is the set of edges \(e_{ij}\) connecting vertices \(v_{i}\) and \(v_{j}\), with \(i\ne j\).
\(W_{\textrm{G}}\) is the set of weights \(w_{ij}\), one assigned to each edge \(e_{ij}\).
Each vertex \(v \in V_{\textrm{G}}\) represents a detector node, and each edge \(e \in E_{\textrm{G}}\) represents an error mechanism that flips the detectors at its endpoints. Recall that detectors correspond to the rows of the check matrix \(H\), and error mechanisms to its columns.
A matching \(M \subseteq E_{\textrm{G}}\) is a subset of edges with no common endpoints, i.e., each vertex is matched at most once. If every vertex of the graph is incident to exactly one edge of \(M\), the matching is called perfect.
The weight of \(M\) is \(\sum_{e\in M} w_{e}\).
The minimum-weight perfect matching is the perfect matching whose weight is smallest among all possible perfect matchings.
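As a concrete illustration, here is a minimal sketch of the abstract matching problem using the networkx library (an assumption; any MWPM solver would do). The four vertices stand in for syndrome defects, and the hypothetical edge weights stand in for the lengths of error paths between them:

```python
# A toy minimum-weight perfect matching with networkx (assumed installed).
import networkx as nx

G = nx.Graph()
# Hypothetical weights w_ij: e.g., the length of the shortest error path
# between defects i and j.
G.add_weighted_edges_from(
    [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (1, 3, 3), (2, 3, 1)]
)

# min_weight_matching returns a set of edges in which every vertex appears
# exactly once, minimizing the total weight.
M = nx.min_weight_matching(G)
print(M, sum(G[u][v]["weight"] for u, v in M))
# e.g. {(0, 1), (2, 3)} with total weight 2
```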
MWPM for the surface code
In the 2D grid layout of the surface code, a \(Z\) or \(X\) error on a data qubit in the bulk lights up two syndromes (two nearby auxiliary qubits); on a boundary, it may light up only one. Since \(X\)-checks detect \(Z\) errors and \(Z\)-checks detect \(X\) errors, the matching graph splits into two subgraphs: the \(X\)-matching graph and the \(Z\)-matching graph.
In each subgraph, the edges represent specific data qubits, and the nodes represent the associated check (syndrome) qubits. We assume all data qubits undergo the Pauli errors \(X\), \(Y\), and \(Z\) with equal probability, i.e., a depolarizing noise model. A \(Y\) error can be decomposed into simultaneous \(X\) and \(Z\) errors, so a data qubit with a \(Y\) error lights up four syndromes; as a single error mechanism, this corresponds to a hyperedge. Standard MWPM cannot match hyperedges directly, so it handles \(Y\) errors by decoding the \(X\)- and \(Z\)-subgraphs independently.
For the surface code, we have two parity check matrices: \(H_X\) and \(H_Z\) for \(X\)-checks and \(Z\)-checks, respectively. The columns of the parity check matrix correspond to data qubits, and the rows correspond to checks (stabilizers), such that \(H_{ij}=1\) if check \(i\) acts on data qubit \(j\), and zero otherwise. Then, the mathematical form of the decoding problem described above is as follows:
Consider a binary error vector \(\mathbf{e}\), such that an error on qubit \(j\) implies \(e_j = 1\). Then the syndrome is \(\mathbf{s} = \mathbf{H} \cdot \mathbf{e}^T \pmod 2\), with \(s_i = 1\) exactly when check \(i\) flipped. Computing this product tells us which stabilizers flipped due to \(\mathbf{e}\); decoding is the inverse problem of finding a likely \(\mathbf{e}\) given an observed \(\mathbf{s}\).
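For instance (a minimal sketch, using the distance-3 repetition code as the simplest check matrix):

```python
# Computing the syndrome s = H·e (mod 2) for a small example.
import numpy as np

# Rows are checks, columns are data qubits: check 0 compares qubits 0 and 1,
# check 1 compares qubits 1 and 2.
H = np.array([[1, 1, 0],
              [0, 1, 1]])

e = np.array([0, 1, 0])  # a single error on qubit 1
s = (H @ e) % 2          # both checks touching qubit 1 flip
print(s)                 # [1 1]
```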
Procedure for decoding the surface code
The general procedure that we will follow for decoding surface codes with the MWPM algorithm is then:
Get the syndrome bit strings from the most recent round of the syndrome extraction circuit.
Collect all the non-trivial syndromes (auxiliary qubits whose outcomes flipped since the last measurement round).
Build a weighted graph (the decoding graph) whose edges represent possible error paths.
Pair up the syndrome defects along the smallest-weight paths; this is the MWPM step.
Apply corrections corresponding to the most likely error configuration for each matched syndrome pair (in hardware or in software).
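Putting these steps together, here is a minimal end-to-end sketch using the open-source PyMatching library (an assumption: `pip install pymatching`; any MWPM implementation would do). For brevity it reuses the small repetition code check matrix from before; for the surface code, \(H\) would be \(H_X\) or \(H_Z\):

```python
# A minimal end-to-end decoding sketch with PyMatching (assumed installed).
# The check matrix below is the distance-3 repetition code; for the
# surface code, H would be replaced by H_X or H_Z.
import numpy as np
import pymatching

H = np.array([[1, 1, 0],
              [0, 1, 1]])

# PyMatching builds the decoding graph from H: one node per check (row),
# one edge per data qubit (column).
matching = pymatching.Matching(H)

e = np.array([1, 0, 0])                 # an error on qubit 0
syndrome = (H @ e) % 2                  # steps 1-2: the observed defects
correction = matching.decode(syndrome)  # steps 3-5: MWPM correction
print(correction)                       # [1 0 0], which recovers the error
```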
Once the decoding algorithm maps the lit-up syndromes (stabilizer outcomes) to physical qubit errors, this information is used to calculate the logical error probability, \(p_{\textrm{L}}\). Put simply, \(p_{\textrm{L}}\) is the rate at which the decoder fails, i.e., the rate at which the applied correction and the actual error combine into a logical error.
In our discussion of repetition codes, we have already seen this logical error probability plotted as a function of the physical error probability \(p\). For small values of \(p\), \(p_{\textrm{L}}\) is small and decreases as the code distance increases. Beyond a certain value of \(p\), however, increasing the code distance makes \(p_{\textrm{L}}\) larger instead.
This critical value of \(p\), where increasing the code distance switches from reducing \(p_{\textrm{L}}\) to increasing it, is known as the threshold value, \(p_{\textrm{th}}\).
For \(p\lt p_{\textrm{th}}\), \(p_{\textrm{L}} \sim p^{(d+1)/2}\).
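As a quick numerical illustration of this scaling (a sketch only; the proportionality constant, which depends on the code and noise model, is set to 1 here):

```python
# Illustrative only: below threshold, each increase of the distance by 2
# multiplies the exponent in p_L ~ p^{(d+1)/2}, suppressing p_L further.
p = 1e-3
for d in (3, 5, 7):
    print(d, p ** ((d + 1) / 2))  # roughly 1e-06, 1e-09, 1e-12
```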
Now that we have established the intuition behind the code threshold, we will proceed to simulate surface codes to measure the threshold error probability.
Additional reading on decoding surface codes
Version History
v0: Aug 14, 2025, github/@ESMatekole
v1: Sep 12, 2025, github/@aasfaw