Overview
Decoherence is the process by which a quantum system loses its “quantumness” through interaction with its environment.
A quantum system in isolation can exist in a superposition — a coherent combination of states, where interference between branches is possible (much like how wave amplitudes can add or cancel). The key word is coherent: the relative phases between components of the superposition are well-defined and physically meaningful.
When the system couples to a large, uncontrolled environment (air molecules, photons, etc.), those phase relationships effectively “leak out” into the environment’s enormous number of degrees of freedom. The off-diagonal terms in the density matrix — the ones encoding interference — decay rapidly toward zero. What remains is something that looks, statistically, like a classical probability distribution over definite outcomes.
Importantly, decoherence explains why we don’t see superpositions at macroscopic scales (it happens extraordinarily fast for large objects), and it picks out a preferred basis of states that are robust to environmental interaction (typically spatially localized states). But it does not, by itself, solve the measurement problem — you still end up with a classical mixture of outcomes and need some further interpretive story (collapse, many-worlds, etc.) to account for why you see one particular result.
A Simple Example
Setup. The system starts in a superposition:
\[ |\psi_S\rangle = \alpha|0\rangle + \beta|1\rangle \]
and the environment starts in some ready state \(|E_0\rangle\). The total state is initially a product:
\[ |\Psi\rangle = (\alpha|0\rangle + \beta|1\rangle) \otimes |E_0\rangle \]
Interaction. Now suppose the environment “measures” the system — meaning the interaction entangles them:
\[ \alpha|0\rangle|E_0\rangle + \beta|1\rangle|E_0\rangle \;\longrightarrow\; \alpha|0\rangle|E_0\rangle + \beta|1\rangle|E_1\rangle \]
The environment has shifted to a different state \(|E_1\rangle\) conditionally on the system being in \(|1\rangle\). The total state is now entangled; the system alone is no longer in a pure state.
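One concrete way to realise such an interaction is a controlled rotation of a single environment qubit. This is an illustrative assumption, not something the setup above requires. In the sketch below, the names `controlled_rotation`, `apply_unitary`, `is_product` and the angle `theta` are all hypothetical; the environment state \(|E_1\rangle\) is taken to be \(\cos\theta\,|E_0\rangle + \sin\theta\,|E_0^\perp\rangle\).

```python
import math

def controlled_rotation(theta):
    """4x4 unitary |0><0| (x) I + |1><1| (x) R(theta).

    Basis order is |s, e>: |0,E0>, |0,E0perp>, |1,E0>, |1,E0perp>.
    The rotation R(theta) is an assumed toy model of the coupling.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, c, -s],
        [0, 0, s, c],
    ]

def apply_unitary(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def is_product(state, tol=1e-12):
    """A two-qubit pure state is a product state iff the determinant
    of its 2x2 coefficient matrix vanishes."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol

alpha = beta = 1 / math.sqrt(2)
initial = [alpha, 0.0, beta, 0.0]   # (alpha|0> + beta|1>) (x) |E0>
final = apply_unitary(controlled_rotation(math.pi / 3), initial)

print("initial is a product state:", is_product(initial))
print("final is a product state:  ", is_product(final))
```

With this choice the overlap \(\langle E_1|E_0\rangle\) is simply \(\cos\theta\), so `theta = 0` leaves the state a product and any other angle between 0 and \(\pi\) entangles system and environment.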
Tracing out the environment. Since we don’t have access to the environment, we compute the reduced density matrix of the system by taking the partial trace:
\[ \rho_S = |\alpha|^2 |0\rangle\langle 0| + |\beta|^2 |1\rangle\langle 1| + \alpha\beta^* \langle E_1|E_0\rangle\, |0\rangle\langle 1| + \alpha^*\beta \langle E_0|E_1\rangle\, |1\rangle\langle 0| \]
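Everything hinges on the overlap \(\langle E_1 | E_0 \rangle\). If it equals 1, the two environment states are identical, the environment has learned nothing, and \(\rho_S\) is still the original pure superposition with full coherence. If it equals 0, the environment states are perfectly distinguishable, the off-diagonal terms vanish, and \(\rho_S\) is a diagonal mixture with weights \(|\alpha|^2\) and \(|\beta|^2\). Intermediate values give partial decoherence.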
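A small numerical sketch can make the role of the overlap concrete. It evaluates the expression for \(\rho_S\) above for a few values of \(\langle E_1|E_0\rangle\) and reports the size of the coherence term and the purity \(\mathrm{Tr}(\rho_S^2)\). The function names and the sample values are illustrative assumptions.

```python
import math

def reduced_density_matrix(alpha, beta, overlap):
    """rho_S in the {|0>, |1>} basis; `overlap` stands for <E1|E0>."""
    alpha, beta, overlap = complex(alpha), complex(beta), complex(overlap)
    return [
        [abs(alpha) ** 2, alpha * beta.conjugate() * overlap],
        [alpha.conjugate() * beta * overlap.conjugate(), abs(beta) ** 2],
    ]

def purity(rho):
    """Tr(rho^2): 1 for a pure state, 1/2 for a maximally mixed qubit."""
    return sum((rho[i][k] * rho[k][i]).real for i in range(2) for k in range(2))

amp = 1 / math.sqrt(2)   # equal superposition, chosen for illustration
for overlap in (1.0, 0.5, 0.0):
    rho = reduced_density_matrix(amp, amp, overlap)
    print(f"overlap={overlap}: |rho_01|={abs(rho[0][1]):.3f}, purity={purity(rho):.3f}")
```

For the equal superposition used here, the coherence term scales linearly with the overlap, and the purity falls from 1 (overlap 1) to 1/2 (overlap 0).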
The realistic case. In practice, the “environment” isn’t one qubit but an enormous number of degrees of freedom, and each one carries away a little bit of which-state information. The cumulative overlap \(\langle E_1|E_0\rangle\) drops to zero exponentially fast (often on timescales of \(10^{-20}\) seconds or less for macroscopic objects). So coherence doesn’t vanish in one sharp step — it decays, but incredibly rapidly.
That’s the essence: entanglement with the environment makes the system’s interference terms inaccessible, not by destroying them globally, but by delocalizing them into correlations you’ll never practically recover.
The Partial Trace
Given a composite Hilbert space \(\mathcal{H}_A \otimes \mathcal{H}_B\) and an operator \(\rho\) on it, the partial trace over \(B\) is the unique operator \(\mathrm{Tr}_B(\rho)\) on \(\mathcal{H}_A\) satisfying
\[ \mathrm{Tr}\big((\hat{O}_A \otimes I_B)\,\rho\big) = \mathrm{Tr}\big(\hat{O}_A\,\mathrm{Tr}_B(\rho)\big) \]
for all operators \(\hat{O}_A\) on \(\mathcal{H}_A\). In other words, it’s defined by the requirement that it reproduces the expectation value of every observable that acts only on \(A\).
Concretely, if \(\{|j\rangle\}\) is any orthonormal basis for \(\mathcal{H}_B\), then
\[ \mathrm{Tr}_B(\rho) = \sum_j (I_A \otimes \langle j|)\,\rho\,(I_A \otimes |j\rangle) \]
So for a pure state \(|\Psi\rangle = \sum_{i,j} c_{ij}|a_i\rangle|b_j\rangle\), you get
\[ \mathrm{Tr}_B(|\Psi\rangle\langle\Psi|) = \sum_j \Big(\sum_i c_{ij}|a_i\rangle\Big)\Big(\sum_k c_{kj}^*\langle a_k|\Big) \]
which is just summing over the “slices” of the coefficient matrix. If you think of the \(c_{ij}\) as a matrix \(C\), then \(\mathrm{Tr}_B(|\Psi\rangle\langle\Psi|) = CC^\dagger\) — which makes it easy to see, for instance, that the reduced state is pure if and only if \(C\) has rank 1 (i.e., the state is unentangled).
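The \(CC^\dagger\) shortcut is easy to try out on small coefficient matrices. The sketch below uses two hand-picked examples (a product state and a Bell-type state) and hypothetical helper names. It uses the purity \(\mathrm{Tr}(\rho^2)\) as a stand-in for the rank test, since the reduced state is pure exactly when the purity equals 1.

```python
import math

def reduced_from_coefficients(C):
    """Return C C^dagger, i.e. Tr_B |Psi><Psi| for
    |Psi> = sum_ij C[i][j] |a_i>|b_j>."""
    rows, cols = len(C), len(C[0])
    return [
        [sum(C[i][j] * complex(C[k][j]).conjugate() for j in range(cols))
         for k in range(rows)]
        for i in range(rows)
    ]

def purity(rho):
    """Tr(rho^2) for a square matrix given as nested lists."""
    n = len(rho)
    return sum((rho[i][k] * rho[k][i]).real for i in range(n) for k in range(n))

s = 1 / math.sqrt(2)
product_state = [[s, s], [0.0, 0.0]]   # |a_0> (x) (|b_0> + |b_1>)/sqrt(2), rank 1
bell_state = [[s, 0.0], [0.0, s]]      # (|a_0 b_0> + |a_1 b_1>)/sqrt(2), rank 2

print("product state purity:", round(purity(reduced_from_coefficients(product_state)), 6))
print("Bell state purity:   ", round(purity(reduced_from_coefficients(bell_state)), 6))
```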
The characterization via the defining property (the first equation of this section) is the cleaner way to think about it: the partial trace is the unique linear map that “marginalizes out” subsystem \(B\) while preserving all statistics on \(A\).
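The defining property can also be checked numerically. The sketch below implements the basis-sum formula for \(\mathrm{Tr}_B\) on nested lists and compares both sides of \(\mathrm{Tr}((\hat{O}_A \otimes I_B)\rho) = \mathrm{Tr}(\hat{O}_A\,\mathrm{Tr}_B(\rho))\). The state, the operator `O_A`, the dimensions, and all helper names are arbitrary choices made for illustration.

```python
def partial_trace_B(rho, dim_a, dim_b):
    """Tr_B of a (dim_a*dim_b)-square matrix; composite index is i*dim_b + j."""
    return [
        [sum(rho[i * dim_b + j][k * dim_b + j] for j in range(dim_b))
         for k in range(dim_a)]
        for i in range(dim_a)
    ]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def outer(psi):
    """|psi><psi| as a nested list."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

dim_a, dim_b = 2, 3
raw = [1, 1j, 0.5, -1, 2, 0.25j]               # arbitrary unnormalised amplitudes
norm = sum(abs(x) ** 2 for x in raw) ** 0.5
rho = outer([x / norm for x in raw])

O_A = [[0.3, 1 - 2j], [1 + 2j, -0.7]]          # arbitrary Hermitian operator on A
I_B = [[1 if r == c else 0 for c in range(dim_b)] for r in range(dim_b)]

lhs = trace(matmul(kron(O_A, I_B), rho))
rhs = trace(matmul(O_A, partial_trace_B(rho, dim_a, dim_b)))
print(abs(lhs - rhs) < 1e-12)
```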
Pure vs. Entangled
A state \(|\Psi\rangle = \sum_{i,j} c_{ij}|a_i\rangle|b_j\rangle\) is pure: it is a single ket, i.e., a unit vector in \(\mathcal{H}_A \otimes \mathcal{H}_B\). The set \(\{|a_i\rangle|b_j\rangle\}\) is an orthonormal basis for the tensor product space, and the double sum is just the state’s expansion in that basis. This is no different in principle from writing \(|\psi\rangle = \sum_n c_n |n\rangle\) for a single system.
The key distinction: a mixed state is a convex combination of projectors (a density matrix \(\rho = \sum_k p_k |\phi_k\rangle\langle\phi_k|\)), whereas a superposition like the above is a coherent sum of basis vectors — a single element of the Hilbert space.
The state is entangled if \(C = (c_{ij})\) has rank \(> 1\) (i.e., it can’t be written as a single product \(|a\rangle \otimes |b\rangle\)), but entangled and mixed are different things. Entangled and pure coexist perfectly well — it is precisely the situation where the global state is pure but the reduced states (after partial trace) are mixed.
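As a quick worked example (a standard textbook case, chosen here for illustration), take the Bell state on two qubits:
\[ |\Phi^+\rangle = \tfrac{1}{\sqrt{2}}\big(|a_0\rangle|b_0\rangle + |a_1\rangle|b_1\rangle\big), \qquad C = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
Here \(C\) has rank 2, so the state is entangled, yet it is a single ket and therefore pure: \(\mathrm{Tr}\big((|\Phi^+\rangle\langle\Phi^+|)^2\big) = 1\). Its reduced state is
\[ \rho_A = CC^\dagger = \tfrac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \mathrm{Tr}(\rho_A^2) = \tfrac{1}{2} < 1 \]
which is mixed, even though nothing about the global state is.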
Computing the Reduced Density Matrix \(\rho_S\) in Detail
We start from the post-interaction state:
\[ |\Psi\rangle = \alpha|0\rangle|E_0\rangle + \beta|1\rangle|E_1\rangle \]
Step 1: Form the full density matrix \(|\Psi\rangle\langle\Psi|\).
\[ |\Psi\rangle\langle\Psi| = \big(\alpha|0\rangle|E_0\rangle + \beta|1\rangle|E_1\rangle\big)\big(\alpha^*\langle 0|\langle E_0| + \beta^*\langle 1|\langle E_1|\big) \]
Expanding the four terms:
\[ = |\alpha|^2\,|0\rangle\langle 0| \otimes |E_0\rangle\langle E_0| \;+\; \alpha\beta^*\,|0\rangle\langle 1| \otimes |E_0\rangle\langle E_1| \;+\; \alpha^*\beta\,|1\rangle\langle 0| \otimes |E_1\rangle\langle E_0| \;+\; |\beta|^2\,|1\rangle\langle 1| \otimes |E_1\rangle\langle E_1| \]
Step 2: Apply the partial trace over \(E\). The key identity is that for elementary tensors the partial trace collapses the environment factor to an inner product:
\[ \mathrm{Tr}_E\big(|a\rangle\langle b| \otimes |c\rangle\langle d|\big) = \langle d|c\rangle\;|a\rangle\langle b| \]
This follows directly from the definition: pick any orthonormal basis \(\{|j\rangle\}\) for \(\mathcal{H}_E\) and compute
\[ \sum_j \langle j|c\rangle\langle d|j\rangle = \langle d|c\rangle \]
Step 3: Apply term by term.
\[ \mathrm{Tr}_E\big(|\alpha|^2\,|0\rangle\langle 0| \otimes |E_0\rangle\langle E_0|\big) = |\alpha|^2\,\langle E_0|E_0\rangle\;|0\rangle\langle 0| = |\alpha|^2\,|0\rangle\langle 0| \]
\[ \mathrm{Tr}_E\big(\alpha\beta^*\,|0\rangle\langle 1| \otimes |E_0\rangle\langle E_1|\big) = \alpha\beta^*\,\langle E_1|E_0\rangle\;|0\rangle\langle 1| \]
\[ \mathrm{Tr}_E\big(\alpha^*\beta\,|1\rangle\langle 0| \otimes |E_1\rangle\langle E_0|\big) = \alpha^*\beta\,\langle E_0|E_1\rangle\;|1\rangle\langle 0| \]
\[ \mathrm{Tr}_E\big(|\beta|^2\,|1\rangle\langle 1| \otimes |E_1\rangle\langle E_1|\big) = |\beta|^2\,\langle E_1|E_1\rangle\;|1\rangle\langle 1| = |\beta|^2\,|1\rangle\langle 1| \]
where the diagonal terms simplified using \(\langle E_0|E_0\rangle = \langle E_1|E_1\rangle = 1\).
Step 4: Collect.
\[ \rho_S = |\alpha|^2\,|0\rangle\langle 0| + |\beta|^2\,|1\rangle\langle 1| + \alpha\beta^*\langle E_1|E_0\rangle\,|0\rangle\langle 1| + \alpha^*\beta\langle E_0|E_1\rangle\,|1\rangle\langle 0| \]
Note that the off-diagonal terms are suppressed by the factor \(\langle E_1|E_0\rangle\) (and its conjugate). That single inner product is doing all the work: it quantifies how distinguishable the environment states are, and hence how much coherence survives in the system.
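The four steps can be replayed numerically as a sanity check. The sketch below builds \(|\Psi\rangle\langle\Psi|\) for a single environment qubit, traces out the environment, and compares the off-diagonal entry with \(\alpha\beta^*\langle E_1|E_0\rangle\). The amplitudes, the angle `theta`, and the parametrisation of \(|E_1\rangle\) are assumptions made for illustration.

```python
import math

def outer(psi):
    """|psi><psi| as a nested list."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

def trace_out_environment(rho):
    """Partial trace over the second qubit; composite index is 2*s + e."""
    return [
        [rho[2 * s][2 * t] + rho[2 * s + 1][2 * t + 1] for t in range(2)]
        for s in range(2)
    ]

alpha, beta = 0.6, 0.8j                      # |alpha|^2 + |beta|^2 = 1
theta = 0.4                                  # assumed angle between E0 and E1
E0 = [1.0, 0.0]
E1 = [math.cos(theta), math.sin(theta)]
overlap = sum(complex(a).conjugate() * b for a, b in zip(E1, E0))   # <E1|E0>

# Steps 1-2: |Psi> = alpha|0>|E0> + beta|1>|E1>, then form |Psi><Psi|
psi = [alpha * e for e in E0] + [beta * e for e in E1]
# Steps 3-4: trace out the environment and collect terms
rho_S = trace_out_environment(outer(psi))

expected_01 = alpha * complex(beta).conjugate() * overlap
print(abs(rho_S[0][1] - expected_01) < 1e-12)
```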
Why Decoherence Occurs Exponentially Fast
The single-qubit environment example gives decoherence controlled by one overlap \(\langle E_1|E_0\rangle\). Now suppose the system interacts sequentially (or simultaneously) with \(N\) independent environment qubits, each initially in state \(|e_0\rangle\), each coupling weakly to the system in the same way:
\[ |0\rangle|e_0\rangle \to |0\rangle|e_0\rangle, \qquad |1\rangle|e_0\rangle \to |1\rangle|e_1\rangle \]
After interacting with all \(N\) environment qubits, the total state is
\[ |\Psi\rangle = \alpha|0\rangle|e_0\rangle^{\otimes N} + \beta|1\rangle|e_1\rangle^{\otimes N} \]
Now when you compute \(\rho_S\) by tracing out the entire environment, the off-diagonal terms pick up the factor
\[ \langle e_1|e_0\rangle^N \]
because the environment is a tensor product of independent qubits, so the total overlap factors. If each individual interaction is very weak — say \(\langle e_1|e_0\rangle = 1 - \epsilon\) for some small \(\epsilon\) — then each qubit barely disturbs the coherence on its own. But the cumulative suppression factor is
\[ (1 - \epsilon)^N \approx e^{-\epsilon N} \]
which decays exponentially in \(N\). Since a macroscopic environment has \(N \sim 10^{23}\) or more degrees of freedom, even an absurdly tiny per-particle coupling (\(\epsilon \sim 10^{-20}\), say) produces a suppression factor like \(e^{-10^3}\) — effectively zero.
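These numbers are too extreme for naive floating-point arithmetic, so a sketch of the estimate works in log space. The values of `eps` and `n` are the illustrative ones from the text; the function name is hypothetical.

```python
import math

def log_suppression(epsilon, n):
    """Natural log of (1 - epsilon)**n, computed stably with log1p."""
    return n * math.log1p(-epsilon)

eps, n = 1e-20, 1e23

naive = (1 - eps) ** n                     # 1 - 1e-20 rounds to 1.0, so this is 1.0
ln_factor = log_suppression(eps, n)        # close to -eps * n = -1000
log10_factor = ln_factor / math.log(10)    # roughly -434
direct = math.exp(ln_factor)               # underflows to 0.0 in double precision

print("naive (misleading):", naive)
print("ln(factor):        ", ln_factor)
print("log10(factor):     ", round(log10_factor, 3))
print("factor as a float: ", direct)
```

The naive product returns exactly 1 because \(1 - 10^{-20}\) is not representable in double precision, while the log-space value shows the suppression factor is of order \(10^{-434}\).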
That’s the mechanism: each environmental degree of freedom independently “learns” a tiny amount about which branch the system is in. No single one destroys coherence, but the cumulative effect of all of them, multiplicative because the subsystems are independent, gives exponential suppression. If roughly \(N\) fresh environmental degrees of freedom interact with the system during each interval \(\tau_{\mathrm{int}}\) (the timescale of a single interaction), the coherence shrinks by a factor \(e^{-\epsilon N}\) per interval, so the decoherence timescale is roughly \(\tau_D \sim \tau_{\mathrm{int}} / (\epsilon N)\). For macroscopic objects in air at room temperature this works out to something like \(10^{-30}\) seconds or less.
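The multiplicative structure behind this argument can be checked by brute force for a handful of environment qubits. The sketch below builds the full \(2^N\)-dimensional environment vectors \(|e_0\rangle^{\otimes N}\) and \(|e_1\rangle^{\otimes N}\) and compares their overlap with \(\langle e_1|e_0\rangle^N\). The angle `theta` and the helper names are assumptions made for illustration.

```python
import math

def kron_vec(u, v):
    """Kronecker product of two state vectors given as lists."""
    return [a * b for a in u for b in v]

def tensor_power(vec, n):
    """vec (x) vec (x) ... (x) vec with n factors."""
    result = [1.0]
    for _ in range(n):
        result = kron_vec(result, vec)
    return result

def coherence_factor(e0, e1, n):
    """Overlap of e1^{(x)n} with e0^{(x)n}, from the full 2**n-dim vectors."""
    big0, big1 = tensor_power(e0, n), tensor_power(e1, n)
    return sum(complex(b).conjugate() * a for a, b in zip(big0, big1))

theta = 0.3                                  # assumed per-qubit rotation angle
e0 = [1.0, 0.0]
e1 = [math.cos(theta), math.sin(theta)]
single = math.cos(theta)                     # <e1|e0> for one qubit

factors = {n: coherence_factor(e0, e1, n) for n in (1, 2, 5, 10)}
for n, value in factors.items():
    print(n, round(value.real, 6), round(single ** n, 6))
```

The brute-force overlap and the closed form agree for each `n`, which is the factorization the exponential estimate relies on.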