Mathematical Foundations: Hilbert Spaces, Bra-Ket Notation, and Linear Operators
Quantum mechanics is, at its core, a mathematical framework built upon the theory of complex vector spaces, in general infinite-dimensional, equipped with an inner product. Unlike classical mechanics, which operates on a phase space of positions and momenta, quantum mechanics assigns to every physical system a Hilbert space $\mathcal{H}$ whose elements, called state vectors, encode all the information available about the system. Understanding the precise mathematical structure of these spaces is not merely a formality; it is the foundation upon which every quantum prediction rests. Without this rigor, the formalism collapses into a collection of mysterious recipes whose domain of validity and internal consistency cannot be assessed.
The decision to use complex vector spaces, rather than real ones, is not arbitrary. Complex amplitudes allow for quantum interference, the constructive and destructive combination of probability amplitudes, which has no analogue in classical probability theory. The inner product structure provides a natural notion of probability: the squared modulus of the inner product between a normalized state vector and a normalized eigenstate gives the probability of measuring the corresponding eigenvalue. This is the Born rule. Gleason's theorem shows that, for Hilbert spaces of dimension three or higher, the Born rule is the only probability assignment compatible with the structure of the space, so quantum probability theory is essentially fixed by the geometry of the underlying vector space.
Linear operators on Hilbert spaces represent physical observables. The requirement that measured values be real numbers forces observables to be represented by Hermitian (self-adjoint) operators, whose eigenvalues are guaranteed to be real. The spectral theorem, one of the deepest results in functional analysis, guarantees that every self-adjoint operator admits a spectral decomposition. In finite dimensions, and more generally whenever the spectrum is purely discrete, this means a complete, orthonormal set of eigenstates in which any state can be expanded; for continuous spectra the sum over eigenstates is replaced by an integral over a projection-valued measure. This expansion is precisely the mechanism by which quantum mechanics assigns probabilities to measurement outcomes.
Dirac's bra-ket notation provides an elegant and coordinate-free language for these structures. A ket $|\psi\rangle$ represents a state vector in $\mathcal{H}$; a bra $\langle\phi|$ represents a linear functional on $\mathcal{H}$ — an element of the dual space $\mathcal{H}^*$. The inner product $\langle\phi|\psi\rangle$ bridges these two spaces. By the Riesz representation theorem, there is a canonical anti-linear bijection between $\mathcal{H}$ and $\mathcal{H}^*$, so every bra corresponds to exactly one ket and vice versa. This correspondence is what Dirac exploited to build a notation that makes the algebra of operators nearly mechanical.
Commutators of operators play a central role in quantum mechanics. The commutator $[\hat{A}, \hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}$ measures the failure of two operators to commute, and this failure has profound physical consequences: non-commuting observables cannot be simultaneously measured with arbitrary precision, leading directly to the uncertainty principle. The commutator algebra also governs the time evolution of observables in the Heisenberg picture, and determines which constants of motion exist — a direct analogue of Poisson brackets in classical mechanics, to which it reduces in the $\hbar \to 0$ limit.
The completeness relation $\sum_n |n\rangle\langle n| = \hat{\mathbf{1}}$ (or $\int |x\rangle\langle x|\,dx = \hat{\mathbf{1}}$ in the continuous case) is the resolution of the identity, and it is arguably the most frequently used tool in quantum calculations. Inserting the identity in various bases allows one to transform between representations — from position to momentum space, from energy eigenstates to coherent states — with ease. Tensor products of Hilbert spaces describe composite systems and are the mathematical setting for entanglement, one of the most distinctive and practically important features of quantum mechanics.
The spectral theorem for unbounded operators — which is the relevant case for position $\hat{x}$ and momentum $\hat{p}$ — requires more care than its finite-dimensional counterpart. One must work with spectral measures, projection-valued measures, and the domain of the operator. For a physicist, the practical upshot is that position and momentum eigenstates $|x\rangle$ and $|p\rangle$ are not normalizable in the conventional sense (they live in a rigged Hilbert space, or Gel'fand triple), but can be treated as a continuous basis satisfying $\langle x|x'\rangle = \delta(x-x')$ and $\langle p|p'\rangle = \delta(p-p')$. This level of mathematical sophistication is essential for anyone intending to work seriously in quantum field theory or mathematical physics.
This lesson develops all of these concepts systematically, beginning with the axiomatic definition of a Hilbert space, proceeding through the mechanics of bra-ket notation, the algebra of linear operators, and culminating in the spectral theorem and tensor products. Every step is accompanied by concrete examples drawn from the simplest quantum systems — two-level systems (qubits) and the harmonic oscillator — to ground the abstraction in physical intuition.
A Hilbert space $\mathcal{H}$ is a complete inner product space over $\mathbb{C}$. Specifically, it is a complex vector space equipped with an inner product $\langle \cdot, \cdot \rangle : \mathcal{H} \times \mathcal{H} \to \mathbb{C}$ satisfying: (1) conjugate symmetry $\langle\phi|\psi\rangle = \overline{\langle\psi|\phi\rangle}$, (2) linearity in the second argument $\langle\phi|\alpha\psi_1 + \beta\psi_2\rangle = \alpha\langle\phi|\psi_1\rangle + \beta\langle\phi|\psi_2\rangle$, (3) positive definiteness $\langle\psi|\psi\rangle \geq 0$ with equality iff $|\psi\rangle = 0$. Completeness means every Cauchy sequence in $\mathcal{H}$ converges to an element of $\mathcal{H}$ under the norm $\|\psi\| = \sqrt{\langle\psi|\psi\rangle}$. The archetypal example is $L^2(\mathbb{R})$, the space of square-integrable functions with $\langle f|g\rangle = \int_{-\infty}^{\infty} f^*(x)g(x)\,dx$.
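As a rough numerical illustration of the $L^2(\mathbb{R})$ inner product above, the following sketch approximates $\langle f|g\rangle$ with a trapezoidal sum on a truncated interval. The helper `l2_inner`, the test functions `f0` and `f1`, and the grid parameters are illustrative assumptions, not part of the lesson; the exact values $\int e^{-x^2}dx = \sqrt{\pi}$ and $\int x^2 e^{-x^2}dx = \sqrt{\pi}/2$ are standard Gaussian integrals.

```python
import math

def l2_inner(f, g, a=-10.0, b=10.0, n=4000):
    """Trapezoidal estimate of <f|g> = integral of conj(f(x)) * g(x) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        x = a + k * h
        weight = 0.5 if k in (0, n) else 1.0  # half weight at the two endpoints
        total += weight * complex(f(x)).conjugate() * g(x)
    return total * h

def f0(x):
    """Even Gaussian exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)

def f1(x):
    """Odd function x * exp(-x^2 / 2)."""
    return x * math.exp(-x * x / 2)

norm0_sq = l2_inner(f0, f0).real   # should approach sqrt(pi)
norm1_sq = l2_inner(f1, f1).real   # should approach sqrt(pi) / 2
overlap = l2_inner(f0, f1)         # even times odd integrand: should vanish

print(norm0_sq, norm1_sq, abs(overlap))
```

Truncating to $[-10, 10]$ is harmless here only because both functions decay like a Gaussian; slowly decaying functions would need a wider interval or a different quadrature.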
In Dirac notation, a state vector is written as a ket $|\psi\rangle \in \mathcal{H}$. Its dual vector (a continuous linear functional on $\mathcal{H}$) is a bra $\langle\psi| \in \mathcal{H}^*$. The inner product is written $\langle\phi|\psi\rangle$, a braket. The outer product $|\psi\rangle\langle\phi|$ is a rank-one operator on $\mathcal{H}$. For an orthonormal basis $\{|e_n\rangle\}$, completeness reads $\sum_n |e_n\rangle\langle e_n| = \hat{\mathbf{1}}$. Any state can be expanded as $|\psi\rangle = \sum_n c_n |e_n\rangle$ where $c_n = \langle e_n|\psi\rangle$. For a qubit in the computational basis $\{|0\rangle, |1\rangle\}$: $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ with $|\alpha|^2 + |\beta|^2 = 1$.
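The expansion coefficients and the completeness relation can be checked directly for a qubit. This is a minimal sketch using plain Python lists as column vectors; the helper names `inner` and `outer` and the particular amplitudes are illustrative choices.

```python
import math

# Computational basis kets as column vectors (lists of complex amplitudes).
ket0 = [1 + 0j, 0 + 0j]
ket1 = [0 + 0j, 1 + 0j]

def inner(phi, psi):
    """<phi|psi>: conjugate-linear in the first slot, linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def outer(psi, phi):
    """|psi><phi| as a nested-list matrix."""
    return [[a * b.conjugate() for b in phi] for a in psi]

# Illustrative state with alpha = 1/sqrt(2), beta = i/sqrt(2).
alpha, beta = 1 / math.sqrt(2), 1j / math.sqrt(2)
psi = [alpha * a + beta * b for a, b in zip(ket0, ket1)]

# Expansion coefficients c_n = <e_n|psi> recover alpha and beta.
c0, c1 = inner(ket0, psi), inner(ket1, psi)

# Completeness: |0><0| + |1><1| should be the 2x2 identity.
p0, p1 = outer(ket0, ket0), outer(ket1, ket1)
identity = [[p0[i][j] + p1[i][j] for j in range(2)] for i in range(2)]

norm_sq = inner(psi, psi).real
print(c0, c1, norm_sq)
print(identity)
```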
An operator $\hat{A}: \mathcal{H} \to \mathcal{H}$ is Hermitian (formally self-adjoint) if $\langle\phi|\hat{A}\psi\rangle = \langle\hat{A}\phi|\psi\rangle$ for all $|\phi\rangle, |\psi\rangle \in \mathcal{H}$, equivalently $\hat{A} = \hat{A}^\dagger$, where $\hat{A}^\dagger$ is the adjoint defined by $\langle\phi|\hat{A}^\dagger\psi\rangle = \overline{\langle\psi|\hat{A}\phi\rangle}$. In finite dimensions, $\hat{A}^\dagger = (\hat{A}^T)^*$ (conjugate transpose). Key consequences: all eigenvalues are real, eigenvectors corresponding to distinct eigenvalues are orthogonal, and, in finite dimensions, the eigenvectors can be chosen to form a complete orthonormal basis for $\mathcal{H}$ (for unbounded operators the analogous statement requires the spectral theorem and attention to domains). In matrix form for a qubit, the Pauli matrices $\hat{\sigma}_x = \begin{pmatrix}0&1\\1&0\end{pmatrix}$, $\hat{\sigma}_y = \begin{pmatrix}0&-i\\i&0\end{pmatrix}$, $\hat{\sigma}_z = \begin{pmatrix}1&0\\0&-1\end{pmatrix}$ are all Hermitian.
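The Hermiticity of the Pauli matrices and the reality of their eigenvalues can be confirmed with a few lines of code. In this sketch the helpers `dagger`, `is_hermitian`, and `eigenvalues_2x2` are hypothetical names introduced for illustration, and the matrix `raising` (the operator $|0\rangle\langle 1|$) is included only as a non-Hermitian counterexample.

```python
import cmath

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]
raising = [[0, 1], [0, 0]]  # |0><1|, deliberately not Hermitian

def dagger(m):
    """Conjugate transpose of a nested-list matrix."""
    rows, cols = len(m), len(m[0])
    return [[m[i][j].conjugate() for i in range(rows)] for j in range(cols)]

def is_hermitian(m, tol=1e-12):
    """True if m equals its conjugate transpose entry by entry."""
    d = dagger(m)
    n = len(m)
    return all(abs(m[i][j] - d[i][j]) <= tol for i in range(n) for j in range(n))

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial lambda^2 - tr*lambda + det."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

for name, m in [("x", sigma_x), ("y", sigma_y), ("z", sigma_z)]:
    print(name, is_hermitian(m), eigenvalues_2x2(m))
print("raising", is_hermitian(raising))
```

Each Pauli matrix has trace 0 and determinant $-1$, so the closed-form eigenvalues are $\pm 1$, with zero imaginary part as Hermiticity requires.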
Let $\hat{A}$ be a self-adjoint operator on a separable Hilbert space $\mathcal{H}$. Then there exists a unique projection-valued measure (PVM) $E_A$ on $\mathbb{R}$ such that $\hat{A} = \int_{\mathbb{R}} \lambda\, dE_A(\lambda)$. In the discrete case (e.g., bounded $\hat{A}$ with discrete spectrum), this reduces to $\hat{A} = \sum_n a_n |a_n\rangle\langle a_n|$ where $\{|a_n\rangle\}$ are orthonormal eigenstates and $\{a_n\}$ are the corresponding real eigenvalues. The probability of measuring eigenvalue $a_n$ in state $|\psi\rangle$ is $P(a_n) = |\langle a_n|\psi\rangle|^2$ (Born rule). The post-measurement state is $|a_n\rangle$ (von Neumann projection postulate). For the continuous case, $P(a \in [\lambda_1,\lambda_2]) = \langle\psi|E_A([\lambda_1,\lambda_2])|\psi\rangle$.
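For a concrete discrete case, consider $\hat{\sigma}_x$, whose eigenstates are $(|0\rangle \pm |1\rangle)/\sqrt{2}$ with eigenvalues $\pm 1$. The sketch below rebuilds the operator from its spectral decomposition and evaluates the Born probabilities for an arbitrarily chosen state; the state `psi = [0.6, 0.8]` and all variable names are illustrative assumptions.

```python
import math

s = 1 / math.sqrt(2)
# Spectral data of sigma_x: (eigenvalue, normalized eigenvector).
spectrum = [(+1.0, [s, s]), (-1.0, [s, -s])]

def inner(phi, psi):
    """<phi|psi> for list-valued vectors."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

# Spectral reconstruction A = sum_n a_n |a_n><a_n|.
A = [[sum(a * v[i] * v[j].conjugate() for a, v in spectrum) for j in range(2)]
     for i in range(2)]

psi = [0.6, 0.8]  # illustrative normalized state: 0.36 + 0.64 = 1

# Born rule: P(a_n) = |<a_n|psi>|^2.
probs = {a: abs(inner(v, psi)) ** 2 for a, v in spectrum}

# The expectation value is the probability-weighted average of eigenvalues.
mean = sum(a * p for a, p in probs.items())

print(A)
print(probs, mean)
```

Here $P(+1) = (0.6 + 0.8)^2/2 = 0.98$ and $P(-1) = (0.6 - 0.8)^2/2 = 0.02$, so $\langle\hat{\sigma}_x\rangle = 0.96$, which matches $2 \cdot 0.6 \cdot 0.8$ computed directly.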
The position and momentum operators in one dimension satisfy $[\hat{x}, \hat{p}] = i\hbar\hat{\mathbf{1}}$, where $\hat{x}$ acts as multiplication by $x$ in position representation and $\hat{p} = -i\hbar\partial/\partial x$. This relation is the quantum-mechanical analogue of the Poisson bracket $\{x, p\} = 1$. By the Stone-von Neumann theorem, the Schrödinger representation on $L^2(\mathbb{R})$ is, up to unitary equivalence, the unique irreducible representation of the canonical commutation relations in their exponentiated (Weyl) form. For angular momentum components: $[\hat{L}_x, \hat{L}_y] = i\hbar\hat{L}_z$, and cyclically. The commutator of two observables $\hat{A}$ and $\hat{B}$ is anti-Hermitian, $[\hat{A},\hat{B}]^\dagger = -[\hat{A},\hat{B}]$; equivalently $(i[\hat{A},\hat{B}])^\dagger = i[\hat{A},\hat{B}]$, so $[\hat{A},\hat{B}]/(i\hbar)$ is Hermitian.
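The canonical commutation relation can be verified in the position representation by letting the commutator act on a differentiable test function $\psi(x)$ (the test function is introduced here only for the check):

$$[\hat{x},\hat{p}]\,\psi(x) = x\left(-i\hbar\frac{\partial\psi}{\partial x}\right) - \left(-i\hbar\frac{\partial}{\partial x}\right)\bigl(x\,\psi(x)\bigr) = -i\hbar\,x\,\psi'(x) + i\hbar\,\psi(x) + i\hbar\,x\,\psi'(x) = i\hbar\,\psi(x).$$

The only surviving term comes from the product rule acting on $x\,\psi(x)$, which is why the result is proportional to the identity.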
A qubit is prepared in the state $|\psi\rangle = \cos(\theta/2)|0\rangle + e^{i\phi}\sin(\theta/2)|1\rangle$ on the Bloch sphere. Compute: (a) the normalization, (b) $\langle\hat{\sigma}_z\rangle$, (c) $\langle\hat{\sigma}_x\rangle$, (d) the probability of measuring spin-up along $\hat{z}$.
(a) $\|\psi\|^2 = \cos^2(\theta/2) + \sin^2(\theta/2) = 1$. The state is normalized for all $\theta, \phi$. (b) $\langle\hat{\sigma}_z\rangle = \langle\psi|\hat{\sigma}_z|\psi\rangle$. Since $\hat{\sigma}_z|0\rangle = |0\rangle$ and $\hat{\sigma}_z|1\rangle = -|1\rangle$: $$\langle\hat{\sigma}_z\rangle = \cos^2(\theta/2)(1) + \sin^2(\theta/2)(-1) = \cos^2(\theta/2) - \sin^2(\theta/2) = \cos\theta.$$ (c) $\hat{\sigma}_x|0\rangle = |1\rangle$, $\hat{\sigma}_x|1\rangle = |0\rangle$, so: $$\langle\hat{\sigma}_x\rangle = \cos(\theta/2)\sin(\theta/2)e^{-i\phi} + e^{i\phi}\sin(\theta/2)\cos(\theta/2) = \sin\theta\cos\phi.$$ Similarly $\langle\hat{\sigma}_y\rangle = \sin\theta\sin\phi$. The Bloch vector $(\langle\sigma_x\rangle, \langle\sigma_y\rangle, \langle\sigma_z\rangle) = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$ is a unit vector on the Bloch sphere. (d) $P(+z) = |\langle 0|\psi\rangle|^2 = \cos^2(\theta/2)$.
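The closed-form results above can be cross-checked numerically for one choice of angles. This sketch assumes $\theta = \pi/3$ and $\phi = \pi/4$ purely for illustration, and the helper `expectation` is a hypothetical name.

```python
import cmath
import math

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

def expectation(op, psi):
    """<psi|op|psi> for a 2x2 nested-list operator and a 2-component state."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * op_psi[i] for i in range(2))

theta, phi = math.pi / 3, math.pi / 4  # illustrative angles
psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

# Bloch vector from the three Pauli expectation values.
bloch = [expectation(s, psi).real for s in (sigma_x, sigma_y, sigma_z)]

# Analytic prediction from the worked solution.
expected = [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

length_sq = sum(b * b for b in bloch)
p_up = abs(psi[0]) ** 2  # probability of spin-up along z

print(bloch, expected, length_sq, p_up)
```

For $\theta = \pi/3$ the spin-up probability is $\cos^2(\pi/6) = 3/4$, and the Bloch vector has unit length as expected for a pure state.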
Prove that if $\hat{A}$ is Hermitian and $\hat{U} = e^{i\hat{A}}$, then $\hat{U}$ is unitary. Use this to show that time evolution $\hat{U}(t) = e^{-i\hat{H}t/\hbar}$ preserves normalization.
The operator exponential is defined by $e^{i\hat{A}} = \sum_{n=0}^\infty \frac{(i\hat{A})^n}{n!}$. The adjoint is $(e^{i\hat{A}})^\dagger = \sum_{n=0}^\infty \frac{(-i\hat{A}^\dagger)^n}{n!} = e^{-i\hat{A}^\dagger} = e^{-i\hat{A}}$ (using $\hat{A}^\dagger = \hat{A}$). Since $\hat{A}$ commutes with itself, $e^{i\hat{A}}e^{-i\hat{A}} = e^{i\hat{A} - i\hat{A}} = e^0 = \hat{\mathbf{1}}$. Hence $\hat{U}^\dagger\hat{U} = \hat{U}\hat{U}^\dagger = \hat{\mathbf{1}}$, so $\hat{U}$ is unitary. For time evolution with Hermitian $\hat{H}$: $\frac{d}{dt}\langle\psi(t)|\psi(t)\rangle = \langle\dot{\psi}|\psi\rangle + \langle\psi|\dot{\psi}\rangle = \frac{i}{\hbar}(\langle\psi|\hat{H}^\dagger|\psi\rangle - \langle\psi|\hat{H}|\psi\rangle) = 0$, confirming probability conservation.
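A small numerical experiment illustrates the proof. The sketch below sums a truncated Taylor series for $e^{i\theta\hat{\sigma}_x}$ and checks $\hat{U}^\dagger\hat{U} = \hat{\mathbf{1}}$; the value $\theta = 0.7$, the 30-term cutoff, and the helper names are assumptions made for the example. Because $\hat{\sigma}_x^2 = \hat{\mathbf{1}}$, the series also sums in closed form to $\cos\theta\,\hat{\mathbf{1}} + i\sin\theta\,\hat{\sigma}_x$, which gives an independent check.

```python
import math

def matmul(a, b):
    """Product of two square nested-list matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(m):
    """Conjugate transpose."""
    n = len(m)
    return [[m[i][j].conjugate() for i in range(n)] for j in range(n)]

def expm_series(m, terms=30):
    """Truncated Taylor series: sum over k < terms of m^k / k!."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]  # m^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

theta = 0.7                                      # illustrative angle
i_A = [[0j, 1j * theta], [1j * theta, 0j]]       # i * theta * sigma_x
U = expm_series(i_A)

UdU = matmul(dagger(U), U)                       # should be the identity
closed = [[math.cos(theta), 1j * math.sin(theta)],
          [1j * math.sin(theta), math.cos(theta)]]

print(UdU)
```

A truncated series is adequate only for small matrix norms; it is used here because it mirrors the definition in the proof, not because it is a robust way to exponentiate matrices.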
Given two operators $\hat{A}$ and $\hat{B}$ with $[\hat{A}, \hat{B}] = c\hat{\mathbf{1}}$ (a $c$-number), prove: (a) $[\hat{A}, \hat{B}^n] = nc\hat{B}^{n-1}$, (b) $[\hat{A}, f(\hat{B})] = c\,f'(\hat{B})$ for analytic $f$, (c) the Baker-Campbell-Hausdorff lemma $e^{\hat{A}}\hat{B}e^{-\hat{A}} = \hat{B} + [\hat{A},\hat{B}]$ when $[\hat{A},[\hat{A},\hat{B}]] = 0$.
(a) By induction: $[\hat{A}, \hat{B}^n] = [\hat{A}, \hat{B}]\hat{B}^{n-1} + \hat{B}[\hat{A}, \hat{B}^{n-1}] = c\hat{B}^{n-1} + \hat{B}\cdot(n-1)c\hat{B}^{n-2} = nc\hat{B}^{n-1}$, using the fact that $[\hat{A},\hat{B}]$ commutes with $\hat{B}$ (it is a scalar). (b) Follows from (a) applied term by term to the power series $f(\hat{B}) = \sum_n f_n \hat{B}^n$: $[\hat{A}, f(\hat{B})] = \sum_n f_n [\hat{A}, \hat{B}^n] = \sum_n f_n nc\hat{B}^{n-1} = c f'(\hat{B})$. (c) Define $F(s) = e^{s\hat{A}}\hat{B}e^{-s\hat{A}}$. Then $F'(s) = e^{s\hat{A}}[\hat{A},\hat{B}]e^{-s\hat{A}} = c\cdot e^{s\hat{A}}e^{-s\hat{A}} = c$, since the scalar $c$ commutes with $\hat{A}$ (this is the condition $[\hat{A},[\hat{A},\hat{B}]]=0$). Hence $F(s) = F(0) + cs = \hat{B} + cs$. Setting $s=1$: $e^{\hat{A}}\hat{B}e^{-\hat{A}} = \hat{B} + c = \hat{B} + [\hat{A},\hat{B}]$. Applied to $\hat{A} = ia\hat{x}/\hbar$ and $\hat{B} = \hat{p}$, where $[\hat{A},\hat{B}] = (ia/\hbar)(i\hbar) = -a$: $e^{ia\hat{x}/\hbar}\hat{p}e^{-ia\hat{x}/\hbar} = \hat{p} - a\hat{\mathbf{1}}$, so conjugation by $e^{\pm ia\hat{x}/\hbar}$ translates momentum by $\mp a$.
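As a worked instance of part (b), take $f(\hat{B}) = e^{\lambda\hat{B}}$ with a scalar parameter $\lambda$ (introduced here for illustration). Summing the result of part (a) over the exponential series gives

$$[\hat{A}, e^{\lambda\hat{B}}] = \sum_{n=1}^{\infty}\frac{\lambda^n}{n!}\,nc\,\hat{B}^{n-1} = c\lambda\sum_{n=1}^{\infty}\frac{(\lambda\hat{B})^{n-1}}{(n-1)!} = c\lambda\,e^{\lambda\hat{B}},$$

in agreement with $c\,f'(\hat{B})$. With $\hat{A} = \hat{x}$, $\hat{B} = \hat{p}$, $c = i\hbar$, and $\lambda = -ia/\hbar$, this reads $[\hat{x}, e^{-ia\hat{p}/\hbar}] = a\,e^{-ia\hat{p}/\hbar}$, which is the statement that $e^{-ia\hat{p}/\hbar}$ shifts position by $a$: $e^{ia\hat{p}/\hbar}\,\hat{x}\,e^{-ia\hat{p}/\hbar} = \hat{x} + a\hat{\mathbf{1}}$.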
Construct the tensor product space for two qubits. Write the Bell states, verify their entanglement, and compute the reduced density matrix $\rho_A = \text{Tr}_B(|\Phi^+\rangle\langle\Phi^+|)$.
The two-qubit Hilbert space is $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ with basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ (shorthand for $|0\rangle_A\otimes|0\rangle_B$, etc.). The four Bell states are: $$|\Phi^\pm\rangle = \frac{|00\rangle \pm |11\rangle}{\sqrt{2}}, \quad |\Psi^\pm\rangle = \frac{|01\rangle \pm |10\rangle}{\sqrt{2}}.$$ They are maximally entangled: no product state $|\psi\rangle_A\otimes|\phi\rangle_B$ equals $|\Phi^+\rangle$ (its Schmidt rank is 2, whereas every product state has Schmidt rank 1). The density matrix is $\rho = |\Phi^+\rangle\langle\Phi^+| = \frac{1}{2}(|00\rangle\langle00| + |00\rangle\langle11| + |11\rangle\langle00| + |11\rangle\langle11|)$. The partial trace over $B$ is $\rho_A = \text{Tr}_B(\rho) = \sum_{j\in\{0,1\}} {}_B\langle j|\rho|j\rangle_B = \frac{1}{2}|0\rangle\langle0| + \frac{1}{2}|1\rangle\langle1| = \frac{\hat{\mathbf{1}}}{2}$; the cross terms $|00\rangle\langle11|$ and $|11\rangle\langle00|$ drop out because ${}_B\langle j|0\rangle_B\,{}_B\langle 1|j\rangle_B = 0$ for both values of $j$. This maximally mixed state confirms maximal entanglement: $S(\rho_A) = -\text{Tr}(\rho_A\ln\rho_A) = \ln 2$ nats, i.e. 1 bit.
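The partial trace can also be carried out mechanically on the $4\times4$ density matrix. The sketch below assumes the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with flat index $2a + b$; the variable names are illustrative.

```python
import math

s = 1 / math.sqrt(2)
# Amplitudes of |Phi+> in the basis |00>, |01>, |10>, |11> (index = 2*a + b).
phi_plus = [s, 0.0, 0.0, s]

# Full density matrix: rho[ab][a'b'] = psi[ab] * conj(psi[a'b']).
rho = [[x * y.conjugate() for y in phi_plus] for x in phi_plus]

# Partial trace over B: rho_A[a][a'] = sum over b of rho[2a + b][2a' + b].
rho_a = [[sum(rho[2 * a + b][2 * ap + b] for b in range(2)) for ap in range(2)]
         for a in range(2)]

# rho_A is diagonal here, so its eigenvalues are just the diagonal entries.
eigs = [rho_a[0][0], rho_a[1][1]]
entropy_nats = -sum(p * math.log(p) for p in eigs if p > 0)
entropy_bits = entropy_nats / math.log(2)

# Purity Tr(rho_A^2); 1/2 is the minimum possible for a qubit.
purity = sum(rho_a[i][j] * rho_a[j][i] for i in range(2) for j in range(2))

print(rho_a, entropy_nats, entropy_bits, purity)
```

The off-diagonal entries of `rho_a` vanish and the entropy comes out as $\ln 2$ nats, which is 1 bit.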
Practice Problems
Answer Key
1. A $2\times2$ Hermitian matrix has the form $\begin{pmatrix}a & b+ic \\ b-ic & d\end{pmatrix}$ with $a,b,c,d \in \mathbb{R}$ — four real parameters. Basis: $\{\hat{\mathbf{1}}, \hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z\}$ (identity plus three Pauli matrices). Real linear combinations of these span all $2\times2$ Hermitian matrices.
2. Let $\hat{A}|a\rangle = a|a\rangle$ and $\hat{A}|b\rangle = b|b\rangle$ with $a \neq b$. Then $\langle b|\hat{A}|a\rangle = a\langle b|a\rangle$ and $\langle b|\hat{A}|a\rangle = \langle\hat{A}^\dagger b|a\rangle = b^*\langle b|a\rangle = b\langle b|a\rangle$ (since $b$ is real for Hermitian $\hat{A}$). So $(a-b)\langle b|a\rangle = 0$, and since $a \neq b$, $\langle b|a\rangle = 0$.
3. $\langle\hat{\sigma}_z\rangle = \frac{1}{3} - \frac{2}{3} = -\frac{1}{3}$. $\langle\hat{\sigma}_x\rangle = 2\text{Re}(c_0^*c_1) = 2\cdot\frac{1}{\sqrt{3}}\cdot\sqrt{\frac{2}{3}} = \frac{2\sqrt{2}}{3}$. $\langle\hat{\sigma}_y\rangle = 0$ (both coefficients real). Squared length of the Bloch vector: $\frac{8}{9} + 0 + \frac{1}{9} = 1$, as required for a pure state. ✓
4. $[\hat{x}^2, \hat{p}] = \hat{x}[\hat{x},\hat{p}] + [\hat{x},\hat{p}]\hat{x} = \hat{x}(i\hbar) + (i\hbar)\hat{x} = 2i\hbar\hat{x}$.
5. $\langle\hat{H}\rangle = \sum_{m,n}c_m^*c_n E_n\langle E_m|E_n\rangle = \sum_{m,n}c_m^*c_n E_n\delta_{mn} = \sum_n|c_n|^2 E_n$. Similarly $\langle\hat{H}^2\rangle = \sum_n|c_n|^2E_n^2$ since $\hat{H}^2|E_n\rangle = E_n^2|E_n\rangle$.
6. For any $|\psi\rangle$: $\langle\psi|\hat{A}\hat{A}^\dagger|\psi\rangle = \langle\hat{A}^\dagger\psi|\hat{A}^\dagger\psi\rangle = \|\hat{A}^\dagger|\psi\rangle\|^2 \geq 0$ by positive definiteness of the inner product. Since this holds for all $|\psi\rangle$, all eigenvalues of $\hat{A}\hat{A}^\dagger$ are non-negative.
7. Apply $\int|x\rangle\langle x|dx$ to arbitrary $|\psi\rangle$: $\int|x\rangle\langle x|\psi\rangle dx = \int\psi(x)|x\rangle dx = |\psi\rangle$. For $\langle x|p\rangle$: from $\hat{p}|p\rangle = p|p\rangle$ and $\hat{p} = -i\hbar\partial_x$ in position rep, $-i\hbar\partial_x\langle x|p\rangle = p\langle x|p\rangle$, giving $\langle x|p\rangle = Ce^{ipx/\hbar}$. Normalization $\langle p'|p\rangle = \delta(p-p')$ fixes $C = (2\pi\hbar)^{-1/2}$.
8. Write $\rho = \sum_n p_n|n\rangle\langle n|$ with $p_n \geq 0$, $\sum_n p_n = 1$. Then $\text{Tr}(\rho^2) = \sum_n p_n^2 \leq (\sum_n p_n)^2 = 1$, because the cross terms $p_m p_n$ ($m \neq n$) that appear in the square are non-negative. Equality holds iff all cross terms vanish, i.e. exactly one $p_n$ equals 1 (pure state).
9. $\langle m|\hat{x}|n\rangle = \sqrt{\hbar/(2m\omega)}\langle m|(\hat{a}+\hat{a}^\dagger)|n\rangle = \sqrt{\hbar/(2m\omega)}(\sqrt{n}\delta_{m,n-1}+\sqrt{n+1}\delta_{m,n+1})$. Non-zero only for $m = n \pm 1$.
10. $\hat{H} = \begin{pmatrix}\epsilon & \Delta \\ \Delta & -\epsilon\end{pmatrix}$. Eigenvalues: $E_\pm = \pm\sqrt{\epsilon^2+\Delta^2}$. Eigenstates: $|E_+\rangle = \cos(\alpha/2)|0\rangle + \sin(\alpha/2)|1\rangle$, $|E_-\rangle = -\sin(\alpha/2)|0\rangle + \cos(\alpha/2)|1\rangle$ where $\tan\alpha = \Delta/\epsilon$.
11. Expand each double commutator: $[\hat{A},[\hat{B},\hat{C}]] = \hat{A}\hat{B}\hat{C} - \hat{A}\hat{C}\hat{B} - \hat{B}\hat{C}\hat{A} + \hat{C}\hat{B}\hat{A}$. Summing the three cyclic permutations gives 12 terms in total; each of the 6 orderings of $\hat{A}, \hat{B}, \hat{C}$ appears exactly twice with opposite signs, so the sum is zero.
12. Eigenvalues of $\rho$ are $p$ and $1-p$. $S = -p\ln p - (1-p)\ln(1-p)$ (binary entropy). Maximum at $p = 1/2$: $S_{\max} = \ln 2 \approx 0.693$ nats $= 1$ bit.
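Answers 8 and 12 can be checked numerically for a diagonal qubit density matrix with eigenvalues $p$ and $1-p$. The function names and the grid of test values in this sketch are illustrative assumptions.

```python
import math

def purity(p):
    """Tr(rho^2) for rho = diag(p, 1 - p)."""
    return p * p + (1 - p) * (1 - p)

def entropy_nats(p):
    """Von Neumann entropy -p ln p - (1-p) ln(1-p), with 0 ln 0 taken as 0."""
    return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

grid = [k / 100 for k in range(101)]  # p = 0.00, 0.01, ..., 1.00

max_purity = max(purity(p) for p in grid)                # reached only at p = 0 or 1
best_p = max(grid, key=entropy_nats)                     # location of the entropy maximum
s_max_nats = entropy_nats(best_p)
s_max_bits = s_max_nats / math.log(2)

print(max_purity, best_p, s_max_nats, s_max_bits)
```

On this grid the purity never exceeds 1 and the entropy peaks at $p = 1/2$ with value $\ln 2$ nats, or 1 bit, consistent with the analytic answers.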