Randomized Algorithms
Randomized algorithms use random bits to achieve better expected performance or simpler analysis than deterministic algorithms. Las Vegas algorithms always give correct output but with random running time; Monte Carlo algorithms may give wrong output with small probability. The probabilistic method, Markov's inequality, and Chernoff bounds are the main analytical tools.
Randomized Algorithm Analysis
Markov's inequality: $P(X\geq a)\leq E[X]/a$ for non-negative $X$. Chebyshev: $P(|X-\mu|\geq k\sigma)\leq 1/k^2$. Chernoff bounds: for a sum $X$ of i.i.d. Bernoulli$(p)$ variables with $\mu=np$: $P(X\geq(1+\delta)\mu)\leq e^{-\mu\delta^2/3}$ for $0<\delta\leq 1$, and similarly for the lower tail. These exponentially small tails mean that $O(\log(1/\epsilon))$ independent trials suffice to drive the failure probability below any target $\epsilon$.
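The Chernoff upper tail can be checked empirically. The sketch below (function names are illustrative, not from the text) compares the observed frequency of large deviations of a Binomial$(n,p)$ sum against the bound $e^{-\mu\delta^2/3}$:

```python
import math
import random

def upper_tail_freq(n, p, delta, trials=2000, seed=0):
    """Empirical frequency of X >= (1+delta)*np over repeated Binomial(n, p) draws."""
    rng = random.Random(seed)
    mu = n * p
    hits = 0
    for _ in range(trials):
        x = sum(rng.random() < p for _ in range(n))
        if x >= (1 + delta) * mu:
            hits += 1
    return hits / trials

n, p, delta = 200, 0.5, 0.3
empirical = upper_tail_freq(n, p, delta)
chernoff = math.exp(-n * p * delta**2 / 3)  # e^{-mu delta^2 / 3}
```

For these parameters the Chernoff bound is about $e^{-3}\approx 0.05$, while the true tail probability (roughly a $4\sigma$ event) is far smaller, so the empirical frequency sits well below the bound.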
Randomized Quicksort
Randomized quicksort: choose the pivot uniformly at random. Expected running time: $E[T(n)]=O(n\log n)$ regardless of input. Proof: define $X_{ij}=1$ if the $i$-th and $j$-th smallest elements are ever compared. They are compared iff one of them is the first element of $\{i,i+1,\dots,j\}$ chosen as a pivot, so $P(X_{ij}=1)=\frac{2}{j-i+1}$ and $E[\text{comparisons}]=\sum_{i<j}\frac{2}{j-i+1}\leq 2nH_n=O(n\log n)$.
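A minimal sketch of randomized quicksort (this out-of-place version is chosen for clarity; an in-place partition is the usual implementation):

```python
import random

def rand_quicksort(a, rng=None):
    """Randomized quicksort: a uniformly random pivot gives expected O(n log n) comparisons."""
    rng = rng or random.Random()
    if len(a) <= 1:
        return list(a)
    pivot = rng.choice(a)                      # pivot chosen uniformly at random
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return rand_quicksort(less, rng) + equal + rand_quicksort(greater, rng)
```

Because the pivot is random, no fixed input can force the $\Theta(n^2)$ worst case; the $O(n\log n)$ bound holds in expectation for every input.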
Example 1
Analyze the randomized min-cut algorithm (Karger's algorithm).
Solution: Karger's: repeatedly contract a uniformly random edge until 2 vertices remain; the two super-nodes define a cut. Fix a min-cut of size $k$. When $i$ vertices remain, every vertex has degree $\geq k$, so the graph has $\geq ik/2$ edges and the probability of contracting a min-cut edge is $\leq 2/i$. One run succeeds with probability $\geq \prod_{i=3}^{n}\left(1-\frac{2}{i}\right)=\binom{n}{2}^{-1}$. Each run takes $O(n^2)$ time; after $\Theta(n^2\log n)$ independent runs the failure probability is $\leq 1/n$, for $O(n^4\log n)$ total time. Karger-Stein improves this to $O(n^2\log^3 n)$.
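A sketch of one contraction run using a union-find structure to track merged super-nodes (function names and the edge-list graph representation are illustrative; rejecting self-loops is equivalent to sampling uniformly among the edges of the contracted multigraph):

```python
import random

def karger_cut(edges, n, rng):
    """One run of Karger's contraction; vertices are 0..n-1. Returns the cut size found."""
    parent = list(range(n))
    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    vertices = n
    while vertices > 2:
        u, v = rng.choice(edges)       # uniform over original edges
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                   # self-loop in the contracted graph; resample
        parent[ru] = rv                # contract the edge
        vertices -= 1
    return sum(1 for u, v in edges if find(u) != find(v))

def min_cut(edges, n, runs, seed=0):
    """Best cut over independent runs; failure probability decays exponentially in runs/C(n,2)."""
    rng = random.Random(seed)
    return min(karger_cut(edges, n, rng) for _ in range(runs))
```

On two triangles joined by a single bridge edge, repeated runs find the min-cut of size 1 with overwhelming probability.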
Example 2
Use the probabilistic method to show a graph with $m$ edges has a cut of size $\geq m/2$.
Solution: Assign each vertex to side $A$ or $B$ uniformly at random. Each edge $(u,v)$ is a cut-edge with probability $1/2$ (since $u$ and $v$ are assigned independently). By linearity of expectation, $E[|\text{cut}|]=m/2$. Since the expectation is $m/2$, some assignment achieves a cut of size $\geq m/2$.
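The existence argument converts directly into a Las Vegas procedure: resample random assignments until the cut reaches $m/2$, which the expectation guarantees will happen. A sketch (function names are illustrative):

```python
import random

def random_cut(n, edges, rng):
    """Assign each vertex to side A/B uniformly at random; return the number of cut edges."""
    side = [rng.random() < 0.5 for _ in range(n)]
    return sum(1 for u, v in edges if side[u] != side[v])

def large_cut(n, edges, seed=0):
    """Las Vegas: resample until the cut has >= m/2 edges (guaranteed to exist)."""
    rng = random.Random(seed)
    m = len(edges)
    while True:
        c = random_cut(n, edges, rng)
        if 2 * c >= m:
            return c
```

Since $E[|\text{cut}|]=m/2$ on every trial, the expected number of resamples is constant, so this is a Las Vegas algorithm with expected polynomial time.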
Practice
- Explain the Las Vegas/Monte Carlo distinction and show that any Monte Carlo algorithm's error probability can be amplified down by repetition.
- Analyze the birthday paradox: how many people needed before the probability of a shared birthday exceeds 1/2?
- Describe the Miller-Rabin primality test and analyze its error probability.
- Apply Chernoff bounds to prove that $O(\log n/\epsilon^2)$ coin flips suffice to estimate $p$ within $\epsilon$ with high probability.
Answer Key
1. Las Vegas: always correct, randomized running time (e.g., randomized quicksort — always sorts correctly, expected $O(n\log n)$). Monte Carlo: bounded running time, may err with bounded probability (e.g., Miller-Rabin — always finishes, may say 'prime' for a composite). Amplification: repeat a one-sided Monte Carlo algorithm with error probability $\delta$ a total of $k$ times (accept iff any run accepts); the error drops to $\delta^k$. For two-sided error bounded below $1/2$, take the majority vote of $k$ runs: $P(\text{error}) \leq e^{-\Omega(k)}$ by a Chernoff bound.
2. Birthday paradox: $P(\text{collision among } k$ people$) = 1 - \prod_{i=1}^{k-1}(1-i/365) \approx 1-e^{-k(k-1)/(2\cdot365)}$. Set $> 1/2$: $k(k-1)/2 > 365\ln 2 \approx 253$, giving $k \geq 23$. General: for $n$ possible values, collision expected at $k \approx \sqrt{2n\ln 2} = O(\sqrt{n})$.
3. Miller-Rabin: given $n$, write $n-1 = 2^s d$ ($d$ odd). Pick random $a$. Compute $a^d\bmod n$. If result is 1 or $n-1$, 'probably prime.' Otherwise, square up to $s-1$ times; if any result is $n-1$, probably prime. If none, 'composite.' For composite $n$, at most $1/4$ of bases $a$ are 'liars' (fail to detect). After $k$ independent rounds: error $\leq (1/4)^k$.
4. Chernoff bound: for $X = \sum X_i$ (independent Bernoullis with $E[X]=\mu$): $P(|X-\mu|\geq\epsilon\mu) \leq 2e^{-\mu\epsilon^2/3}$. To estimate $p$ within $\epsilon$ with confidence $1-\delta$: need $P(|\hat{p}-p|\geq\epsilon) \leq \delta$. With $n$ flips, $\mu = np$: $2e^{-np\epsilon^2/3} \leq \delta$ → $n \geq \frac{3\ln(2/\delta)}{p\epsilon^2}$. Since $p\geq \epsilon$ (otherwise trivial), $n = O(\ln(1/\delta)/\epsilon^2) = O(\log n / \epsilon^2)$ for $\delta = 1/n$.
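Answer 1's majority-vote amplification can be simulated. The sketch below models a hypothetical two-sided Monte Carlo oracle that errs with probability $1/3$ (names are illustrative):

```python
import random

def noisy_answer(truth, err, rng):
    """Hypothetical two-sided Monte Carlo oracle: wrong with probability err < 1/2."""
    return truth if rng.random() >= err else (not truth)

def amplify(truth, err, k, seed=0):
    """Majority vote over k independent runs; error drops to e^{-Omega(k)} by Chernoff."""
    rng = random.Random(seed)
    votes = sum(noisy_answer(truth, err, rng) for _ in range(k))
    return votes * 2 > k
```

With error $1/3$ per run, the vote total concentrates around $2k/3$ (or $k/3$), so for moderate $k$ the majority is wrong only with exponentially small probability.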
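Answer 2's threshold can be computed exactly from the product formula rather than the exponential approximation:

```python
def birthday_threshold(n=365):
    """Smallest k such that the collision probability among k people exceeds 1/2,
    with n equally likely birthdays."""
    p_no_collision = 1.0
    k = 1
    while True:
        k += 1
        p_no_collision *= (n - (k - 1)) / n   # extend the product (1 - i/n)
        if 1 - p_no_collision > 0.5:
            return k
```

For $n=365$ this yields $k=23$, matching the approximation $k\approx\sqrt{2n\ln 2}$.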
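Answer 3's test translates directly into code; Python's three-argument `pow` performs the modular exponentiation. This sketch adds trial division by a few small primes, a common practical shortcut not part of the core test:

```python
import random

def miller_rabin(n, rounds=20, seed=0):
    """Miller-Rabin: False means definitely composite; True means probably prime
    (error <= 4^-rounds for composite n)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):                 # small-prime shortcut (assumption, not core test)
        if n % p == 0:
            return n == p
    rng = random.Random(seed)
    d, s = n - 1, 0
    while d % 2 == 0:                      # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                       # this base does not witness compositeness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                   # a witnesses that n is composite
    return True
```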
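Answer 4's estimator can be sketched as follows. To avoid the $p\geq\epsilon$ caveat of the multiplicative bound, this version uses the additive Hoeffding form $P(|\hat p-p|\geq\epsilon)\leq 2e^{-2n\epsilon^2}$, giving $n\geq\ln(2/\delta)/(2\epsilon^2)$ (a variant of the answer's bound, not the same constants):

```python
import math
import random

def sample_size(eps, delta):
    """Flips sufficient (by the additive Hoeffding bound) to estimate p within eps
    with probability >= 1 - delta, with no lower-bound assumption on p."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

def estimate_p(p, eps, delta, seed=0):
    """Estimate the bias p of a coin from sample_size(eps, delta) simulated flips."""
    rng = random.Random(seed)
    n = sample_size(eps, delta)
    return sum(rng.random() < p for _ in range(n)) / n
```

For $\delta=1/n$ this is the $O(\log n/\epsilon^2)$ flips claimed in the exercise.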