NP-Completeness & Complexity Theory
NP-Completeness & Complexity Theory
Complexity theory classifies problems by their inherent computational difficulty. The P vs NP question — whether every efficiently verifiable problem is efficiently solvable — is the deepest open problem in computer science. NP-complete problems are the hardest in NP; a polynomial-time algorithm for any of them would imply P=NP. Reductions show that NP-hard problems are at least as hard as any problem in NP.
P, NP & Polynomial Reductions
P: decision problems solvable in polynomial time. NP: problems where a 'yes' certificate is verifiable in polynomial time. NP-complete (NPC): $L$ is NP-hard (every NP problem reduces to $L$ in poly time) and $L\in NP$. Cook-Levin theorem: SAT (Boolean satisfiability) is NP-complete. Reduction $A\leq_p B$: $A$ poly-time reduces to $B$ — solving $B$ solves $A$. If $B\in P$ and $A\leq_p B$, then $A\in P$.
NP-Completeness via Reductions
NPC problems (a sample): SAT, 3-SAT, Independent Set, Vertex Cover, Clique, Hamiltonian Cycle, Traveling Salesman, 3-Coloring, Subset Sum. Chain: Cook-Levin (SAT is NPC) $\Rightarrow$ 3-SAT (subcase) $\Rightarrow$ Independent Set $\Rightarrow$ Clique (complement), Vertex Cover. All NPC problems are equivalent under poly-time reductions: solve one efficiently, solve all. No polynomial algorithm is known for any NPC problem.
Example 1
Show Independent Set reduces to Clique.
Solution: $L_{IS}$: does $G$ have an independent set of size $k$? Map $(G,k)\to(\bar{G},k)$ where $\bar{G}$ is the complement graph. $S$ is an independent set in $G$ iff $S$ is a clique in $\bar{G}$ (no edge between $S$ in $G$ = all edges between $S$ in $\bar{G}$). Reduction takes $O(n^2)$ time. So $L_{IS}\leq_p L_{\text{Clique}}$ and vice versa — they are equivalent.
Example 2
Why doesn't finding a solution in exponential time prove P=NP?
Solution: P vs NP is about worst-case polynomial vs exponential time. An $O(2^n)$ algorithm is fine for proving a problem is decidable, but P requires polynomial-time algorithms. The question is whether any NPC problem has a poly-time algorithm. No one has proved P≠NP (separation), nor found a poly algorithm for any NPC problem — it remains open since 1971.
Practice
- Prove 3-SAT is NP-complete by reducing SAT to 3-SAT.
- Show that if NP=coNP then the polynomial hierarchy collapses.
- Describe approximation algorithms: give a 2-approximation for Vertex Cover.
- Explain why randomized algorithms (BPP) are believed to equal P in power, and describe the role of derandomization.
Show Answer Key
1. SAT → 3-SAT: for each clause with $k>3$ literals $l_1\vee\ldots\vee l_k$, introduce auxiliary variables $y_1,\ldots,y_{k-3}$ and replace with $(l_1\vee l_2\vee y_1)\wedge(\bar{y}_1\vee l_3\vee y_2)\wedge\ldots\wedge(\bar{y}_{k-3}\vee l_{k-1}\vee l_k)$. Each clause has exactly 3 literals. The transformation is polynomial and preserves satisfiability. Since SAT is NP-complete (Cook-Levin), 3-SAT is too.
2. If NP = coNP: then coNP ⊆ NP, so $\Sigma_1^P = \Pi_1^P$. By induction, $\Sigma_k^P = \Pi_k^P$ for all $k$, so PH $= \Sigma_1^P = $ NP. The polynomial hierarchy collapses to the first level. This is considered unlikely, as it would mean every problem whose solution can be verified in polynomial time also has short proofs of non-membership.
3. Vertex Cover 2-approx: for each uncovered edge $(u,v)$, add both $u$ and $v$ to the cover, remove all edges incident to them. The algorithm selects at most $2|M|$ vertices for a maximal matching $M$. Any vertex cover must include at least one endpoint of each matching edge, so OPT $\geq |M|$. Thus ALG $\leq 2\cdot$OPT.
4. BPP: problems solvable with bounded error by a randomized poly-time algorithm. Believed BPP = P because: (1) derandomization via pseudorandom generators (if one-way functions exist, BPP = P), (2) no natural problem known in BPP\P, (3) practical algorithms initially in BPP (e.g., primality testing) have been derandomized (AKS). Impagliazzo-Wigderson: if $E$ requires exponential circuits, then BPP = P.