Training Cryptography Hash Functions & Digital Signatures
6 / 7

Hash Functions & Digital Signatures

30 min Cryptography

Hash Functions & Digital Signatures

Cryptographic hash functions map arbitrary-length input to a fixed-length digest with collision resistance, preimage resistance, and second-preimage resistance. They underpin digital signatures, password hashing, and blockchain.

Definition

Hash security properties: preimage resistance (given \(h\), find \(m\) with \(H(m)=h\) — hard in \(O(2^n)\)), second-preimage resistance (given \(m\), find \(m'\ne m\) with same hash — \(O(2^n)\)), collision resistance (find any \(m\ne m'\) with \(H(m)=H(m')\) — birthday bound \(O(2^{n/2})\)).

Key Result

SHA-256 produces 256-bit digests using a Merkle-Damgård construction with a Davies-Meyer compression function. The birthday bound means collisions are expected after \(2^{128}\) hashes — infeasible with current technology.

Example 1

RSA-PSS signature: hash the message \(m\) to get \(H(m)\), apply probabilistic padding to get \(M\), compute \(\sigma = M^d\bmod n\). Verification: \(M' = \sigma^e\bmod n\), verify padding and hash. PSS is provably secure in the random oracle model.

Example 2

Length extension attack: Merkle-Damgård hashes (MD5, SHA-1, SHA-256) are vulnerable. An attacker knowing \(H(k\|m)\) can compute \(H(k\|m\|pad\|m')\) without knowing \(k\). Mitigation: HMAC (nested construction) or SHA-3 (sponge construction).

Loading hash-collision-viz...

Practice

  1. What is the difference between a MAC and a digital signature?
  2. Explain why MD5 and SHA-1 are deprecated for digital signatures.
  3. How does a Merkle tree enable efficient proof of inclusion?
  4. Describe the structure of HMAC and why it is secure against length extension.
Show Answer Key

1. MAC (Message Authentication Code): symmetric key — both parties share the same secret key. Provides authentication and integrity, but not non-repudiation (either party could have generated the MAC). Digital signature: asymmetric — signer uses private key, verifier uses public key. Provides authentication, integrity, and non-repudiation (only the private key holder could sign). Signatures are more expensive computationally.

2. MD5 (2004) and SHA-1 (2017): practical collision attacks found. MD5: collisions in seconds. SHA-1: first collision by Google/CWI in 2017 ($2^{63}$ computations). For digital signatures, collision resistance is essential (attacker could create two documents with the same hash, get one signed, substitute the other). Current standards require SHA-256 or SHA-3.

3. Merkle tree: hash each data block (leaves), then recursively hash pairs of hashes up to a root. Proof of inclusion for leaf $x$: provide the sibling hashes along the path from $x$ to the root ($O(\log n)$ hashes). Verifier recomputes the path and checks against the root. Used in: Git, Bitcoin (transaction verification), certificate transparency logs.

4. HMAC: $\text{HMAC}(k,m) = H((k\oplus\text{opad})\|H((k\oplus\text{ipad})\|m))$ where opad/ipad are fixed constants. The nested structure prevents length-extension attacks (which affect plain $H(k\|m)$: an attacker can compute $H(k\|m\|m')$ without knowing $k$, because Merkle-Damgård hashes process blocks sequentially). HMAC's outer hash re-hashes with the key, making length extension impossible. Security proof: HMAC is a PRF if the compression function is a PRF.