Histograms & Distributions
A histogram divides the range of the data into equal-width bins. With equal-width bins, the height of each bar (equivalently, its area) represents the frequency or relative frequency of the values in that bin.
- Sturges' rule: $k = \lceil 1 + \log_2 n \rceil$
- Square-root rule: $k = \lceil \sqrt{n} \rceil$
$k$ = number of bins, $n$ = number of data points.
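Both rules are simple to evaluate numerically. The following is a minimal Python sketch; the function names `sturges_bins` and `sqrt_bins` are illustrative choices, not part of any library.

```python
import math

def sturges_bins(n: int) -> int:
    """Sturges' rule: k = ceil(1 + log2(n))."""
    return math.ceil(1 + math.log2(n))

def sqrt_bins(n: int) -> int:
    """Square-root rule: k = ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))

# Compare the two rules for a few sample sizes.
for n in (50, 64, 100):
    print(n, sturges_bins(n), sqrt_bins(n))
# 50 7 8
# 64 7 8
# 100 8 10
```

The two rules can disagree, especially for larger $n$, so either result is best treated as a starting point rather than a fixed answer.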
- Symmetric: mean ≈ median, mirror image about center
- Right-skewed: long tail to the right, mean > median
- Left-skewed: long tail to the left, mean < median
- Bimodal: two peaks
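The mean-versus-median comparisons above can be tried on small samples. This is a rough sketch only: the function name `describe_skew` and the tolerance `tol` are assumptions made for illustration, and comparing mean and median is a rule of thumb, not a formal test of skewness.

```python
from statistics import mean, median

def describe_skew(data, tol=0.05):
    """Classify shape by comparing mean and median.

    tol is an arbitrary (assumed) fraction of the data range below which
    the mean and median are treated as approximately equal.
    """
    m, med = mean(data), median(data)
    spread = max(data) - min(data)
    if spread == 0 or abs(m - med) <= tol * spread:
        return "roughly symmetric"
    return "right-skewed" if m > med else "left-skewed"

print(describe_skew([1, 2, 3, 4, 5]))               # roughly symmetric
print(describe_skew([1, 2, 2, 3, 3, 3, 4, 20]))     # right-skewed
print(describe_skew([-20, -4, -3, -3, -3, -2, -2, -1]))  # left-skewed
```

Note that this heuristic cannot detect a bimodal shape; that still requires looking at the histogram itself.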
A density curve is a smooth curve for which the total area under the curve equals 1. The area under the curve over an interval gives the proportion of the data that falls in that interval.
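As one illustration, the standard normal (bell-shaped) curve has these two properties. The sketch below approximates areas with the trapezoid rule; the choice of the normal curve, the helper names `normal_pdf` and `area`, and the integration range $[-8, 8]$ are all assumptions made for this example.

```python
import math

def normal_pdf(x: float) -> float:
    """Standard normal density, used here only as an example curve."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def area(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the area under f on [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Total area: the tails beyond +/-8 are negligibly small.
print(round(area(normal_pdf, -8, 8), 4))   # 1.0
# Proportion of values between -1 and 1.
print(round(area(normal_pdf, -1, 1), 4))   # 0.6827
```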
Data: 12, 15, 17, 19, 21, 22, 25, 28, 30, 35. Use 5 bins of width 5 starting at 10.
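Bins: [10,15): 1, [15,20): 3, [20,25): 2, [25,30): 2, [30,35]: 2. The value 15 falls in [15,20) because each bin includes its left endpoint; the last bin is closed on the right so that 35 is counted. The distribution is roughly uniform, with no pronounced peak or skew.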
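The tally can be checked mechanically. This is a minimal sketch, assuming half-open bins [a, b) with the last bin closed on the right; the function name `bin_counts` and its parameters are illustrative.

```python
def bin_counts(data, start, width, k):
    """Count values in k equal-width bins beginning at start.

    Each bin is [left, right), except the last, which also includes
    its right endpoint.
    """
    end = start + width * k
    counts = [0] * k
    for x in data:
        if x < start or x > end:
            raise ValueError(f"{x} lies outside [{start}, {end}]")
        i = int((x - start) // width)
        i = min(i, k - 1)  # put the right endpoint into the last bin
        counts[i] += 1
    return counts

data = [12, 15, 17, 19, 21, 22, 25, 28, 30, 35]
print(bin_counts(data, start=10, width=5, k=5))  # [1, 3, 2, 2, 2]
```

Values that sit exactly on a bin edge, such as 15, 25, and 30, are the ones most easily miscounted by hand.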
A histogram has most bars on the left with a long tail to the right. Describe the skew.
Right-skewed (positively skewed). Mean > median.
$n = 100$ data points. How many bins by Sturges' rule?
$k = \lceil 1 + \log_2 100 \rceil = \lceil 1 + 6.64 \rceil = 8$.
Practice Problems
Answer Key
1. The vertical axis (y-axis)
2. $\lceil 1+\log_2 50 \rceil = \lceil 6.64 \rceil = 7$
3. $\lceil \sqrt{64} \rceil = 8$
4. Bimodal
5. Right-skewed (few very high earners)
6. 1 (100%)
7. Detail is lost; distribution looks flat
8. Too noisy; bars are ragged
9. No, only if a bin truly has zero frequency (empty bin)
10. Proportion (relative frequency) in that bin
11. Preserves individual data values
12. Symmetric, bell-shaped