Python Scientific Stack — NumPy, Matplotlib & Pandas

The Python Scientific Ecosystem

Python's open-source libraries form a powerful toolkit for scientific computing and visualization:

NumPy — N-dimensional arrays and vectorized math; the foundation most of the scientific Python stack builds on.
Matplotlib — comprehensive 2-D and 3-D plotting library (its pyplot interface is modeled after MATLAB's plotting commands).
Pandas — tabular data manipulation with DataFrame and Series objects.
SciPy — integration, optimization, interpolation, signal processing, linear algebra.
Seaborn — statistical visualization built on Matplotlib with intelligent defaults.
Plotly — interactive, web-based charts.

NumPy — Key Concepts

ndarray — the N-dimensional array. Created with np.array([1,2,3]).
Shape — a.shape returns dimensions, e.g. (3,) for a vector, (2,3) for a 2×3 matrix.
arange — np.arange(start, stop, step) — evenly spaced values (excludes stop).
linspace — np.linspace(a, b, n) — $n$ points from $a$ to $b$ inclusive.
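Broadcasting — when shapes differ but are compatible, NumPy stretches size-1 (or missing leading) dimensions so the arrays combine element-wise, e.g. a (2,3) matrix plus a (3,) vector.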
Vectorized ops — np.sin(x), x**2, x * y all operate element-wise without loops.
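The concepts above can be tried in a few lines. This is a minimal sketch; the array values and the names a, steps, points, row, shifted and squares are chosen here for illustration only.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # 2x3 matrix
print(a.shape)                          # shape tuple: 2 rows, 3 columns

steps = np.arange(0, 1, 0.25)           # step-based: 0, 0.25, 0.5, 0.75 (stop excluded)
points = np.linspace(0, 1, 5)           # count-based: 0, 0.25, 0.5, 0.75, 1 (stop included)
print(steps)
print(points)

row = np.array([10, 20, 30])            # shape (3,)
shifted = a + row                       # broadcasting: row is added to each row of a
print(shifted)

squares = a ** 2                        # vectorized: every element squared, no loop
print(squares)
```

Note that arange and linspace produce the same first four values here; only linspace includes the endpoint 1.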

Matplotlib — Key Functions

Function	Purpose
`plt.plot(x, y)`	Line plot
`plt.scatter(x, y)`	Scatter plot
`plt.bar(x, heights)`	Bar chart
`plt.hist(data, bins)`	Histogram
`plt.xlabel()`, `plt.ylabel()`, `plt.title()`	Axis labels and title
`plt.subplot(r, c, i)`	Subplots in a grid
`plt.savefig('file.png', dpi=300)`	Save to file
`plt.show()`	Display figure
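Several of the functions in the table can be combined in one short script. The sketch below is illustrative only: the sample data, the bar heights, and the output file name demo_plots.png are made up, and it selects the non-interactive Agg backend on the assumption that no display is available, so it saves the figure instead of calling plt.show().

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless run, so no window is opened
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=70, scale=10, size=200)  # made-up sample values

plt.figure(figsize=(8, 3))

plt.subplot(1, 2, 1)                    # left panel of a 1x2 grid
plt.bar(["A", "B", "C"], [3, 7, 5])
plt.title("Bar chart")

plt.subplot(1, 2, 2)                    # right panel
counts, edges, patches = plt.hist(data, bins=10)
plt.xlabel("value")
plt.ylabel("count")
plt.title("Histogram")

plt.tight_layout()
out_path = os.path.join(tempfile.gettempdir(), "demo_plots.png")
plt.savefig(out_path, dpi=300)          # write the figure to a PNG file
plt.close()
```

In an interactive session, dropping the matplotlib.use("Agg") line and ending with plt.show() would display the same figure in a window.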

Pandas — Key Concepts

Series — a labeled 1-D array. s = pd.Series([10, 20, 30], index=['a','b','c'])
DataFrame — a labeled 2-D table. df = pd.DataFrame({'x': [1,2,3], 'y': [4,5,6]})
Read data — pd.read_csv('file.csv')
Descriptive stats — df.describe() returns count, mean, std, min, quartiles, max.
Grouping — df.groupby('category')['value'].mean()
Plotting shortcut — df.plot(x='col1', y='col2', kind='scatter')
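As a quick illustration of the Series, DataFrame and grouping entries above, here is a small sketch using in-memory data. The column names category and value match the groupby example in the list; the values themselves are invented.

```python
import pandas as pd

# Series: values addressed by label rather than position
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s["b"])

# DataFrame: a table with named columns (hypothetical data)
df = pd.DataFrame({
    "category": ["x", "y", "x", "y"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Grouping: mean of 'value' within each category
means = df.groupby("category")["value"].mean()
print(means)
```

The result of the groupby is itself a Series indexed by category, so means["x"] looks up the mean for group x.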

Example 1 — Plot a Function

Plot $y = x^2 e^{-x}$ for $0 \le x \le 8$ with labeled axes.

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 8, 300)   # 300 evenly spaced points on [0, 8]
y = x**2 * np.exp(-x)        # evaluated element-wise
plt.plot(x, y, 'r-', linewidth=2)
plt.xlabel('x')
plt.ylabel('y')
plt.title(r'$y = x^2 e^{-x}$')
plt.grid(True)
plt.show()
Variables: x — 300-point array $[0, 8]$; y — element-wise $x^2 e^{-x}$; 'r-' — red solid line.
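One way to sanity-check the plotted curve is to compare its peak with calculus. Since $dy/dx = x e^{-x}(2 - x)$, the maximum on $(0, 8]$ is at $x = 2$, where $y = 4e^{-2} \approx 0.541$. The sketch below locates the largest sampled value with np.argmax; the names i, x_peak and y_peak are introduced here for illustration.

```python
import numpy as np

x = np.linspace(0, 8, 300)
y = x**2 * np.exp(-x)

i = np.argmax(y)                 # index of the largest sampled y
x_peak, y_peak = x[i], y[i]

print(x_peak, y_peak)            # sampled peak, close to x = 2
print(4 * np.exp(-2))            # analytic peak value
```

The sampled peak can differ from $x = 2$ by up to one grid step (8/299), because 2 is not exactly one of the 300 sample points.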

Example 2 — Pandas Descriptive Stats

Load a CSV of exam scores and compute summary statistics.

import pandas as pd
df = pd.read_csv('scores.csv')
print(df.describe())
Output: count, mean ($\bar{x}$), std ($s$), min, 25%, 50% (median), 75%, max for each numeric column.
Variables: df — DataFrame; describe() — computes $n, \bar{x}, s$, min, $Q_1, \tilde{x}, Q_3$, max.
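Because scores.csv is not included with this lesson, the following sketch reads a small made-up CSV from a string instead. The student and score columns and their values are hypothetical; pd.read_csv accepts a file-like object such as io.StringIO in place of a file name.

```python
import io
import pandas as pd

# Hypothetical exam data standing in for scores.csv
csv_text = "student,score\nA,70\nB,80\nC,90\nD,100\n"

df = pd.read_csv(io.StringIO(csv_text))
stats = df.describe()            # only the numeric 'score' column is summarized

print(stats)
print(stats.loc["mean", "score"])   # single statistic, looked up by row label
```

The text column student is left out of the summary, and individual statistics can be read from the result by row label, such as "mean" or "50%".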

Example 3 — Subplots

Show $\sin(x)$ and $\cos(x)$ side by side.

import numpy as np
import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))   # 1 row, 2 columns
x = np.linspace(0, 2*np.pi, 200)
ax1.plot(x, np.sin(x), 'b')
ax1.set_title('sin(x)')
ax2.plot(x, np.cos(x), 'r')
ax2.set_title('cos(x)')
plt.tight_layout()
plt.show()
Variables: fig — figure container; ax1, ax2 — subplot axes; figsize — width, height in inches.

Practice Problems

1. What does np.linspace(0, 1, 50) return?

2. How is np.arange different from np.linspace?

3. What does vectorized operation mean?

4. How do you create a 2×3 matrix in NumPy?

5. What command saves a Matplotlib figure to a PNG file?

6. What is a Pandas DataFrame?

7. How do you read a CSV file in Pandas?

8. What does df.describe() return?

9. How do you create two subplots side by side?

10. What library provides geom_point-style plotting in Python?

11. What is broadcasting in NumPy?

12. Name two Python libraries for interactive charts.

Answer Key

1. A 1-D array of 50 evenly spaced values from 0 to 1 (inclusive)

2. arange uses a step size and excludes the endpoint; linspace uses a count of points and includes the endpoint

3. Operations applied element-wise to entire arrays without explicit Python loops

4. np.array([[1,2,3],[4,5,6]]) or np.zeros((2,3))

5. plt.savefig('file.png', dpi=300)

6. A 2-D labeled tabular data structure with named columns and an index

7. pd.read_csv('filename.csv')

8. Count, mean, std, min, 25th/50th/75th percentiles, max for each numeric column

9. fig, (ax1, ax2) = plt.subplots(1, 2)

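10. plotnine, a Python port of ggplot2, provides geom_point directly; among the libraries covered here, Seaborn (or Plotly) is the closest equivalent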

11. NumPy stretches size-1 (or missing leading) dimensions so that arrays of different but compatible shapes can be combined element-wise

12. Plotly and Bokeh (also: Altair; Dash builds interactive web apps on top of Plotly)
