Python Mastery — From Zero to AI Engineering
Lesson 9
NumPy — Arrays, Broadcasting, Indexing & Linear Algebra
38 min
Why NumPy Exists — The Performance Problem
Python lists store pointers to arbitrary Python objects. Every arithmetic operation involves Python object overhead, type checking, and memory indirection. For numerical computing, this is devastating.
NumPy's ndarray stores values in a contiguous block of typed memory — exactly like a C array. Operations are dispatched to compiled C/Fortran routines that process entire arrays with no Python overhead per element.
NumPy vs Python loop: the performance case
Click Run to execute — Python runs in your browser via WebAssembly
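A minimal sketch of the comparison (the array size and any timings are illustrative; absolute numbers depend on your machine):

```python
import time
import numpy as np

n = 1_000_000
py_list = list(range(n))
arr = np.arange(n)

# Pure-Python loop: one object allocation and type check per element
t0 = time.perf_counter()
squared_list = [x * x for x in py_list]
py_time = time.perf_counter() - t0

# NumPy: one call into compiled C code for the whole array
t0 = time.perf_counter()
squared_arr = arr * arr
np_time = time.perf_counter() - t0

print(f"Python loop: {py_time:.4f}s  NumPy: {np_time:.4f}s")
```

Both compute the same result; the NumPy version avoids the per-element interpreter overhead entirely.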
The ndarray — Internals
An ndarray has four key attributes that define its structure:
- dtype — the type of each element (int32, float64, bool, etc.)
- shape — a tuple of dimension sizes, e.g. (3, 4) for a 3×4 matrix
- strides — bytes to step in each dimension; enables reshaping without copying
- data — a pointer to the raw memory buffer
ndarray internals: dtype, shape, strides
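A small sketch of those attributes in action (the int32 3×4 array here is just an example):

```python
import numpy as np

a = np.arange(12, dtype=np.int32).reshape(3, 4)

print(a.dtype)     # int32 — 4 bytes per element
print(a.shape)     # (3, 4)
print(a.itemsize)  # 4
print(a.strides)   # (16, 4): 16 bytes to the next row, 4 to the next column

# Transposing swaps the strides instead of moving any data
print(a.T.strides)  # (4, 16)
```

This is why transpose and reshape are effectively free: they rewrite shape and strides, not the buffer.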
Creating Arrays — Every Method
Array creation: array(), zeros, ones, arange, linspace
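A quick tour of the constructors the widget covers (shapes and fill values are illustrative):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])        # from nested lists
z = np.zeros((2, 3))                   # filled with 0.0
o = np.ones(4, dtype=np.int64)         # filled with 1
f = np.full((2, 2), 7.5)               # filled with a constant
r = np.arange(0, 10, 2)                # like range(): [0 2 4 6 8]
lin = np.linspace(0.0, 1.0, 5)         # 5 evenly spaced points, endpoints included
e = np.eye(3)                          # 3x3 identity matrix

print(r)
print(lin)
```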
Data Types — Complete Reference
NumPy dtypes: sizes, ranges, casting
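A short sketch of dtype sizes, upcasting, and explicit conversion (the particular dtypes are examples):

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.int32)
print(x.itemsize)          # 4 bytes per element

# Mixing with floats upcasts the result
y = x + 0.5
print(y.dtype)             # float64

# astype() always copies and converts (truncating float -> int)
z = y.astype(np.int32)
print(z)                   # [1 2 3]

# Inspect a dtype's range and precision
print(np.iinfo(np.int16).max)     # 32767
print(np.finfo(np.float32).eps)
```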
Indexing and Slicing — Every Pattern
Basic and 2D indexing: views vs copies
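A minimal sketch of 2D indexing and the view-vs-copy distinction (the 3×4 array is illustrative):

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

print(m[1, 2])      # single element: 6
print(m[0])         # first row
print(m[:, 1])      # second column
print(m[0:2, 1:3])  # 2x2 sub-block

# Slices are VIEWS: writing through them changes the original
s = m[0, :2]
s[:] = 99
print(m[0])         # [99 99  2  3]

# .copy() breaks the link
c = m[0].copy()
c[0] = -1
print(m[0, 0])      # still 99
```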
Boolean indexing, np.where, fancy indexing
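A compact sketch of the three advanced-indexing patterns (the sample values are illustrative):

```python
import numpy as np

a = np.array([3, -1, 7, 0, -5, 2])

mask = a < 0
print(a[mask])                # [-1 -5] — boolean indexing returns a copy

# Vectorized ternary: clamp negatives to zero in a NEW array
print(np.where(a < 0, 0, a))  # [3 0 7 0 0 2]

# Fancy indexing: pick elements by an array of positions
idx = np.array([2, 0, 2])
print(a[idx])                 # [7 3 7]
```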
Universal Functions (ufuncs) and Aggregations
Universal functions and axis-based aggregations
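A sketch of ufuncs and axis semantics (the 2×3 matrix is an example):

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(np.sqrt(m))          # ufunc applied element-wise
print(m.sum())             # 21.0 — over every element
print(m.sum(axis=0))       # [5. 7. 9.] — collapse rows: one result per column
print(m.sum(axis=1))       # [ 6. 15.] — collapse columns: one result per row

# keepdims=True keeps the reduced axis as size 1 so the result broadcasts back
row_means = m.mean(axis=1, keepdims=True)   # shape (2, 1)
print(m - row_means)                         # subtract each row's own mean
```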
Broadcasting — The Most Powerful Feature
Broadcasting lets NumPy work with arrays of different shapes without copying data. The rules:
- If the arrays have different ndim, prepend 1s to the shorter shape
- Dimensions of size 1 are stretched to match the other array
- If the shapes still don't match after stretching → error
Broadcasting: rules, normalisation, pairwise distances
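The rules above in a small sketch: column-wise centring, then a pairwise-distance grid (the data values are illustrative):

```python
import numpy as np

# (4, 3) minus (3,): the (3,) is padded to (1, 3), then stretched to (4, 3).
# No copy is made — the stretch exists only in the strides.
data = np.arange(12.0).reshape(4, 3)
col_means = data.mean(axis=0)          # shape (3,)
centred = data - col_means             # shape (4, 3)
print(centred.mean(axis=0))            # [0. 0. 0.]

# Pairwise distances: (4, 1) against (1, 4) broadcasts to a (4, 4) grid
x = np.array([0.0, 1.0, 3.0, 6.0])
dist = np.abs(x[:, None] - x[None, :])
print(dist[0, 3])                      # 6.0
```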
Shape Manipulation
Reshape, transpose, stack, split
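A sketch of the four operations the widget names (shapes are illustrative):

```python
import numpy as np

a = np.arange(6)

m = a.reshape(2, 3)        # new shape, no copy where strides allow
print(m.T.shape)           # (3, 2) — transpose
print(a.reshape(3, -1))    # -1: infer that dimension (here 2)

# Stacking builds bigger arrays from pieces
x = np.array([1, 2])
y = np.array([3, 4])
print(np.vstack([x, y]))   # shape (2, 2): stacked as rows
print(np.hstack([x, y]))   # shape (4,): concatenated

# Splitting is the inverse
left, right = np.split(np.arange(8), 2)
print(left, right)         # [0 1 2 3] [4 5 6 7]
```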
Random Number Generation
Random number generation with default_rng
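A sketch of the modern Generator API (the seed and distribution parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seeded Generator: reproducible runs

u = rng.random(3)                                # uniform floats in [0, 1)
nrm = rng.normal(loc=0, scale=1, size=(2, 2))    # standard normal
k = rng.integers(0, 10, size=5)                  # ints in [0, 10)
s = rng.choice([1, 2, 3], size=4)                # sampling with replacement
perm = rng.permutation(5)                        # shuffled 0..4

# Same seed -> identical stream
print(np.allclose(np.random.default_rng(42).random(3), u))  # True
```

Prefer `default_rng` over the legacy `np.random.seed` / `np.random.rand` globals: each Generator carries its own state, so seeding is local and explicit.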
Linear Algebra
Linear algebra: matrix multiply, solve, eigen, SVD
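A minimal sketch of the operations the widget names (the 2×2 system is an example):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# @ is matrix multiply; * would be element-wise
x = np.linalg.solve(A, b)        # preferred over inv(A) @ b
print(x)                          # [2. 3.]
print(A @ x)                      # reproduces b

vals, vecs = np.linalg.eigh(A)    # eigendecomposition (A is symmetric)
U, s, Vt = np.linalg.svd(A)       # singular value decomposition
print(np.allclose(U @ np.diag(s) @ Vt, A))  # True — SVD reconstructs A
```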
Performance: Vectorization Patterns
Vectorization speedups: loops vs NumPy
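A sketch of the core pattern: replace an index loop with shifted slices, and an if with np.where (the price series is illustrative):

```python
import numpy as np

prices = np.array([10.0, 10.5, 10.2, 11.0, 10.8])

# Loop version of daily returns ...
loop_returns = []
for i in range(1, len(prices)):
    loop_returns.append((prices[i] - prices[i - 1]) / prices[i - 1])

# ... replaced by one shifted-slice expression
returns = (prices[1:] - prices[:-1]) / prices[:-1]
print(np.allclose(returns, loop_returns))   # True — same numbers

# Conditional logic: np.where instead of an if inside a loop
signal = np.where(returns > 0, 1, -1)
print(signal)                               # [ 1 -1  1 -1]
```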
Project: Gradient Descent from Scratch
Implement linear regression and gradient descent using only NumPy — no scikit-learn.
PROJECT: Linear Regression via Gradient Descent (pure NumPy)
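A minimal sketch of the project's core loop, fitting y ≈ wx + b by gradient descent on mean squared error (the synthetic data, learning rate, and iteration count here are illustrative choices, not the project's required values):

```python
import numpy as np

# Synthetic data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
x = rng.random(200)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.1, 200)

w, b = 0.0, 0.0
lr = 0.5
n = len(y)

for _ in range(500):
    pred = w * x + b
    err = pred - y
    # Gradients of MSE = mean((pred - y)^2) with respect to w and b
    grad_w = 2.0 / n * (err @ x)
    grad_b = 2.0 / n * err.sum()
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))   # close to the true 3.0 and 2.0
```

Every step is a whole-array operation: no Python loop ever touches an individual sample.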
Exercises
Exercise 1 — Array Creation and Inspection
Exercise 1: Array creation challenges
Exercise 2 — Indexing and Masking
Exercise 2: Indexing, masking, np.where
Exercise 3 — Broadcasting and Vectorization
Exercise 3: Broadcasting challenges
Exercise 4 — Statistics from Scratch
Exercise 4: Statistical functions from scratch
Exercise 5 — Image Processing with Arrays
Exercise 5: Image operations with NumPy arrays
Exercise 6 — PCA from Scratch
Exercise 6: PCA from scratch (SVD)
Exercise 7 — Moving Averages and Time Series
Exercise 7: Moving averages and Bollinger Bands
Exercise 8 — Neural Network Forward Pass
Exercise 8: Neural Network Forward Pass (pure NumPy)
Key Takeaways
- NumPy stores data in contiguous typed memory — this is why it beats Python loops by 10-100x for numerical work
- strides let NumPy reshape, transpose, and slice arrays without copying data — always check whether you have a view or a copy
- Avoid Python loops over array elements — nearly every operation has a vectorized NumPy equivalent
- axis=0 reduces along rows (giving one result per column); axis=1 reduces along columns (giving one result per row); add keepdims=True when the result must broadcast back
- Broadcasting follows three rules: prepend 1s, stretch size-1 dims, error on incompatible dims — internalize these to write clean vectorized code
- Boolean indexing returns a copy; slicing returns a view — modifying a slice modifies the original
- np.where(cond, a, b) is the vectorized ternary operator — use it instead of masked assignment when creating a new array
- @ is matrix multiplication; * is element-wise — never confuse them
- np.linalg.solve(A, b) is faster and more numerically stable than inv(A) @ b
- Use dtype=np.float32 instead of float64 for ML training — half the memory, and sufficient accuracy for most tasks