GadaaLabs
Python Mastery — From Zero to AI Engineering
Lesson 9

NumPy — Arrays, Broadcasting & Linear Algebra

28 min

Why NumPy Exists: The Python Performance Problem

Python lists are flexible — they can hold any object, any size, any type. But that flexibility has a cost. A Python list is an array of pointers, each pointing to a separate heap-allocated Python object. When you sum a list of integers, Python must:

  1. Follow a pointer to each integer object
  2. Unbox the C integer from the Python wrapper
  3. Add it
  4. Box the result into a new Python wrapper
  5. Manage garbage collection

NumPy sidesteps all of this. A NumPy array stores values as a contiguous block of raw C memory — 64-bit floats packed one after another, no pointers, no Python object overhead. Operations on this memory:

  • Are implemented in C (and Fortran for linear algebra)
  • Use SIMD (Single Instruction, Multiple Data) CPU instructions to process multiple elements simultaneously
  • Avoid the Python GIL for computation (only the coordination layer is Python)

The result: NumPy operations are typically 50–500x faster than equivalent Python loops.

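The gap is easy to see with a quick timing sketch. Exact numbers depend on your machine, but summing the same million integers both ways usually shows the contiguous-memory advantage clearly:

```python
import time
import numpy as np

n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Python sum: follows a pointer per element, unboxes, adds, re-boxes
start = time.perf_counter()
total_py = sum(py_list)
py_time = time.perf_counter() - start

# NumPy sum: one C loop over a contiguous block of int64 values
start = time.perf_counter()
total_np = int(np_arr.sum())
np_time = time.perf_counter() - start

print(f"Python sum: {py_time * 1000:.2f} ms")
print(f"NumPy  sum: {np_time * 1000:.2f} ms")
print("results match:", total_py == total_np)
```

Note that converting between lists and arrays costs a full copy plus boxing/unboxing, which is why you should convert once at the boundary and stay in NumPy for the computation.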

Creating Arrays

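A sketch of the most common constructors — from Python lists, filled arrays, ranges, and the modern random `Generator` API:

```python
import numpy as np

a = np.array([1, 2, 3])               # from a Python list; dtype inferred (int)
z = np.zeros((2, 3))                  # 2x3 array of 0.0 (float64 by default)
o = np.ones(4, dtype=np.int32)        # explicit dtype
r = np.arange(0, 10, 2)               # like range(): [0 2 4 6 8]
l = np.linspace(0.0, 1.0, 5)          # 5 evenly spaced points, endpoints included
e = np.eye(3)                         # 3x3 identity matrix
rng = np.random.default_rng(seed=42)  # seeded random Generator
u = rng.random((2, 2))                # uniform floats in [0, 1)

print(a.dtype, z.shape, r, l)
```

Every array has a fixed `dtype`; mixing ints and floats in the input list silently promotes everything to float, so check `.dtype` when results look off.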

Indexing, Slicing, and Fancy Indexing

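A sketch of the three indexing styles on one small 2-D array. The key trap: slices are views that share memory, while fancy (integer-array) indexing returns copies:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Basic indexing and slicing (slices are VIEWS, not copies)
row0 = arr[0]           # first row -> [0 1 2 3]
col1 = arr[:, 1]        # second column -> [1 5 9]
view = arr[0:2, 0:2]    # 2x2 view into the top-left corner
view[0, 0] = 99         # writes through to arr!

# Fancy indexing with integer arrays (returns a COPY)
picked = arr[[0, 2], [1, 3]]   # elements at (0, 1) and (2, 3)

# Boolean masking
evens = arr[arr % 2 == 0]

print(arr[0, 0], picked, evens)
```

Use `arr[0:2, 0:2].copy()` when you need an independent sub-array.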

Universal Functions and Axis Operations

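A sketch of ufuncs (element-wise functions) and axis reductions on a (2, 3) array — note which dimension each `axis` argument collapses:

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Ufuncs apply element-wise and return a new array
print(np.sqrt(x))
print(np.exp(x))

# Reductions: axis=N collapses dimension N
print(x.sum())          # 21.0 — reduce over ALL elements
print(x.sum(axis=0))    # collapse rows    -> shape (3,): [5. 7. 9.]
print(x.sum(axis=1))    # collapse columns -> shape (2,): [ 6. 15.]
print(x.mean(axis=0))   # column means
print(x.argmax(axis=1)) # index of the max in each row
```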

Broadcasting: The Most Powerful (and Confusing) Feature

Broadcasting is the mechanism that lets NumPy operate on arrays of different shapes. Understanding it is essential for writing concise, loop-free NumPy code.

The Broadcasting Rules:

  1. If the arrays have different numbers of dimensions, pad the smaller shape on the left with 1s
  2. Dimensions of size 1 are stretched to match the other array's size in that dimension
  3. If shapes still don't match after stretching, raise an error
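The three rules can be traced through a small sketch — a (3, 4) array combined with a row vector, a column vector, and one incompatible shape:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)        # shape (3, 4)
row = np.array([10, 20, 30, 40])       # shape (4,)
col = np.array([[100], [200], [300]])  # shape (3, 1)

# Rule 1: row's shape (4,) pads on the left to (1, 4)
# Rule 2: the size-1 dimension stretches to 3 -> effective (3, 4)
b = a + row

# (3, 1) against (3, 4): the size-1 dimension stretches to 4
c = a + col

# Rule 3: (3, 4) vs (3,) -> (3,) pads to (1, 3); 4 != 3, so it errors
try:
    a + np.array([1, 2, 3])
except ValueError as err:
    print("broadcast error:", err)

print(b.shape, c.shape)
```

A common gotcha follows from rule 1: a shape-(3,) vector broadcasts against (3, 4) as a *row*, not a column — reshape it to (3, 1) (or use `[:, np.newaxis]`) to broadcast down the columns.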

Reshaping and Stacking

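A sketch of the main reshape and stacking operations — `reshape` returns a view when the memory layout allows, and `-1` asks NumPy to infer that dimension:

```python
import numpy as np

a = np.arange(6)

m = a.reshape(2, 3)       # view with shape (2, 3)
m2 = a.reshape(-1, 2)     # -1 means "infer": (3, 2)
flat = m.ravel()          # back to 1-D (a view when contiguous)
t = m.T                   # transpose -> shape (3, 2)

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
v = np.vstack([x, y])           # stack as rows        -> (2, 3)
h = np.hstack([x, y])           # concatenate          -> (6,)
s = np.stack([x, y], axis=1)    # stack along new axis -> (3, 2)

print(m.shape, m2.shape, t.shape, v.shape, h.shape, s.shape)
```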

Linear Algebra

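A sketch of the core `np.linalg` routines, solving the small system 3x + y = 9, x + 2y = 8 directly rather than via the inverse:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Solve Ax = b directly — faster and more stable than inv(A) @ b
x = np.linalg.solve(A, b)
print(x)            # [2. 3.]
print(A @ x)        # matrix-vector product recovers b

det = np.linalg.det(A)              # determinant (5.0 here)
eigvals, eigvecs = np.linalg.eig(A) # eigenvalues and eigenvectors
nrm = np.linalg.norm(b)             # Euclidean norm of b

print(det, eigvals)
```

`@` (or `np.matmul`) is matrix multiplication; `*` on the same arrays would multiply element-wise — a different operation entirely.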

Performance: Vectorized vs Loop Benchmark

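A minimal benchmark sketch: Euclidean distance between two large vectors, once with a Python loop and once vectorized. The measured speedup varies by machine, but the two results agree to floating-point precision:

```python
import time
import numpy as np

def loop_distance(a, b):
    # Pure-Python loop: boxes every element into a Python float
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
    return total ** 0.5

def vec_distance(a, b):
    # Vectorized: one subtraction, one square, one reduction — all in C
    return float(np.sqrt(np.sum((a - b) ** 2)))

n = 500_000
rng = np.random.default_rng(0)
a = rng.random(n)
b = rng.random(n)

t0 = time.perf_counter(); d1 = loop_distance(a, b); loop_t = time.perf_counter() - t0
t0 = time.perf_counter(); d2 = vec_distance(a, b); vec_t = time.perf_counter() - t0

print(f"loop: {loop_t * 1000:.1f} ms   vectorized: {vec_t * 1000:.1f} ms")
print(f"speedup: ~{loop_t / vec_t:.0f}x")
```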

PROJECT: Neural Network Forward Pass from Scratch

A neural network is just a sequence of matrix multiplications and non-linear functions. NumPy is all you need to implement one:

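A minimal sketch of a forward pass: a 2-layer network (the layer sizes 4 → 8 → 3 are arbitrary choices for illustration) built from `@`, broadcasting for the bias, and two ufunc-based activations:

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    # Element-wise max(0, x) — a ufunc, so it broadcasts over any shape
    return np.maximum(0.0, x)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Weights and biases for 4 inputs -> 8 hidden units -> 3 output classes
W1 = rng.normal(0.0, 0.1, (4, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.1, (8, 3)); b2 = np.zeros(3)

def forward(X):
    h = relu(X @ W1 + b1)         # hidden layer: matmul + broadcast bias + ReLU
    return softmax(h @ W2 + b2)   # output layer: class probabilities per row

X = rng.random((5, 4))            # a batch of 5 samples, 4 features each
probs = forward(X)
print(probs.shape)                # (5, 3)
print(probs.sum(axis=1))          # each row sums to 1
```

The whole batch goes through in two matrix multiplies — no loop over samples — which is exactly the vectorization mindset the lesson builds toward.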

Key Takeaways

  • NumPy's speed comes from memory layout: contiguous C arrays plus SIMD CPU instructions, not Python-level tricks — this is why copying to a list and back is expensive
  • Slicing returns views, not copies: arr[0:5] shares memory with arr — modifying it modifies the original; use .copy() when you need independence
  • Boolean indexing is the most practical pattern: arr[arr > 0] is cleaner and faster than any loop filter, and arr[mask] = value is the idiomatic way to conditionally set values
  • Broadcasting follows three strict rules: pad left with 1s, stretch size-1 dimensions, error on incompatible sizes — once internalized, it replaces most explicit loops
  • axis=0 collapses the rows, axis=1 collapses the columns: a (3, 4) array summed on axis=0 gives shape (4,); summed on axis=1 gives shape (3,) — think "collapse along this axis," not "operate on this axis"
  • @ is matrix multiply, * is element-wise: confusing them is the most common NumPy bug — always check shapes before and after
  • np.linalg.solve(A, b) beats inv(A) @ b: computing the inverse is slower and numerically less stable than solving directly; use solve for systems of equations
  • Vectorization is a mindset shift: instead of asking "how do I loop over elements?", ask "what array operation produces the result?" — the answer is almost always faster