GadaaLabs
Python Mastery — From Zero to AI Engineering
Lesson 3

Data Structures — Lists, Tuples, Dicts, Sets & Comprehensions

35 min

How Python Stores Collections

Every time you store a group of values, you make a performance and semantic decision. Python's four built-in containers — list, tuple, dict, and set — each have distinct internal implementations, time complexities, and intended purposes.

Choosing the wrong container does not just slow your code — it obscures your intent and leads to bugs. This lesson gives you a complete mental model of each.


Lists — Dynamic Arrays of References

A list is a mutable, ordered, dynamically resized array of object references. CPython stores an array of pointers to PyObject structs. This means:

  • Index access → O(1) (direct pointer arithmetic)
  • Append → O(1) amortised (over-allocates to avoid frequent resizing)
  • Insert at front → O(n) (all pointers shift)
  • Search (in) → O(n) (linear scan)

Creating and Indexing

List creation, indexing and slicing
Click Run to execute — Python runs in your browser via WebAssembly

Every Mutation Method — Explained and Demonstrated

All list mutation methods
Click Run to execute — Python runs in your browser via WebAssembly

Sorting — Deep Dive with key=

Python uses Timsort — O(n log n) worst case, O(n) on already-sorted data, and stable (equal elements keep their original relative order).

Sorting: sort(), sorted(), key=
Click Run to execute — Python runs in your browser via WebAssembly

List as Stack and Queue — When to Use Each

Stack (list) vs Queue (deque)
Click Run to execute — Python runs in your browser via WebAssembly

Shallow vs Deep Copy

Shallow vs deep copy — a critical distinction
Click Run to execute — Python runs in your browser via WebAssembly

Tuples — Immutable, Hashable Sequences

A tuple is an immutable, ordered sequence. Because it cannot change, Python stores it more compactly than a list and it can be used as a dictionary key or set element.

Tuples: immutability, unpacking, keys
Click Run to execute — Python runs in your browser via WebAssembly

Named Tuples — Self-Documenting Records

Named tuples: collections.namedtuple and NamedTuple
Click Run to execute — Python runs in your browser via WebAssembly

Dictionaries — Hash Tables with Ordering

A dict is an ordered (Python 3.7+), mutable hash table. Keys are hashed to find a bucket — O(1) average for get/set/delete.

Key requirements: must be hashable — immutable types work (int, str, float, tuple of hashables, frozenset). Lists and dicts cannot be keys.

All Dictionary Methods

Dictionary: all methods and patterns
Click Run to execute — Python runs in your browser via WebAssembly
defaultdict and Counter — the most-used collections
Click Run to execute — Python runs in your browser via WebAssembly

Dictionary Comprehensions

Dictionary comprehensions — from simple to practical
Click Run to execute — Python runs in your browser via WebAssembly

Sets — O(1) Membership Testing

A set is a mutable, unordered collection of unique hashable objects — a hash table with values but no keys. The primary reason to use a set over a list is O(1) membership testing.

Set basics: add, discard, remove, membership
Click Run to execute — Python runs in your browser via WebAssembly

Set Mathematics — Real Analytics

Set algebra: union, intersection, difference, analytics
Click Run to execute — Python runs in your browser via WebAssembly

Comprehensions — Every Type Explained

All comprehension types: list, dict, set, generator
Click Run to execute — Python runs in your browser via WebAssembly

When to Use Each Data Structure

| Need | Best choice | Why | |------|-------------|-----| | Ordered sequence, random access | list | O(1) index | | Queue (FIFO) | deque | O(1) popleft | | Stack (LIFO) | list | O(1) pop | | Key-value store | dict | O(1) lookup | | Membership testing (repeated) | set | O(1) in | | Immutable record / dict key | tuple | hashable | | Frequency counting | Counter | built-in arithmetic | | Grouping without "if key in" | defaultdict | auto-initialise |


Project: Contact Management System

PROJECT: Contact Management System
Click Run to execute — Python runs in your browser via WebAssembly

Exercises

Exercise 1 — Anagram Detector

Exercise 1: Anagram Detector
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 2 — Two-Sum (O(n) with dict)

Exercise 2: Two-Sum O(n) — dict pattern
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 3 — LRU Cache

Exercise 3: LRU Cache with OrderedDict
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 4 — BFS Shortest Path (Graph as dict of sets)

Exercise 4: BFS Shortest Path
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 5 — Word Frequency Pipeline

Exercise 5: Word Frequency Pipeline
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 6 — Social Network with Set Operations

Exercise 6: Social Network (sets + dicts)
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 7 — Sliding Window Maximum

Exercise 7: Sliding Window Maximum (deque O(n))
Click Run to execute — Python runs in your browser via WebAssembly

Exercise 8 — Sparse Matrix as dict of dicts

Exercise 8: Sparse Matrix as dict-of-dicts
Click Run to execute — Python runs in your browser via WebAssembly

Key Takeaways

  • Lists are O(1) at the tail and by index — use deque for O(1) operations at both ends
  • Tuples signal immutability and are hashable — use them as dict keys and to represent fixed records
  • Dict lookups, inserts, and deletes are O(1) average — always prefer a dict over a list for key-based access
  • Sets give O(1) membership testing — use them whenever you repeatedly ask "is X in collection?"
  • defaultdict eliminates "if key not in dict" boilerplate for grouping, counting, and graph adjacency
  • Counter supports frequency counting, arithmetic on counts, and most_common(n)
  • Comprehensions are faster than for-loops for building collections; they express intent clearly
  • Generator expressions produce values lazily — prefer them when you need to iterate only once
  • The key= parameter in sorted() makes multi-key sorting trivial with no auxiliary data structures
  • Choose your container by asking: "What operation will I do most often?" — then pick the O(1) one