GadaaLabs
Python Mastery — From Zero to AI Engineering
Lesson 3

Data Structures — Lists, Dicts, Sets & Comprehensions

28 min

Choosing the Right Container

Every time you store a collection of values, you are making a performance and semantic decision. Python's four built-in containers — list, tuple, dict, and set — each have distinct time complexities, mutability characteristics, and intended use cases. Choosing the wrong one does not just slow your code; it obscures your intent.

This lesson covers each container deeply: how it works internally, when to use it, and the comprehension syntax that makes Python code read like well-written prose.

Lists

A list is a mutable, ordered, dynamically resized array of references. Under the hood, CPython stores a contiguous array of pointers to PyObject structs. This means random access by index is O(1) but inserting at the front is O(n) — the entire array shifts.

List indexing and slicing
Click Run to execute — Python runs in your browser via WebAssembly

Mutation Methods

List mutation methods
Click Run to execute — Python runs in your browser via WebAssembly

Sorting with key=

The key parameter is one of Python's most powerful features. It lets you sort by any function of the element without transforming the list itself.

Sorting with key=
Click Run to execute — Python runs in your browser via WebAssembly

List as Stack and Queue

List as stack, deque as queue
Click Run to execute — Python runs in your browser via WebAssembly

Big-O Complexity of List Operations

| Operation | Time | Notes | |-----------|------|-------| | lst[i] | O(1) | Random access | | len(lst) | O(1) | Cached | | lst.append(x) | O(1) amortized | Occasional resize | | lst.pop() | O(1) | From the end | | lst.pop(0) | O(n) | From the front — use deque | | lst.insert(i, x) | O(n) | Shifts elements | | x in lst | O(n) | Linear scan | | lst.sort() | O(n log n) | Timsort, stable | | lst[a:b] | O(k) | k = b - a |

Tuples

A tuple is an immutable, ordered sequence. Once created, you cannot change its elements. This makes tuples suitable for fixed collections: coordinates, RGB values, database rows, function return values, dict keys.

Tuples: immutability, unpacking, namedtuple
Click Run to execute — Python runs in your browser via WebAssembly

Dictionaries

A dictionary is a hash table — a mutable mapping from keys to values. Key lookup, insertion, and deletion are all O(1) average. Since Python 3.7, dicts preserve insertion order.

For a key to be used in a dict, it must be hashable (immutable and implements __hash__). Strings, numbers, and tuples of hashables work. Lists and dicts do not.

Dictionaries: creation, access, mutation
Click Run to execute — Python runs in your browser via WebAssembly

defaultdict and Counter

defaultdict and Counter
Click Run to execute — Python runs in your browser via WebAssembly

Sets

A set is an unordered collection of unique hashable elements, implemented as a hash table without values. Membership testing (in) is O(1) — far faster than a list's O(n).

Sets: operations and use cases
Click Run to execute — Python runs in your browser via WebAssembly

Comprehensions

Comprehensions are syntactic sugar for creating collections by transforming or filtering an iterable. They are faster than equivalent for loops (the loop runs in C), and — when kept simple — far more readable.

List Comprehensions

List comprehensions
Click Run to execute — Python runs in your browser via WebAssembly

Dict and Set Comprehensions

Dict and set comprehensions
Click Run to execute — Python runs in your browser via WebAssembly

Generator Expressions

A generator expression looks like a list comprehension but uses parentheses. It returns a generator — a lazy iterator that produces one value at a time without storing the entire sequence in memory. For large datasets, this is the difference between crashing and running fine.

Generator expressions vs list comprehensions
Click Run to execute — Python runs in your browser via WebAssembly

When to Use Each Data Structure

| Structure | Ordered | Mutable | Duplicates | Key Lookup | Best For | |-----------|---------|---------|-----------|------------|----------| | list | Yes | Yes | Yes | O(n) | Ordered sequences, stacks | | tuple | Yes | No | Yes | O(n) | Fixed records, dict keys | | dict | Yes (3.7+) | Yes | Keys: No | O(1) | Key-value mapping | | set | No | Yes | No | O(1) | Deduplication, membership | | frozenset | No | No | No | O(1) | Immutable sets, dict keys | | deque | Yes | Yes | Yes | O(n) | FIFO queues, sliding windows | | Counter | No | Yes | — | O(1) | Frequency counting |

Decision guide:

  • Need fast lookup by a key? → dict
  • Need to check "is X in collection" frequently? → set
  • Need ordered, indexable, and will mutate? → list
  • Returning multiple values from a function? → tuple
  • Processing a huge stream without loading into memory? → generator

PROJECT: Contact Management System

A full CLI-style contact book demonstrating dicts, lists, sets, and comprehensions working together.

PROJECT: Contact Management System
Click Run to execute — Python runs in your browser via WebAssembly

Challenge

Push your skills further with these exercises:

  1. LRU Cache — implement an LRUCache(capacity) class using a dict and collections.deque. It should support get(key) and put(key, value) in O(1) time, evicting the least recently used item when at capacity.

  2. Frequency analysis — given a list of strings, use a defaultdict(Counter) to build a map of first_letter -> Counter of words. Find which starting letter has the most lexical diversity.

  3. Set operations for analytics — you have two sets of user IDs: users_week1 and users_week2. Compute: retained users (in both), churned users (only in week1), new users (only in week2), and the retention rate.

  4. Comprehension golf — write these as single-line comprehensions:

    • All prime numbers below 100
    • A dict mapping each word in a sentence to its reverse
    • The transpose of a 4x4 matrix
  5. Contact book extension — add a merge(other_book) method that merges two ContactBooks, handling duplicates by keeping the one with more fields filled in.

Key Takeaways

  • Lists are O(1) for indexed access and tail operations, but O(n) for arbitrary insertion/deletion — use deque for queue operations
  • Tuples signal immutability and are hashable — they work as dict keys and can represent fixed records
  • Dict lookups are O(1) average thanks to hashing — for frequent "is X in collection" checks, always prefer a set or dict over a list
  • defaultdict and Counter solve 80% of grouping and counting tasks without boilerplate
  • Comprehensions are faster than for loops for building collections and express intent more clearly
  • Generator expressions produce values lazily — always prefer them over list comprehensions when you only need to iterate once
  • The key= parameter in sorted() is one of Python's most powerful features — it avoids DSU (Decorate-Sort-Undecorate) patterns
  • Choose your container by its O-complexity and semantics: the right choice is often obvious once you ask "what operations will I do most?"