Chapter Seventeen

Decorators, Generators, and Iterators

Learning Objectives
  1. Implement the iterator protocol by defining __iter__ and __next__ on a custom class
  2. Write generator functions using yield and explain how lazy evaluation conserves memory
  3. Compare generator expressions with list comprehensions and choose the appropriate tool
  4. Create decorators that wrap functions and preserve metadata with functools.wraps
  5. Apply built-in decorators including @property, @staticmethod, @classmethod, @lru_cache, and @dataclass

Some of the most powerful features in Python look, at first glance, like magic. A for loop that processes a billion-row file without running out of memory. A single line above a function definition that adds caching, logging, or access control. A with statement that guarantees cleanup even when exceptions strike. These are not magic — they are iterators, generators, and decorators, three mechanisms that sit at the heart of idiomatic Python. Understanding them is the difference between writing code that merely works and writing code that is elegant, efficient, and Pythonic.

The Iterator Protocol

Every time you write a for loop in Python, you are using the iterator protocol without thinking about it. When Python encounters for item in something, it calls iter(something) to get an iterator — an object that knows how to produce items one at a time — and then calls next() on that iterator repeatedly until it raises StopIteration.

numbers = [10, 20, 30]
it = iter(numbers)

print(next(it))    # Output: 10
print(next(it))    # Output: 20
print(next(it))    # Output: 30
# next(it) would raise StopIteration

Lists, tuples, strings, dictionaries, sets, and files are all iterable — they implement __iter__ and return an iterator. The iterator itself implements __next__ to produce values. This two-method contract is the iterator protocol, and anything that satisfies it can be used in a for loop.
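To make the protocol concrete, here is a for loop written out by hand as a while loop over next() — a sketch equivalent in effect to what the interpreter does:

```python
numbers = [10, 20, 30]
it = iter(numbers)        # step 1: ask the iterable for an iterator

# Hand-written equivalent of: for item in numbers: print(item)
while True:
    try:
        item = next(it)   # step 2: ask the iterator for the next value
    except StopIteration: # step 3: the iterator signals exhaustion
        break
    print(item)
```

Once the loop finishes, it is exhausted; calling iter(numbers) again would produce a fresh iterator.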

Building a Custom Iterator

You can make any class iterable by defining __iter__ and __next__. Here is a Countdown class that counts from a given number down to one:

class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

for n in Countdown(5):
    print(n)
# Output: 5 4 3 2 1

__iter__ returns the iterator object — in this case, self, because the class is both the iterable and the iterator. __next__ produces the next value or raises StopIteration to signal the end. This works, but it is a lot of boilerplate for a simple sequence. Generators make it dramatically simpler.
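One consequence of returning self from __iter__ is that each Countdown instance is a one-shot iterator: once exhausted, iterating it again produces nothing, whereas a list hands out a fresh iterator every time. A quick sketch (repeating the class so the snippet stands alone):

```python
class Countdown:                    # same class as above
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

cd = Countdown(3)
print(list(cd))    # Output: [3, 2, 1]
print(list(cd))    # Output: [] -- the iterator is already exhausted
```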

Generators with yield

A generator function looks like a normal function but uses the yield keyword instead of (or alongside) return. When you call a generator function, Python does not execute the body immediately. Instead, it returns a generator object — a special kind of iterator that produces values lazily, one at a time, pausing execution at each yield and resuming when the next value is requested.

def countdown(start):
    while start > 0:
        yield start
        start -= 1

for n in countdown(5):
    print(n)
# Output: 5 4 3 2 1

The same behaviour as the class above, in four lines instead of twelve. Each time the loop calls next() on the generator, execution resumes from where it left off — right after the last yield — runs until the next yield, and pauses again. When the function returns (or falls off the end), StopIteration is raised automatically.
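You can watch this pausing happen by putting print calls inside a generator and stepping it manually with next():

```python
def chatty():
    print("about to yield 1")
    yield 1
    print("about to yield 2")
    yield 2

gen = chatty()       # nothing printed yet: the body has not started running
print(next(gen))     # prints "about to yield 1", then 1
print(next(gen))     # prints "about to yield 2", then 2
# A third next(gen) would raise StopIteration
```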

This is lazy evaluation: the generator does not compute all values up front. It produces each value only when asked. For a countdown from five, the difference is trivial. For a generator that reads a billion lines from a log file, it is the difference between using a few kilobytes of memory and using dozens of gigabytes.

def read_lines(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()

# Process a massive file without loading it all into memory
for line in read_lines("server.log"):
    if "ERROR" in line:
        print(line)

Generator Expressions

Just as list comprehensions provide a compact syntax for building lists, generator expressions provide a compact syntax for building generators. The syntax is identical except you use parentheses instead of square brackets:

# List comprehension — builds entire list in memory
squares_list = [x ** 2 for x in range(1_000_000)]

# Generator expression — produces values lazily
squares_gen = (x ** 2 for x in range(1_000_000))

The list comprehension allocates memory for a million integers immediately. The generator expression allocates almost nothing — it produces each square only when requested. When you need to iterate through results exactly once (passing them to sum(), max(), or another for loop), a generator expression is almost always the better choice:

total = sum(x ** 2 for x in range(1_000_000))
# No temporary list created — values flow directly into sum()

Note that when a generator expression is the only argument to a function, you can omit the extra pair of parentheses.
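The size difference is visible with sys.getsizeof, which reports the memory footprint of the container object itself (exact numbers vary across Python versions and platforms):

```python
import sys

squares_list = [x ** 2 for x in range(1_000_000)]
squares_gen = (x ** 2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))   # several megabytes
print(sys.getsizeof(squares_gen))    # a few hundred bytes, regardless of the range
```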

The itertools Module

The itertools module in the standard library provides a toolkit of fast, memory-efficient building blocks for working with iterators. All of its functions return iterators, so they compose naturally and never materialise large intermediate lists.

import itertools

# chain: concatenate multiple iterables
combined = itertools.chain([1, 2], [3, 4], [5, 6])
print(list(combined))    # Output: [1, 2, 3, 4, 5, 6]

# islice: slice an iterator (like list slicing, but lazy)
first_five = itertools.islice(range(100), 5)
print(list(first_five))  # Output: [0, 1, 2, 3, 4]

# groupby: group consecutive elements by a key
data = [("a", 1), ("a", 2), ("b", 3), ("b", 4)]
for key, group in itertools.groupby(data, key=lambda x: x[0]):
    print(key, list(group))
# Output: a [('a', 1), ('a', 2)]
#         b [('b', 3), ('b', 4)]

# product: Cartesian product
for pair in itertools.product("AB", [1, 2]):
    print(pair)
# Output: ('A', 1) ('A', 2) ('B', 1) ('B', 2)

A common pattern is chaining itertools functions together — islice(chain(...), 100) to take the first 100 items from a concatenation of iterables, for example. Think of itertools as the standard library's answer to "I need to process data streams without loading everything into memory."
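For instance, itertools.count yields an endless stream of integers, yet combining it with chain and islice stays lazy from end to end:

```python
import itertools

evens = (n * 2 for n in itertools.count())        # infinite: 0, 2, 4, ...
stream = itertools.chain([-4, -2], evens)         # prepend a finite prefix
first_six = list(itertools.islice(stream, 6))     # take only what we need

print(first_six)    # Output: [-4, -2, 0, 2, 4, 6]
```

Without islice, iterating stream would never terminate; the slice is what makes the infinite pipeline finite.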

Decorators: Functions That Wrap Functions

A decorator is a function that takes a function as input and returns a modified version of it. The @decorator syntax, placed above a function definition, is syntactic sugar that makes the wrapping explicit and readable.

Here is the simplest possible decorator — one that prints a message before and after the wrapped function runs:

def log_calls(func):
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned {result}")
        return result
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(3, 4)
# Output: Calling add
#         add returned 7

The @log_calls line is exactly equivalent to writing add = log_calls(add) after the function definition. The decorator replaces the original function with wrapper, which calls the original function inside it. The *args and **kwargs ensure the wrapper accepts any arguments the original function expects.

functools.wraps

There is a subtle problem with the decorator above. After decoration, add.__name__ returns "wrapper", not "add", because the function has been replaced. This breaks introspection, documentation, and debugging. The fix is functools.wraps, a decorator for your wrapper:

import functools

def log_calls(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned {result}")
        return result
    return wrapper

functools.wraps copies the original function's name, docstring, and other metadata onto the wrapper. Always use it. It costs nothing and prevents mysterious debugging sessions later.
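A quick check confirms that the metadata survives (using a stripped-down log_calls here, without the prints, so the snippet stands alone):

```python
import functools

def log_calls(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@log_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b

print(add.__name__)   # Output: add   (not "wrapper")
print(add.__doc__)    # Output: Return the sum of a and b.
print(add(3, 4))      # Output: 7 -- behaviour is unchanged
```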

Common Built-in Decorators

Python ships with several decorators that you will use regularly.

@property turns a method into a read-only attribute, as we saw in the classes chapter. @staticmethod defines a method that does not receive self or cls. @classmethod defines a method that receives the class as its first argument, often used for alternative constructors.
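A small hypothetical Temperature class shows all three side by side:

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def fahrenheit(self):
        # Computed on access, read like a plain attribute (no parentheses)
        return self._celsius * 9 / 5 + 32

    @staticmethod
    def is_freezing(celsius):
        # No self or cls: just a function grouped with the class
        return celsius <= 0

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Alternative constructor: receives the class, not an instance
        return cls((fahrenheit - 32) * 5 / 9)

t = Temperature(100)
print(t.fahrenheit)                    # Output: 212.0
print(Temperature.is_freezing(-5))     # Output: True
```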

Two more deserve attention here. @lru_cache from functools adds memoisation — it caches the results of expensive function calls so that repeated calls with the same arguments return instantly:

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(100))    # Output: 354224848179261915075
# Without caching, this would take astronomical time
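The cache itself can be inspected with cache_info(), which reports hits, misses, and the current number of stored results (fibonacci repeated here so the snippet stands alone):

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

fibonacci(10)
print(fibonacci.cache_info())
# Output: CacheInfo(hits=8, misses=11, maxsize=128, currsize=11)
```

If the underlying data can change, fibonacci.cache_clear() empties the cache.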

@dataclass from the dataclasses module automatically generates __init__, __repr__, __eq__, and other methods for a class based on its annotated fields:

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p = Point(3.0, 4.0)
print(p)              # Output: Point(x=3.0, y=4.0)
print(p == Point(3.0, 4.0))  # Output: True

Stacking Decorators

You can apply multiple decorators to a single function. They are applied bottom-up — the decorator closest to the function is applied first, which makes it the innermost wrapper:

@decorator_a
@decorator_b
def my_function():
    pass

# Equivalent to: my_function = decorator_a(decorator_b(my_function))

This is common in web frameworks, where you might stack authentication, rate-limiting, and logging decorators on a single endpoint. The order matters: if decorator_b adds authentication and decorator_a adds logging, the log will capture unauthenticated requests that are rejected. Reverse them and you only log authenticated ones.
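A runnable sketch with two minimal decorators (hypothetical names outer and inner) makes the call-time order visible:

```python
import functools

def outer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print("outer: before")
        result = func(*args, **kwargs)
        print("outer: after")
        return result
    return wrapper

def inner(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print("inner: before")
        result = func(*args, **kwargs)
        print("inner: after")
        return result
    return wrapper

@outer
@inner
def greet():
    print("hello")

greet()
# Output: outer: before
#         inner: before
#         hello
#         inner: after
#         outer: after
```

Because inner is applied first, it sits closest to greet; at call time the outermost wrapper, outer, runs first on the way in and last on the way out.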

Context Managers with @contextmanager

Every time you write with open("file.txt") as f:, you are using a context manager — an object that sets something up when you enter the with block and tears it down when you leave, even if an exception occurs. You can create your own using the @contextmanager decorator from contextlib:

from contextlib import contextmanager
import time

@contextmanager
def timer(label):
    start = time.time()
    try:
        yield
    finally:
        elapsed = time.time() - start
        print(f"{label}: {elapsed:.3f} seconds")

with timer("Processing"):
    total = sum(range(10_000_000))
# Output: Processing: 0.312 seconds

The yield statement marks the boundary between setup and teardown. Everything before yield runs when entering the with block; everything after runs when leaving it. Wrapping the yield in try/finally guarantees the teardown runs even when the block raises an exception. If you need to pass a value into the block, yield it: yield connection makes the value available as the as variable.
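As a sketch of yielding a value, here is the built-in file behaviour reimplemented by hand; the try/finally guarantees the close runs even if the block raises:

```python
from contextlib import contextmanager

@contextmanager
def opened(path):
    f = open(path)      # setup: acquire the resource
    try:
        yield f         # this value becomes the `as` variable
    finally:
        f.close()       # teardown: runs even on exception

# Used exactly like the built-in (hypothetical filename):
# with opened("notes.txt") as f:
#     print(f.read())
```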

This pattern eliminates an enormous class of resource-management bugs — forgotten file closes, unreleased locks, unclosed database connections — and is one of the features that makes Python code so much more robust than it first appears.

Decorators, generators, and iterators are the connective tissue of advanced Python. Generators let you work with data streams of any size without drowning in memory. Decorators let you separate cross-cutting concerns — logging, caching, validation — from business logic. And the iterator protocol ties it all together, giving Python's for loop the ability to traverse anything. Master these three concepts and you will find that the most elegant solutions to complex problems are often shorter, not longer, than the brute-force alternatives.