- Write unit tests using both unittest.TestCase and pytest, and explain why pytest has become the modern default
- Apply the Arrange-Act-Assert pattern to structure clear, maintainable tests
- Use fixtures and parametrize in pytest to reduce duplication across test cases
- Mock external dependencies using unittest.mock to isolate the code under test
- Evaluate code coverage reports critically and explain why 100% coverage does not guarantee correctness
Untested code is broken code — you just do not know where yet. Every programmer has had the experience of making a small, innocent change that silently broke something three modules away. Tests are the only reliable defence against this. They are not bureaucratic overhead imposed by cautious managers; they are the thing that lets you change your code with confidence. Without tests, every refactor is a gamble. With them, it is an experiment with a known outcome.
Why Testing Matters
A test is a piece of code that runs another piece of code and checks that it behaves correctly. The simplest possible test is just an assertion:
```python
def add(a, b):
    return a + b

assert add(2, 3) == 5
assert add(-1, 1) == 0
assert add(0, 0) == 0
```
If any assertion fails, Python raises an AssertionError. This is testing at its most primitive — and it already has value. But raw assert statements have limitations: they stop at the first failure, they produce unhelpful error messages, and they give you no way to organise, discover, or selectively run tests. That is where testing frameworks come in.
Testing matters for three reasons. First, tests catch bugs before your users do. Second, tests serve as executable documentation — they show exactly how your code is supposed to behave. Third, and most importantly, tests give you the freedom to refactor. Code without tests is code you are afraid to change.
unittest: The Built-in Framework
Python ships with unittest, a testing framework inspired by Java's JUnit. You write tests as methods on a class that inherits from unittest.TestCase:
```python
import unittest

def multiply(a, b):
    return a * b

class TestMultiply(unittest.TestCase):
    def test_positive_numbers(self):
        self.assertEqual(multiply(3, 4), 12)

    def test_zero(self):
        self.assertEqual(multiply(5, 0), 0)

    def test_negative(self):
        self.assertEqual(multiply(-2, 3), -6)

if __name__ == "__main__":
    unittest.main()
```
TestCase provides dozens of assertion methods: assertEqual, assertTrue, assertRaises, assertIn, assertAlmostEqual, and more. Each produces a clear error message when it fails. You run tests with python -m unittest, which discovers and runs all test files matching the pattern test*.py.
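A few of these assertion methods in action — a minimal sketch (the class name `TestAssertions` and its test bodies are ours, chosen for illustration):

```python
import unittest

class TestAssertions(unittest.TestCase):
    def test_raises(self):
        # assertRaises passes only if the expected exception is raised
        with self.assertRaises(ZeroDivisionError):
            1 / 0

    def test_membership(self):
        # assertIn checks containment and reports both operands on failure
        self.assertIn(3, [1, 2, 3])

    def test_floats(self):
        # assertAlmostEqual compares to 7 decimal places by default,
        # avoiding exact floating-point comparison
        self.assertAlmostEqual(0.1 + 0.2, 0.3)
```

Each of these would produce a far more informative failure message than a bare `assert` of the same condition.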
unittest works and it is always available. But it is verbose — you must create classes, inherit from TestCase, and use self.assertSomething instead of plain assert. For new projects, most Python developers reach for something leaner.
pytest: The Modern Standard
pytest is a third-party testing framework that has become the de facto standard for Python testing. Install it with pip install pytest. The reason for its dominance is simple: it makes testing feel effortless.
```python
# test_calculator.py
def add(a, b):
    return a + b

def test_add_positive():
    assert add(2, 3) == 5

def test_add_negative():
    assert add(-1, -1) == -2

def test_add_zero():
    assert add(0, 0) == 0
```
No classes. No inheritance. No special assertion methods. Just functions whose names start with test_ and plain assert statements. When an assertion fails, pytest introspects the expression and shows you exactly what went wrong — the left side, the right side, and the difference. Run with pytest from the command line and it discovers all test files automatically.
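Plain `assert` covers ordinary checks; for exceptions, pytest provides the `pytest.raises` context manager (`divide` here is a throwaway example function):

```python
import pytest

def divide(a, b):
    return a / b

def test_divide_by_zero():
    # The test passes only if ZeroDivisionError is raised inside the block
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)
```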
Fixtures let you share setup logic across tests without repeating yourself:
```python
import pytest

@pytest.fixture
def sample_users():
    return [
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25},
    ]

def test_user_count(sample_users):
    assert len(sample_users) == 2

def test_first_user_name(sample_users):
    assert sample_users[0]["name"] == "Alice"
```
A fixture is a function decorated with @pytest.fixture. Any test that names it as a parameter receives the fixture's return value automatically. Fixtures can set up databases, create temporary files, or configure anything your tests need — and they can clean up afterwards using yield.
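A fixture with cleanup might look like this sketch; the setup/teardown logic lives in a plain generator (`make_temp_config` and the `.ini` contents are invented for illustration) so the fixture simply delegates to it:

```python
import os
import tempfile
import pytest

def make_temp_config():
    # Setup: create a real temporary file on disk
    fd, path = tempfile.mkstemp(suffix=".ini")
    os.write(fd, b"[app]\ndebug = true\n")
    os.close(fd)
    try:
        yield path        # the test runs while the generator is suspended here
    finally:
        os.remove(path)   # Teardown: runs after the test finishes

@pytest.fixture
def temp_config():
    yield from make_temp_config()

def test_config_exists(temp_config):
    assert os.path.exists(temp_config)
```

Everything before the `yield` is setup; everything after it is teardown, and pytest runs the teardown even if the test fails.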
parametrize runs the same test logic with different inputs:
```python
@pytest.mark.parametrize("a, b, expected", [
    (2, 3, 5),
    (-1, 1, 0),
    (0, 0, 0),
    (100, -100, 0),
])
def test_add(a, b, expected):
    assert add(a, b) == expected
```
One test function, four test cases. pytest reports each combination separately, so you know exactly which inputs failed.
Writing Good Tests
A well-structured test follows the Arrange-Act-Assert pattern:
```python
def test_user_full_name():
    # Arrange — set up the data
    user = User(first="Ada", last="Lovelace")

    # Act — call the code under test
    result = user.full_name()

    # Assert — check the outcome
    assert result == "Ada Lovelace"
```
Arrange prepares the inputs and dependencies. Act calls the function or method being tested. Assert verifies the result. Each test should assert one concept — not necessarily one assert statement, but one logical idea. A test called test_user_creation might reasonably check both the name and the email, because both are part of the concept "the user was created correctly."
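For instance, a sketch with a hypothetical `make_user` factory (the `User` dataclass and email format are invented for illustration) — two `assert` statements, one concept:

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str

def make_user(name, domain="example.com"):
    # Hypothetical factory used only for this illustration
    return User(name=name, email=f"{name.lower()}@{domain}")

def test_user_creation():
    user = make_user("Ada")
    # Two asserts, one concept: "the user was created correctly"
    assert user.name == "Ada"
    assert user.email == "ada@example.com"
```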
Name your tests descriptively. test_add is acceptable; test_add_returns_sum_of_two_positive_integers is better. When a test fails at 2am, the name is the first thing you read. Make it count.
Keep tests independent. Each test should set up its own state, run in isolation, and clean up after itself. Tests that depend on execution order are fragile and confusing. If test B fails only when test A runs first, you have a shared state bug, not a code bug.
Test Discovery
Both unittest and pytest use conventions to find tests automatically. pytest looks for files named test_*.py or *_test.py, then collects functions starting with test_ and classes starting with Test. Place your tests in a tests/ directory:
```
my-project/
├── src/
│   └── mypackage/
│       ├── __init__.py
│       └── calculator.py
├── tests/
│   ├── test_calculator.py
│   └── test_utils.py
└── pyproject.toml
```
Run pytest from the project root and it finds everything. Run pytest tests/test_calculator.py to run a single file. Run pytest -k "add" to run only tests whose names contain "add". The -v flag shows each test name and its result.
Mocking
Sometimes the code you are testing depends on something you cannot or should not use in a test — a database, an API, the filesystem, or the current time. Mocking replaces these dependencies with controlled substitutes.
The unittest.mock module provides patch and MagicMock:
```python
from unittest.mock import patch

def get_weather(city):
    """Calls an external API — we don't want this in tests."""
    import requests
    response = requests.get(f"https://api.weather.com/{city}")
    return response.json()

def format_forecast(city):
    data = get_weather(city)
    return f"{city}: {data['temperature']}°C"

# In the test, we replace get_weather with a mock. Patch the name where
# it is looked up — this code lives in the main script, hence "__main__";
# in a real project you would patch "mypackage.weather.get_weather".
@patch("__main__.get_weather")
def test_format_forecast(mock_weather):
    mock_weather.return_value = {"temperature": 22, "condition": "sunny"}
    result = format_forecast("London")
    assert result == "London: 22°C"
    mock_weather.assert_called_once_with("London")
```
patch temporarily replaces a name in a module with a mock object. The mock records every call made to it — what arguments were passed, how many times it was called, and what it returned. This lets you test your code in isolation without hitting real services.
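A `MagicMock` can also be handed in directly, without `patch`, when the dependency arrives as a parameter. A sketch (`sync_status` and its payload are invented for illustration):

```python
from unittest.mock import MagicMock

def sync_status(fetch):
    # Code under test: depends on a callable, not a concrete HTTP client
    data = fetch("status", timeout=5)
    return data["state"]

def test_sync_status():
    fake_fetch = MagicMock(return_value={"state": "ok"})
    assert sync_status(fake_fetch) == "ok"
    # The mock recorded exactly how it was called
    fake_fetch.assert_called_once_with("status", timeout=5)
```

Designing functions to accept their dependencies like this often removes the need for `patch` entirely.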
Use mocking sparingly. Over-mocked tests are brittle and test the implementation rather than the behaviour. If you find yourself mocking everything, your code might have too many dependencies.
Code Coverage
Code coverage measures which lines of your code are executed during tests. The tool coverage.py (install with pip install coverage) tracks this:
```
coverage run -m pytest
coverage report
coverage html   # generates a browsable HTML report
```
The report shows each file, the number of lines, and the percentage executed. Lines that were never run are highlighted. Coverage is useful for finding untested code — if an entire function has zero coverage, you probably forgot to write tests for it.
But 100% coverage is a trap. It means every line ran, not that every line was tested correctly. A test that calls a function but never checks the return value achieves coverage without catching bugs. Coverage tells you what you definitely did not test; it cannot tell you that what you did test is sufficient. Treat coverage as a diagnostic tool, not a target.
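To see the trap concretely, consider this sketch (`apply_discount` and its bug are invented): the test executes every line, so coverage reports 100%, yet the bug survives because nothing is asserted.

```python
def apply_discount(price, rate):
    # Bug: should be price * (1 - rate)
    return price * rate

def test_apply_discount():
    apply_discount(100, 0.2)   # runs every line — 100% coverage, zero checking
```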
doctest: Tests in Documentation
doctest lets you embed tests directly in docstrings:
```python
def factorial(n):
    """Return the factorial of n.

    >>> factorial(5)
    120
    >>> factorial(0)
    1
    >>> factorial(1)
    1
    """
    if n <= 1:
        return 1
    return n * factorial(n - 1)
```
Run with python -m doctest mymodule.py or integrate with pytest using the --doctest-modules flag. Doctests serve double duty: they document how a function works and they verify that the documentation is accurate. The downside is that complex tests become unwieldy in docstrings. Use doctests for simple examples that illustrate usage; use pytest for thorough testing.
Test-Driven Development
Test-driven development (TDD) inverts the usual workflow: you write the test first, watch it fail, then write the minimum code to make it pass, then refactor. The cycle is called Red-Green-Refactor — red for a failing test, green for passing, refactor for cleanup.
TDD sounds extreme, but it has a subtle benefit beyond catching bugs. Writing the test first forces you to think about the interface before the implementation. What arguments should this function take? What should it return? What should happen with bad input? These are design questions, and answering them before you write the code produces cleaner, more usable APIs.
You do not have to practice TDD religiously. But try it. Write a test for a function that does not exist yet, watch the NameError, implement the function, watch the test pass. The rhythm is surprisingly satisfying, and the resulting code — code that was designed to be testable from the very first line — tends to be better structured than code written without tests in mind.
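A compressed version of that cycle, with an invented `slugify` function: the test exists before the implementation, fails with NameError (red), and the minimal implementation below turns it green.

```python
import re

# Red: this test is written first; running it before slugify exists
# fails with NameError
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"

# Green: the minimum implementation that makes the test pass
def slugify(text):
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)  # collapse punctuation and spaces
    return text.strip("-")
```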
Testing is not a phase that happens after development. It is part of development. The best programmers do not write code and then test it; they write tests and code together, in a continuous feedback loop that catches mistakes early and builds confidence with every passing suite. A comprehensive test suite is not a burden — it is a safety net that lets you move fast without breaking things.