Data Structures — Textbook of Python

Learning Objectives

Create and manipulate lists using indexing, slicing, and common methods
Compare tuples and lists, and apply tuple unpacking in practical contexts
Construct dictionaries and use dictionary comprehensions to transform data
Apply set operations to solve problems involving uniqueness and membership
Select the appropriate data structure for a given problem based on its characteristics

A single variable holds a single value — one number, one string, one truth. But the interesting problems in programming almost always involve collections: a list of students, a mapping of country codes to names, a set of unique tags. Python ships with four built-in collection types that, between them, cover the vast majority of what you will ever need. Learning when to reach for each one is as important as learning how to use them.

Lists

A list is an ordered, mutable collection of items. You create one with square brackets:

fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3, 4, 5]
mixed = [42, "hello", True, 3.14]
empty = []

Lists can hold any mix of types, though in practice you usually keep them homogeneous. They preserve insertion order and allow duplicates.

Indexing works from zero, and negative indices count from the end:

print(fruits[0])     # Output: apple
print(fruits[-1])    # Output: cherry

Slicing extracts a portion of the list. The syntax is list[start:stop:step], where stop is exclusive:

numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[2:5])      # Output: [2, 3, 4]
print(numbers[:3])        # Output: [0, 1, 2]
print(numbers[7:])        # Output: [7, 8, 9]
print(numbers[::2])       # Output: [0, 2, 4, 6, 8]
print(numbers[::-1])      # Output: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Because lists are mutable, you can change them in place:

fruits[1] = "blueberry"
print(fruits)    # Output: ['apple', 'blueberry', 'cherry']

fruits.append("date")
fruits.insert(1, "avocado")
fruits.remove("cherry")
last = fruits.pop()          # removes and returns the last item

Other essential methods include sort() (sorts in place), reverse() (reverses in place), extend() (appends every item from another iterable), index() (returns the position of the first match), and count() (counts occurrences). The built-in function len() returns the length.
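
As a brief illustration of those methods, the sketch below runs each one on a small list of made-up scores (the name scores and its values are assumptions for the example, not data from earlier in the chapter):

```python
scores = [72, 95, 88, 72]

scores.sort()                 # sorts in place and returns None
print(scores)                 # Output: [72, 72, 88, 95]

scores.reverse()              # reverses in place
print(scores)                 # Output: [95, 88, 72, 72]

scores.extend([60, 100])      # adds each item of the iterable to the end
print(scores)                 # Output: [95, 88, 72, 72, 60, 100]

print(scores.index(88))       # Output: 1 (position of the first match)
print(scores.count(72))       # Output: 2
print(len(scores))            # Output: 6
```

Note that sort() and reverse() modify the list and return None, so writing scores = scores.sort() would discard the list.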

Tuples

A tuple is an ordered, immutable collection. You create one with parentheses — or just commas:

point = (3, 7)
rgb = (255, 128, 0)
singleton = (42,)      # the comma makes it a tuple, not just parentheses
also_tuple = 1, 2, 3   # parentheses are optional

Tuples support indexing and slicing exactly like lists, but you cannot change them after creation. No append, no remove, no item assignment:

point[0] = 5    # TypeError: 'tuple' object does not support item assignment

Why use a type that does less? Because immutability is a feature, not a limitation. Tuples are hashable as long as every element is hashable (so they can be dictionary keys), they signal intent ("this data should not change"), and they use slightly less memory than lists of the same length.
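
To make the hashability point concrete, here is a minimal sketch that uses coordinate tuples as dictionary keys. The readings dictionary and its values are hypothetical:

```python
# Hypothetical grid of readings keyed by (row, column) coordinates
readings = {(0, 0): 1.5, (0, 1): 2.0, (1, 0): 0.5}
print(readings[(0, 1)])    # Output: 2.0

try:
    bad = {[0, 0]: 1.5}    # a list is mutable, so it cannot be hashed
except TypeError:
    print("lists cannot be dictionary keys")
```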

Tuple unpacking is one of Python's most elegant features. You can assign a tuple's elements to multiple variables in a single statement:

x, y = (3, 7)
name, age, city = ("Alice", 30, "London")

# Swap two variables — no temporary needed
a, b = 1, 2
a, b = b, a
print(a, b)    # Output: 2 1
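
Unpacking also turns up whenever a function or iterator hands back tuples. The sketch below uses two built-ins, divmod() and enumerate(), with made-up values:

```python
# divmod returns a (quotient, remainder) tuple, unpacked in one statement
minutes, seconds = divmod(125, 60)
print(minutes, seconds)    # Output: 2 5

# enumerate yields (index, item) tuples, unpacked in the loop header
for position, name in enumerate(["Alice", "Bob"], start=1):
    print(position, name)
# Output:
# 1 Alice
# 2 Bob
```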

The collections module provides namedtuple, which gives each position a name:

from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 7)
print(p.x, p.y)    # Output: 3 7

Named tuples combine the immutability of tuples with the readability of attribute access. They are ideal for simple data records where a full class would be overkill.
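
A short sketch of what that combination looks like in practice, reusing the Point type from above. The variable moved is introduced only for this example:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 7)

x, y = p                   # unpacks like an ordinary tuple
print(p[0], x, y)          # Output: 3 3 7

try:
    p.x = 10               # fields cannot be reassigned
except AttributeError:
    print("Point is immutable")

moved = p._replace(x=10)   # builds a new Point; p is unchanged
print(moved)               # Output: Point(x=10, y=7)
print(p)                   # Output: Point(x=3, y=7)
```

Because _replace() returns a new object, the original record can be shared safely between parts of a program.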

Dictionaries

A dictionary is a mutable collection of key-value pairs. Since Python 3.7, dictionaries are guaranteed to preserve insertion order; earlier versions made no such guarantee. You create one with curly braces:

student = {
    "name": "Alice",
    "age": 20,
    "grades": [85, 92, 78]
}

print(student["name"])     # Output: Alice

Keys must be hashable (strings, numbers, and tuples of hashable items are fine; lists are not). Values can be anything. Accessing a missing key with [] raises a KeyError, so use .get() for safe access:

print(student.get("email", "not provided"))  # Output: not provided

Adding and updating entries uses the same syntax:

student["email"] = "alice@example.com"    # add
student["age"] = 21                        # update

Essential methods include keys(), values(), items() (returns key-value pairs), pop() (remove and return), and update() (merge another dict).
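
The following sketch exercises each of those methods on a small stand-alone dictionary. The name record and its contents are assumptions for the example:

```python
record = {"name": "Alice", "age": 20}

record.update({"age": 21, "city": "London"})   # overwrite age, add city
print(list(record.keys()))      # Output: ['name', 'age', 'city']
print(list(record.values()))    # Output: ['Alice', 21, 'London']
print(list(record.items()))     # Output: [('name', 'Alice'), ('age', 21), ('city', 'London')]

city = record.pop("city")       # removes the key and returns its value
print(city)                     # Output: London

# A second argument to pop() is returned when the key is missing,
# which avoids a KeyError
print(record.pop("email", None))   # Output: None
```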

Iterating over a dictionary gives you the keys by default:

for key in student:
    print(f"{key}: {student[key]}")

# More idiomatic: iterate over items
for key, value in student.items():
    print(f"{key}: {value}")

Dictionary comprehensions let you build dictionaries concisely:

squares = {n: n ** 2 for n in range(6)}
print(squares)    # Output: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Invert a dictionary (assumes the values are unique and hashable)
inverted = {v: k for k, v in squares.items()}
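
Inversion works cleanly for squares because every value is distinct. When values repeat, later keys silently overwrite earlier ones. The sketch below shows the effect and one possible workaround, using a hypothetical codes dictionary:

```python
codes = {"GB": "pound", "FR": "euro", "DE": "euro"}   # hypothetical data

# Naive inversion: "euro" appears twice, so only the last key survives
inverted = {v: k for k, v in codes.items()}
print(inverted)    # Output: {'pound': 'GB', 'euro': 'DE'}

# One alternative: collect every key that shares a value
grouped = {}
for code, currency in codes.items():
    grouped.setdefault(currency, []).append(code)
print(grouped)     # Output: {'pound': ['GB'], 'euro': ['FR', 'DE']}
```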

Sets

A set is an unordered collection of unique, hashable elements. You create one with curly braces or the set() constructor. Note that {} creates an empty dictionary, so an empty set must be written set():

colours = {"red", "green", "blue"}
numbers = set([1, 2, 2, 3, 3, 3])
print(numbers)    # Output: {1, 2, 3} — duplicates removed

The defining property of sets is uniqueness. Adding a duplicate has no effect. This makes sets perfect for deduplication, membership testing, and mathematical set operations.

a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

print(a | b)     # Union: {1, 2, 3, 4, 5, 6}
print(a & b)     # Intersection: {3, 4}
print(a - b)     # Difference: {1, 2}
print(a ^ b)     # Symmetric difference: {1, 2, 5, 6}
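
Deduplication, the other use mentioned above, can be sketched as follows. The tags list is made up for the example, and dict.fromkeys is shown as one way to keep first-seen order, since a set does not guarantee any order:

```python
tags = ["python", "data", "python", "tutorial", "data"]

unique_tags = set(tags)
print(len(unique_tags))            # Output: 3

unique_tags.add("python")          # already present, so nothing changes
print(len(unique_tags))            # Output: 3

# Sets have no guaranteed order; dict.fromkeys keeps first-seen order
ordered_unique = list(dict.fromkeys(tags))
print(ordered_unique)              # Output: ['python', 'data', 'tutorial']
```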

Membership testing with in is fast on sets: O(1) on average, compared to O(n) for lists. If you need to check "is this value in this collection?" repeatedly, convert your list to a set first. The conversion itself is O(n) and requires hashable elements, so it pays off only when you run many checks.
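
A minimal sketch of that pattern, assuming a hypothetical list of blocked user IDs that is checked against a stream of incoming IDs:

```python
# Hypothetical list of blocked user IDs and a stream of incoming IDs
blocked_list = [1007, 2041, 3310, 4512]
incoming = [2041, 9999, 1007, 5000]

blocked = set(blocked_list)        # convert once, before the repeated checks

# Each "not in blocked" test is a hash lookup rather than a scan
allowed = [user_id for user_id in incoming if user_id not in blocked]
print(allowed)                     # Output: [9999, 5000]
```

With only four items the difference is negligible; the benefit grows with the size of the collection and the number of lookups.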

A frozenset is an immutable set. It relates to a set much as a tuple relates to a list: because it cannot change, a frozenset is hashable and can be used as a dictionary key or as an element of another set.
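
One place this is handy is keying a dictionary by an unordered pair. The sketch below assumes a hypothetical table of travel times between cities:

```python
# Hypothetical travel times keyed by an unordered pair of cities
travel_minutes = {
    frozenset({"London", "Paris"}): 140,
    frozenset({"Paris", "Berlin"}): 105,
}

# The order of the pair does not matter when looking it up
print(travel_minutes[frozenset({"Paris", "London"})])    # Output: 140

pair = frozenset({"London", "Paris"})
try:
    pair.add("Rome")               # frozensets have no mutating methods
except AttributeError:
    print("frozenset cannot be changed")
```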

List Comprehensions

List comprehensions are a concise way to create lists from existing iterables:

squares = [x ** 2 for x in range(10)]
print(squares)    # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

You can add a condition to filter elements:

evens = [x for x in range(20) if x % 2 == 0]
print(evens)      # Output: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

And you can nest loops:

pairs = [(x, y) for x in range(3) for y in range(3)]
print(pairs)
# Output: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

Comprehensions exist for dictionaries and sets too:

word_lengths = {word: len(word) for word in ["hello", "world", "python"]}
unique_lengths = {len(word) for word in ["hello", "world", "python"]}

A comprehension replaces the pattern of creating an empty collection, looping, and appending. Use them when the logic fits on one or two lines. If the comprehension becomes deeply nested or hard to read, a regular loop is clearer.
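
To see the equivalence side by side, the sketch below writes the same filter-and-transform step both ways. The words list and variable names are chosen only for the example:

```python
words = ["hello", "world", "python", "is", "fun"]

# Loop version: create an empty list, loop, append
long_words_loop = []
for word in words:
    if len(word) > 3:
        long_words_loop.append(word.upper())

# Comprehension version of the same logic
long_words = [word.upper() for word in words if len(word) > 3]

print(long_words)                      # Output: ['HELLO', 'WORLD', 'PYTHON']
print(long_words == long_words_loop)   # Output: True
```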

Nested Data Structures

Python's data structures compose naturally. Lists of dictionaries, dictionaries of lists, sets inside tuples — any combination is valid:

students = [
    {"name": "Alice", "grades": [85, 92, 78]},
    {"name": "Bob", "grades": [90, 88, 95]},
    {"name": "Charlie", "grades": [70, 75, 80]},
]

for student in students:
    average = sum(student["grades"]) / len(student["grades"])
    print(f"{student['name']}: {average:.1f}")
# Output:
# Alice: 85.0
# Bob: 91.0
# Charlie: 75.0

Deeply nested structures can become hard to navigate. When you find yourself writing data["users"][0]["address"]["city"], it might be time to define a class, for example with the standard library's dataclasses module. For configuration data, JSON payloads, and quick scripts, though, nested dicts and lists are perfectly pragmatic.
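
As a rough sketch of the class-based alternative, the example below models the same lookup with the standard library's dataclasses module. The User and Address classes and their fields are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Address:
    city: str
    postcode: str

@dataclass
class User:
    name: str
    address: Address

# Hypothetical record, first as nested dicts and then as dataclasses
data = {"users": [{"name": "Alice", "address": {"city": "London", "postcode": "N1"}}]}
print(data["users"][0]["address"]["city"])    # Output: London

user = User(name="Alice", address=Address(city="London", postcode="N1"))
print(user.address.city)                      # Output: London
```

Attribute access makes typos fail loudly with an AttributeError that names the missing field, and editors can autocomplete the field names.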

Choosing the Right Structure

Picking the right collection is less about memorising rules and more about understanding what operations you need.

Use a list when you have an ordered sequence of items that may change — a queue of tasks, a series of measurements, rows from a file. Use a tuple when the sequence is fixed and you want to signal that — coordinates, RGB colours, function return values. Use a dictionary when you need to look things up by a key — configurations, caches, any name-to-value mapping. Use a set when uniqueness matters or you need fast membership tests — deduplicating data, tracking seen items, computing intersections.
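
The guidelines above can be seen together in one small sketch. It assumes a hypothetical page-visit log in which each record is a fixed (user, page) tuple:

```python
# Hypothetical page-visit log: each record is a fixed (user, page) tuple
visits = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/docs"),
    ("alice", "/home"),
]

pages_in_order = [page for _, page in visits]   # list: order and duplicates matter
unique_users = {user for user, _ in visits}     # set: uniqueness matters

visits_per_user = {}                            # dict: look up a count by name
for user, _ in visits:
    visits_per_user[user] = visits_per_user.get(user, 0) + 1

print(pages_in_order)          # Output: ['/home', '/home', '/docs', '/home']
print(sorted(unique_users))    # Output: ['alice', 'bob']
print(visits_per_user)         # Output: {'alice': 3, 'bob': 1}
```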

The collections Module

The standard library's collections module provides specialised containers that extend the built-ins.

Counter counts hashable objects:

from collections import Counter

words = "the cat sat on the mat the cat".split()
counts = Counter(words)
print(counts.most_common(2))    # Output: [('the', 3), ('cat', 2)]

defaultdict provides a default value for missing keys, eliminating the need for "check-then-set" patterns:

from collections import defaultdict

grouped = defaultdict(list)
for word in ["apple", "banana", "avocado", "blueberry", "cherry"]:
    grouped[word[0]].append(word)
print(dict(grouped))
# Output: {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
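
For comparison, here is a sketch of the "check-then-set" pattern that defaultdict removes, written with a plain dictionary next to the defaultdict version. The name grouped_plain is introduced only for this example:

```python
from collections import defaultdict

words = ["apple", "banana", "avocado", "blueberry", "cherry"]

# Plain dict: check for the key, create the list, then append
grouped_plain = {}
for word in words:
    first = word[0]
    if first not in grouped_plain:
        grouped_plain[first] = []
    grouped_plain[first].append(word)

# defaultdict: the missing list is created on first access
grouped = defaultdict(list)
for word in words:
    grouped[word[0]].append(word)

print(grouped_plain == dict(grouped))    # Output: True
```

One caveat: merely reading a missing key from a defaultdict, as in grouped["z"], also creates an entry for it.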

deque (double-ended queue) provides O(1) appends and pops from both ends, unlike lists, where pop(0) and insert(0, x) are O(n):

from collections import deque

queue = deque(["first", "second", "third"])
queue.append("fourth")       # add to right
queue.appendleft("zeroth")   # add to left
queue.popleft()               # remove from left: "zeroth"
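
The sketch below repeats those operations with the results printed, then shows the optional maxlen argument, which is one way to keep a bounded window of recent items. The recent deque and its readings are made up for the example:

```python
from collections import deque

queue = deque(["first", "second", "third"])
queue.append("fourth")
queue.appendleft("zeroth")
removed = queue.popleft()
print(removed)         # Output: zeroth
print(list(queue))     # Output: ['first', 'second', 'third', 'fourth']

# With maxlen set, appending to a full deque discards from the opposite end
recent = deque(maxlen=3)
for reading in [1, 2, 3, 4, 5]:
    recent.append(reading)
print(list(recent))    # Output: [3, 4, 5]
```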

These specialised containers are not things you need every day, but knowing they exist saves you from reinventing them when you do need them.

Data structures are the skeleton of every program. The algorithms you write are only as good as the structures they operate on — and in Python, the built-in structures are so capable that you can go remarkably far without ever importing anything. Choose well, and your code will be fast, clear, and a pleasure to maintain. Choose poorly, and you will spend your time fighting the data instead of solving the problem.