Series — Glossary — Textbook of Python

A Series is pandas' one-dimensional labelled array. It can hold any data type: integers, floats, strings, arbitrary Python objects, or a mix (mixed-type data is typically stored with the generic object dtype). Each element has an index label, which defaults to the integer positions 0, 1, 2, ... if none is given. A DataFrame behaves much like a dictionary of Series that share the same index, so understanding Series is a prerequisite for understanding DataFrames.
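As a minimal sketch of the ideas above, the following builds one Series with the default index, one with explicit labels, and a DataFrame from two Series that share those labels. The city names, values, and column names are made up for illustration.

```python
import pandas as pd

# Default index: the labels are the integer positions 0, 1, 2.
temps = pd.Series([21.5, 19.0, 23.1])

# Explicit index labels (hypothetical city names).
labelled = pd.Series([21.5, 19.0, 23.1], index=["Leeds", "York", "Hull"])

# A second Series that shares the same index.
rain = pd.Series([3.2, 0.0, 1.4], index=["Leeds", "York", "Hull"])

# A DataFrame built from a dictionary of Series: each key becomes a column.
df = pd.DataFrame({"temp_c": labelled, "rain_mm": rain})

# Selecting one column gives back a Series carrying the shared index.
col = df["temp_c"]
print(type(col).__name__)
print(col["York"])
```

Selecting a column with df["temp_c"] returns a Series again, which is why the dictionary-of-Series picture is a useful way to think about a DataFrame.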

Series support vectorised operations (element-wise arithmetic without explicit loops) and alignment by index (operations between two Series match elements by label rather than by position, and a label present in only one of them produces NaN). They also provide a large set of methods and accessors, including .mean(), .sum(), .value_counts(), .apply(), .map(), .str (string methods), .dt (datetime properties and methods), .fillna(), .dropna(), and many more.
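The sketch below illustrates vectorised arithmetic, label alignment, and a few of the methods listed above. All values and labels are invented for the example.

```python
import pandas as pd

# Vectorised arithmetic: the operation applies to every element, no loop needed.
prices = pd.Series([10, 20, 30], index=["a", "b", "c"])
doubled = prices * 2

# Alignment by index: values are matched by label, not by position.
left = pd.Series([1, 2, 3], index=["x", "y", "z"])
right = pd.Series([10, 20, 30], index=["z", "y", "w"])
total = left + right  # "x" and "w" appear on one side only, so they become NaN

# Two ways of handling the missing values that alignment produced.
dropped = total.dropna()
filled = total.fillna(0)

# A few common methods and the .str accessor.
names = pd.Series(["ada", "grace", "ada"])
counts = names.value_counts()
upper = names.str.upper()
mapped = prices.map(lambda p: p + 1)

print(total)
print(counts)
```

Because alignment can introduce NaN, the result of left + right has a floating-point dtype even though both inputs hold integers.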

You can create a Series from a list (pd.Series([1, 2, 3])), a dictionary (pd.Series({'a': 1, 'b': 2}), whose keys become the index labels), a scalar (pd.Series(5, index=['x', 'y', 'z']), which repeats the value for each label), or by selecting a single column from a DataFrame (df['column_name']). Series are heavily used in data filtering (df['age'] > 30 produces a boolean Series, and df[df['age'] > 30] uses it as a mask to select the matching rows), aggregation, and plotting. They bridge the gap between NumPy arrays (indexed by position only, with no labels) and DataFrames (two-dimensional, tabular).
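The construction routes and the filtering idiom can be sketched as follows. The DataFrame contents are hypothetical, and the names mask and over_30 are chosen only for illustration.

```python
import pandas as pd

# Four ways to obtain a Series.
from_list = pd.Series([1, 2, 3])
from_dict = pd.Series({"a": 1, "b": 2})            # keys become index labels
from_scalar = pd.Series(5, index=["x", "y", "z"])  # value repeated per label

df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 35, 45]})
ages = df["age"]                                   # a single column is a Series

# The comparison yields a boolean Series with the same index as the column.
mask = df["age"] > 30

# Indexing the DataFrame with that mask keeps only the rows where it is True.
over_30 = df[mask]

print(mask)
print(over_30)
```

Note that the comparison is what produces the boolean Series. The outer df[...] then returns a DataFrame containing the selected rows.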

Related terms: DataFrame, pandas, NumPy

Discussed in:

Chapter 16: Working with Data — pandas: The Data Analysis Library
