A Series is pandas' one-dimensional labelled array. It can hold any data type: integers, floats, strings, or arbitrary Python objects. A Series has a single dtype, so a mix of types is stored under the generic object dtype. Each element has an index label (defaulting to 0, 1, 2, ... if not specified). A DataFrame is essentially a dictionary of Series that share the same index, so understanding Series is a prerequisite to understanding DataFrames.
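As a small illustration of the points above, the sketch below builds a few Series and a DataFrame from them. The variable names and values are made up for the example.

```python
import pandas as pd

# No index given, so labels default to 0, 1, 2.
ints = pd.Series([1, 2, 3])

# Mixed element types fall back to the generic object dtype.
mixed = pd.Series([1, "two", 3.0])

# A DataFrame built from a dict of Series that share one index.
# (Column names and values here are illustrative only.)
height = pd.Series([1.6, 1.8], index=["ann", "bob"])
weight = pd.Series([55, 80], index=["ann", "bob"])
df = pd.DataFrame({"height": height, "weight": weight})

# Selecting one column gives back a Series carrying the shared index.
column = df["height"]
```

Here `column` has the same labels as `df` itself, which is what lets operations across columns line up row by row.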
Series support vectorised operations (element-wise arithmetic without explicit loops) and alignment by index (operations between two Series automatically match by label, and a label present in only one operand produces NaN). They also provide a large set of methods, including .mean(), .sum(), .value_counts(), .apply(), .map(), .fillna(), and .dropna(), plus two accessors: .str for string methods and .dt for datetime properties.
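A minimal sketch of vectorised arithmetic and label alignment follows. The Series names, labels, and values are invented for the example.

```python
import pandas as pd

# Two Series whose labels only partly overlap (illustrative data).
prices = pd.Series([10.0, 20.0, 30.0], index=["a", "b", "c"])
quantities = pd.Series([1, 2, 3], index=["b", "c", "d"])

# Vectorised: the multiplication is applied to every element, no loop needed.
doubled = prices * 2

# Aligned: values are matched by label, not by position.
# 'b' and 'c' exist in both operands; 'a' and 'd' exist in only one,
# so their results are NaN.
total = prices * quantities

# Methods such as .fillna() and .sum() then work on the aligned result.
grand_total = total.fillna(0).sum()
```

Note that `total` is indexed by the union of both label sets, so positions in the result do not correspond to positions in either input.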
You create a Series from a list (pd.Series([1, 2, 3])), a dictionary (pd.Series({'a': 1, 'b': 2})), a scalar (pd.Series(5, index=['x', 'y', 'z'])), or by selecting a single column from a DataFrame (df['column_name']). Series are heavily used in data filtering, aggregation, and plotting. In a filter such as df[df['age'] > 30], the inner comparison df['age'] > 30 produces a boolean Series, and indexing the DataFrame with it keeps only the rows where that Series is True. Series bridge the gap between NumPy arrays (no labels, typically numerical) and DataFrames (two-dimensional, tabular).
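The construction routes and the filtering pattern described above can be sketched as follows. The DataFrame contents are made up for the example.

```python
import pandas as pd

# Four ways to obtain a Series.
from_list = pd.Series([1, 2, 3])                    # default labels 0, 1, 2
from_dict = pd.Series({"a": 1, "b": 2})             # dict keys become labels
from_scalar = pd.Series(5, index=["x", "y", "z"])   # scalar repeated per label

# Illustrative DataFrame; the names and ages are invented.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 35, 45]})
ages = df["age"]                                    # a single column is a Series

# Filtering in two steps: build the boolean mask, then apply it.
mask = df["age"] > 30      # boolean Series, one True/False per row
over_30 = df[mask]         # DataFrame containing only the rows where mask is True
```

Writing the mask as a separate variable is optional, but it makes clear that the comparison yields a Series while the outer indexing yields a DataFrame.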
Discussed in:
- Chapter 16: Working with Data — pandas: The Data Analysis Library