A Series is a one-dimensional labeled array; a DataFrame is a collection of Series sharing a row index. Both preserve labels through operations—unlike raw NumPy arrays.
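To make the contrast with NumPy concrete, here is a minimal sketch with made-up values: two Series carry the same labels in a different order, so adding them by label gives a different result than adding their underlying arrays by position. The variable names are illustrative only.

```python
import pandas as pd

# Two Series with the same labels in a different order (values are made up).
a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20, 30], index=['z', 'y', 'x'])

# pandas matches values by label before adding.
by_label = a + b

# The raw arrays carry no labels, so NumPy adds element by element in position order.
by_position = a.to_numpy() + b.to_numpy()

print(by_label)
print(by_position)
```

Here label 'x' pairs 1 with 30 in the pandas result, while the NumPy result pairs 1 with 10 because both sit in position 0.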
Creating a Series
import pandas as pd
import numpy as np
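# Values, custom index labels, and a name for the Series
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'], name='score')
print(s)
# Inspect the row labels and the element type
print(s.index, s.dtype)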
Creating a DataFrame
- From dict of lists: pd.DataFrame({'col': [1, 2, 3]})
- From dict of Series: columns align on index automatically
- From 2D NumPy array: pass columns= and index=
- From list of dicts (JSON-like records): pd.DataFrame(records)
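The four construction routes listed above can be sketched side by side. All column names, labels, and values below are invented for illustration, and `records` stands in for whatever JSON-like data you actually have.

```python
import numpy as np
import pandas as pd

# 1. Dict of lists: each key becomes a column.
df_lists = pd.DataFrame({'name': ['Ann', 'Bob'], 'score': [90, 85]})

# 2. Dict of Series: rows are matched on index labels, not on position.
df_series = pd.DataFrame({
    'math': pd.Series([90, 85], index=['ann', 'bob']),
    'art': pd.Series([95, 70], index=['bob', 'ann']),
})

# 3. 2D NumPy array: supply the column and row labels explicitly.
df_array = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    columns=['a', 'b', 'c'],
    index=['r1', 'r2'],
)

# 4. List of dicts: each dict is one row; a key missing from a record becomes NaN.
records = [{'id': 1, 'city': 'Oslo'}, {'id': 2}]
df_records = pd.DataFrame(records)

for frame in (df_lists, df_series, df_array, df_records):
    print(frame, end='\n\n')
```

In the second frame, the row labeled 'ann' gets art = 70 even though 70 is the second value in that Series, because the match is by label.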
Key attributes
- .index: row labels
- .columns: column names (DataFrame only)
- .values / .to_numpy(): underlying array
- .shape, .dtypes, .ndim
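A short sketch of these attributes on a small, made-up DataFrame; the column names and values are assumptions chosen only to have one text column and one numeric column.

```python
import pandas as pd

df = pd.DataFrame(
    {'name': ['Ann', 'Bob', 'Cy'], 'score': [90.0, 85.5, 77.0]},
    index=['a', 'b', 'c'],
)

print(df.index)      # row labels
print(df.columns)    # column names, as an Index object
print(df.shape)      # (number of rows, number of columns)
print(df.ndim)       # 2 for a DataFrame, 1 for a Series
print(df.dtypes)     # one dtype per column

# Column names as a plain Python list.
column_names = list(df.columns)

# Underlying NumPy array of a single column.
scores = df['score'].to_numpy()
print(column_names, scores)
```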
Important interview questions and answers
- Q: Dict of lists vs list of dicts?
A: Dict of lists = column-oriented; list of dicts = row-oriented (API JSON). Both produce DataFrames.
- Q: Shared index?
A: When building from Series, pandas aligns rows by index label; missing labels become NaN.
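The alignment answer is easy to verify with two Series whose labels only partly overlap. This is a minimal sketch with invented names and scores.

```python
import pandas as pd

# 'bob' appears in both Series; 'ann' and 'cy' appear in only one each.
math = pd.Series([90, 85], index=['ann', 'bob'])
art = pd.Series([70, 95], index=['bob', 'cy'])

df = pd.DataFrame({'math': math, 'art': art})
print(df)

# Labels missing from one Series show up as NaN in that column.
print(df.isna())
```

The resulting frame has one row per label found in either Series, and the integer columns are converted to float so they can hold NaN.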
Self-check
- Create a Series with a custom index.
- How do you list all column names?
Tip: Print type(df['col']) and type(df[['col']]) once. Single brackets return a Series and double brackets return a DataFrame; knowing the difference prevents selection bugs.
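The tip can be tried directly. This sketch assumes a small throwaway DataFrame with a column named 'col'; the second column exists only to make the frame look realistic.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3], 'other': [4, 5, 6]})

single = df['col']      # single brackets: one column as a Series
double = df[['col']]    # double brackets: a one-column DataFrame

print(type(single), single.shape)
print(type(double), double.shape)
```

The shapes differ as well: the Series is one-dimensional, while the one-column DataFrame keeps two dimensions.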
Interview prep
- Shared index?
Building a DataFrame from a dict of Series aligns rows by index label.
- columns attribute?
An Index object listing the DataFrame's column names.