pivot / pivot_table widen data (long → wide); melt tidies data (wide → long). Essential for report-style tables and tidy-data pipelines.
pivot_table
import pandas as pd
df = pd.DataFrame({'date': ['Jan','Jan','Feb'],
'product': ['A','B','A'],
'sales': [10, 20, 15]})
wide = df.pivot_table(index='date', columns='product',
values='sales', aggfunc='sum', fill_value=0)
print(wide)
melt
long = wide.reset_index().melt(id_vars='date',
var_name='product', value_name='sales')
print(long)
When to use
- pivot_table — cross-tab summaries with aggregation (handles duplicates)
- pivot — strict reshape when index/column pairs are unique
- melt — stack columns into variable/value pairs for plotting or modeling
Important interview questions and answers
- Q: pivot vs pivot_table?
A: pivot_table aggregates duplicates; pivot errors on duplicate index/column keys. - Q: Tidy data?
A: One row per observation—melt helps convert wide spreadsheets to tidy form.
Self-check
- Create a pivot_table of sales by date and product.
- Melt a wide table back to long format.
Pitfall: Use pivot_table when duplicate index/column pairs exist—strict pivot errors.
Interview prep
- pivot_table vs pivot?
pivot_table aggregates duplicates; strict pivot errors on dup keys.
- melt?
Wide to long—tidy data for modeling and plotting.