Learn Netverks

Aggregations with groupby

Last reviewed Jun 1, 2026 Content v20260601
Track mode: server_script (server runner)
Reading: ~1 min
Level: intermediate

This lesson

This lesson covers aggregations with groupby, part of the pandas track on tabular manipulation: indexing, dtypes, reshaping, and analysis habits for real-world tables.

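Split-apply-combine is the core analytics pattern. Two things commonly skew aggregates: grouping on the wrong keys, and, when the key is a categorical column, omitting `observed=True`, which can add result rows for categories that never occur in the data (the default depends on the pandas version).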
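To make the `observed=True` point concrete, here is a minimal sketch. It assumes a categorical key with one category ("C") that never occurs in the rows; the column names and values are invented for illustration. Passing `observed` explicitly avoids relying on the default.

```python
import pandas as pd

# Hypothetical data: "C" is a declared category but has no rows.
df = pd.DataFrame({
    "g": pd.Categorical(["A", "A", "B"], categories=["A", "B", "C"]),
    "x": [1, 2, 3],
})

# observed=False keeps every declared category, so "C" appears with a sum of 0.
with_unobserved = df.groupby("g", observed=False)["x"].sum()

# observed=True keeps only the categories that actually occur in the data.
only_observed = df.groupby("g", observed=True)["x"].sum()

print(with_unobserved)
print(only_observed)
```

The extra "C" row matters most for averages and row counts in summary tables, where an empty group can look like a real zero.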

You will apply aggregations with groupby in contexts such as cohort KPIs, funnel breakdowns, and executive summary tables.

Read the narrative, run the `import pandas as pd` snippets against in-memory DataFrames (install pandas and numpy with pip if needed), inspect `.head()` and `.dtypes`, and complete the MCQs. Also verify row counts before and after joins or aggregations.
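One way to check row counts around an aggregation is to compare the input length with the sum of the group sizes. This is a small sketch on invented data; the missing key in the last row is deliberate, because groupby drops rows whose key is missing by default.

```python
import pandas as pd

# Hypothetical data: the last row has a missing group key.
df = pd.DataFrame({"g": ["A", "A", "B", None], "x": [1, 2, 3, 4]})

rows_before = len(df)

# size() returns the number of rows in each group.
group_sizes = df.groupby("g").size()
rows_after = int(group_sizes.sum())

# Rows with a missing key explain the difference.
rows_with_missing_key = int(df["g"].isna().sum())

print(rows_before, rows_after, rows_with_missing_key)
```

If the two totals differ by more than the number of missing keys, something other than the default `dropna=True` behaviour is removing rows.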

Start this lesson when you can explain the previous lesson's ideas in your own words.

Go beyond single sum or mean: use agg for multiple functions, custom lambdas, and column-specific aggregations in one call.
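As a rough illustration of all three ideas in one call, the sketch below combines a built-in function, a custom lambda, and a different function on a second column. The frame and the output names (`x_sum`, `x_range`, `y_max`) are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"g": ["A", "A", "B"], "x": [1, 4, 3], "y": [10, 20, 30]})

summary = df.groupby("g").agg(
    x_sum=("x", "sum"),                          # built-in, referenced by name
    x_range=("x", lambda s: s.max() - s.min()),  # custom lambda, applied per group
    y_max=("y", "max"),                          # a different column and function
)
print(summary)
```

Built-in names such as 'sum' are usually faster than an equivalent lambda, so a lambda is best kept for logic that has no built-in.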

agg syntax

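import pandas as pd

# Small in-memory frame: one grouping key and two numeric columns.
df = pd.DataFrame({'g': ['A','A','B'], 'x': [1,2,3], 'y': [10,20,30]})

# Named aggregation: output_name=(input_column, function).
result = df.groupby('g').agg(
    x_sum=('x', 'sum'),
    y_mean=('y', 'mean'),
)
print(result)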

Built-in aggregations

  • sum, mean, median, std, count
  • min, max, first, last, nunique
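  • size: number of rows in each group, including rows with NaN values (rows whose group key is NaN are still dropped unless dropna=False)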
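The difference between size, count, and nunique shows up as soon as a value is missing. A minimal sketch on invented data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "A" has one missing value in x.
df = pd.DataFrame({"g": ["A", "A", "B"], "x": [1.0, np.nan, 3.0]})
grouped = df.groupby("g")

sizes = grouped.size()            # rows per group, NaN values included
counts = grouped["x"].count()     # non-NaN values of x per group
uniques = grouped["x"].nunique()  # distinct non-NaN values of x per group

print(pd.DataFrame({"size": sizes, "count": counts, "nunique": uniques}))
```

Here group "A" has a size of 2 but a count of 1, because one of its two rows has no value for x.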

Named aggregation (pandas ≥ 0.25)

Keyword arguments of the form name=(column, func) in agg produce readable, flat column names. This style is preferred in production over the MultiIndex columns produced by the list-of-funcs style.
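The contrast between the two styles can be seen by printing the resulting column labels. This sketch reuses the small frame from the agg syntax section; `pd.NamedAgg` is the explicit spelling of the same (column, func) pair.

```python
import pandas as pd

df = pd.DataFrame({"g": ["A", "A", "B"], "x": [1, 2, 3], "y": [10, 20, 30]})

# List-of-funcs style: every function is applied to every selected column,
# and the result has two-level (MultiIndex) columns.
multi = df.groupby("g")[["x", "y"]].agg(["sum", "mean"])

# Named aggregation: one flat, explicitly named column per keyword.
named = df.groupby("g").agg(
    x_sum=("x", "sum"),
    y_mean=pd.NamedAgg(column="y", aggfunc="mean"),
)

print(multi.columns.tolist())
print(named.columns.tolist())
```

Flat names are easier to pass on to joins, exports, and plotting code, which is the main practical argument for the named style.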

Important interview questions and answers

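  1. Q: What is the difference between count and size?
    A: count excludes NaN values and is computed per column; size counts all rows in each group, including rows with NaN values.
  2. Q: How do you apply different functions to different columns?
    A: Pass a dict to agg: {'col1': 'sum', 'col2': ['min', 'max']}.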
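The dict answer can be sketched as follows, using the `col1` and `col2` names from the answer on an invented frame. Because one entry is a list, the result has two-level columns; joining the levels with an underscore is one possible way to flatten them, not the only one.

```python
import pandas as pd

df = pd.DataFrame({
    "g": ["A", "A", "B"],
    "col1": [1, 2, 3],
    "col2": [10, 20, 30],
})

# Different functions per column; the list entry yields MultiIndex columns.
result = df.groupby("g").agg({"col1": "sum", "col2": ["min", "max"]})

# Optional: flatten ('col2', 'min') into 'col2_min' for easier downstream use.
flat = result.copy()
flat.columns = ["_".join(pair) for pair in result.columns]

print(result.columns.tolist())
print(flat)
```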

Self-check

  1. Compute sum and mean of one column by group.
  2. Use named aggregation syntax.

Tip: Named aggregation such as col_sum=('col', 'sum') produces readable output column names.

Interview prep

What is named aggregation?

Keyword arguments such as col_sum=('col', 'sum') passed to agg; the keyword becomes the output column name.
