Learn Netverks

GroupBy intro

Last reviewed May 28, 2026 Content v20260528
Track mode: server_script
Means: Server runner
Reading: ~1 min
Level: intermediate

This lesson

An orientation to the Pandas track: Series, DataFrames, wrangling, groupby, and merges, with SciPy and sklearn to follow.

You need labeled-table fluency before moving on to sklearn and production ETL; without it, groupby, merge, and datetime bugs tend to dominate every sprint.

You will apply groupby in contexts such as cohort KPIs, funnel breakdowns, and executive summary tables.

Read the narrative, then run the `import pandas as pd` snippets against in-memory DataFrames (install pandas and numpy with pip if needed). Inspect `.head()` and `.dtypes`, complete the MCQs, and read the interview prep blocks. Print `df.shape`, `df.dtypes`, and `df.head()` after every transform, and verify row counts before and after joins or aggregations.

Take this lesson after /python/intro and /numpy/intro, when you are ready for labeled tables and daily wrangling workflows.

groupby splits a DataFrame into groups by key column(s), applies a function to each group, and combines the per-group results into a single object. It is the Pandas equivalent of SQL GROUP BY.

Split-apply-combine

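import pandas as pd

# Three rows: two for dept 'S', one for dept 'E'.
df = pd.DataFrame({'dept': ['S','S','E'], 'sales': [100, 150, 200]})
# Split rows by dept, sum sales within each group, combine into one Series.
totals = df.groupby('dept')['sales'].sum()
print(totals)  # dept E -> 200, dept S -> 250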
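
To make the three stages visible, the following sketch performs split, apply, and combine by hand and compares the result with groupby. It is an illustration of the idea under the same toy data, not how Pandas implements groupby internally; the names `pieces`, `sums`, and `manual` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'dept': ['S', 'S', 'E'], 'sales': [100, 150, 200]})

# Split: collect the rows that belong to each distinct key.
pieces = {key: df[df['dept'] == key] for key in df['dept'].unique()}

# Apply: reduce each piece to a single number.
sums = {key: int(piece['sales'].sum()) for key, piece in pieces.items()}

# Combine: assemble the per-group results into one labeled Series.
manual = pd.Series(sums, name='sales').sort_index()
manual.index.name = 'dept'

# groupby does all three stages in one call.
totals = df.groupby('dept')['sales'].sum()
print(manual.to_dict() == totals.to_dict())  # True
```

In practice you would always use groupby directly; the manual version only shows what each stage contributes.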

Multiple keys

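df = pd.DataFrame({'region': ['N','N','S'], 'dept': ['S','E','S'], 'sales': [100, 150, 200]})
print(df.groupby(['region', 'dept'])['sales'].mean())  # one mean per (region, dept) pair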
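
Grouping by a list of keys returns a result indexed by every key combination that occurs in the data. The sketch below uses invented rows (the `region` values and sales figures are assumptions for illustration) to show the two-level index and how to look up one combination.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    'region': ['N', 'N', 'N', 'S'],
    'dept':   ['S', 'S', 'E', 'S'],
    'sales':  [100, 150, 200, 300],
})

means = df.groupby(['region', 'dept'])['sales'].mean()
print(means.index.nlevels)      # 2: one index level per grouping key
print(means.loc[('N', 'S')])    # 125.0, the mean of 100 and 150

# Turn the index levels back into ordinary columns.
flat = means.reset_index()
print(list(flat.columns))       # ['region', 'dept', 'sales']
```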

as_index=False

By default the group keys become the index of the result. df.groupby('dept', as_index=False)['sales'].sum() keeps them as ordinary columns instead, which is easier for merges and plotting.
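
A side-by-side comparison may help here. This sketch runs the same aggregation with and without as_index=False and then merges the flat result with a lookup table; the `managers` table and its contents are hypothetical, invented only for this example.

```python
import pandas as pd

df = pd.DataFrame({'dept': ['S', 'S', 'E'], 'sales': [100, 150, 200]})

# Default: a Series whose index holds the group keys.
indexed = df.groupby('dept')['sales'].sum()
# as_index=False: a DataFrame with 'dept' as a regular column.
flat = df.groupby('dept', as_index=False)['sales'].sum()

print(type(indexed).__name__, indexed.index.name)  # Series dept
print(type(flat).__name__, list(flat.columns))     # DataFrame ['dept', 'sales']

# Hypothetical lookup table; the key column merges directly.
managers = pd.DataFrame({'dept': ['E', 'S'], 'manager': ['Ana', 'Bo']})
report = flat.merge(managers, on='dept')
print(report)
```

Calling .reset_index() on the default result produces the same column layout, so as_index=False is mainly a convenience that saves that extra step.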

Important interview questions and answers

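  1. Q: What is a groupby object?
    A: A lazy description of the split. Nothing is aggregated until you call an aggregation such as .sum(); inspect the mapping from group key to row labels with the .groups attribute.
  2. Q: How do you run multiple aggregations?
    A: Pass a list such as .agg(['sum', 'mean']), or a dict that maps each column to its function(s).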
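
Both answers can be checked in a few lines. The sketch below reuses the lesson's toy DataFrame; the output column names `total` and `average` are arbitrary choices for this example, and the keyword form shown last is Pandas' named aggregation.

```python
import pandas as pd

df = pd.DataFrame({'dept': ['S', 'S', 'E'], 'sales': [100, 150, 200]})

grouped = df.groupby('dept')       # no aggregation has run yet
print(type(grouped).__name__)      # DataFrameGroupBy
print(dict(grouped.groups))        # each key mapped to the row labels in its group

# A list of functions on one column: one output column per function.
stats = grouped['sales'].agg(['sum', 'mean'])
print(stats)

# Named aggregation: pick the output column names explicitly.
named = grouped.agg(total=('sales', 'sum'), average=('sales', 'mean'))
print(named)
```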

Self-check

  1. Group by department and sum sales.
  2. Why use as_index=False?

Tip: Use as_index=False so group keys stay columns for merges and plots.

Interview prep

Split-apply-combine?

Split by key, apply aggregation, combine into result.

as_index=False?

Keeps group keys as columns—easier for downstream merge/plot.
