Categorical data

Last reviewed Jun 1, 2026 Content v20260601
Track mode: server_script
Means: Server runner
Reading: ~1 min
Level: intermediate

This lesson

This lesson covers categorical data in Pandas: the category dtype, how to create ordered and unordered categoricals, and when the conversion pays off in real-world tables.

Categorical columns turn up in most Pandas projects that handle labelled data; overlooking the dtype tends to cost memory and makes label errors harder to spot in analysis and reviews.

You will apply categorical data in contexts such as CSV/Parquet analysis, ETL notebooks, and ad hoc reporting.

Read the narrative, run the `import pandas as pd` snippets with in-memory DataFrames (install pandas and numpy with pip if needed), inspect `.head()` and `.dtypes`, and complete the MCQs.

Start this lesson when you can explain the previous lesson's ideas in your own words.

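The category dtype stores low-cardinality strings efficiently: each distinct label is kept once and the rows hold small integer codes. When declared as ordered, it also supports order-based comparisons. Convert with astype('category') or pd.Categorical when you have a fixed set of labels.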

Why categorical?

  • Lower memory than object strings when values repeat
  • Often faster groupby and sort on known categories
  • Enforce valid values (labels outside the declared categories, such as typos, become NaN on conversion; assigning one later raises an error)
  • Ordered categories for ranking (small < medium < large)
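The first and third points above can be checked directly. The following is a minimal sketch, not a benchmark: the label values and the repeat count are made up for illustration, and the exact byte counts will vary with the pandas version.

```python
import pandas as pd

# Hypothetical data: three labels repeated many times, stored as object strings.
labels = pd.Series(['small', 'medium', 'large'] * 10_000, dtype=object)
as_category = labels.astype('category')

# deep=True counts the string payloads, not just the pointers.
object_bytes = labels.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(object_bytes, category_bytes)

# With an explicit category list, a value outside it (the typo 'smal')
# becomes NaN when the Categorical is built.
checked = pd.Categorical(['small', 'smal', 'large'],
                         categories=['small', 'medium', 'large'])
print(checked.isna())
```

Comparing the two printed byte counts shows the saving for repeated labels; the boolean mask marks the position of the typo.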

Creating ordered categories

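import pandas as pd

# Business order, smallest to largest
size_order = ['S', 'M', 'L']
cat = pd.Categorical(['M', 'S', 'L'], categories=size_order, ordered=True)
# Sorts by the declared order (S < M < L), not alphabetically
print(cat.sort_values())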
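Beyond sorting, an ordered categorical can be compared against one of its own labels. A short sketch reusing the same size labels (the variable names here are illustrative):

```python
import pandas as pd

size_order = ['S', 'M', 'L']
sizes = pd.Categorical(['M', 'S', 'L'], categories=size_order, ordered=True)

# Comparison with a label from the category list follows the declared order.
at_least_medium = sizes >= 'M'
print(at_least_medium)

# min() and max() are defined because the categorical is ordered.
smallest, largest = sizes.min(), sizes.max()
print(smallest, largest)
```

The same comparison on an unordered categorical raises a TypeError, which is why ordered=True matters for ranking.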

In DataFrames

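import pandas as pd

df = pd.DataFrame({'size': ['M', 'S', 'M', 'L']})
df['size'] = df['size'].astype('category')
# Categories are inferred and sorted lexically (L, M, S); they are unordered by default
print(df['size'].cat.categories)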
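Because astype('category') infers unordered categories, a column that needs a business order has to be given one explicitly. One way to do that, sketched here with the same sample column, is to pass a CategoricalDtype to astype:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({'size': ['M', 'S', 'M', 'L']})

# Declare both the allowed labels and their order.
size_dtype = CategoricalDtype(categories=['S', 'M', 'L'], ordered=True)
df['size'] = df['size'].astype(size_dtype)

# Sorting now follows S < M < L rather than alphabetical order.
sorted_sizes = df.sort_values('size')['size'].tolist()
# Each row is stored as the integer position of its label in the category list.
codes = df['size'].cat.codes.tolist()
print(sorted_sizes, codes)
```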

Important interview questions and answers

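  1. Q: When should you not use category?
    A: For high-cardinality, mostly unique strings (user IDs, timestamps); object or string dtype is the better fit because there is little repetition to exploit.
  2. Q: What does the cat accessor provide?
    A: .cat.categories (the labels), .cat.codes (the integer codes), and .cat.reorder_categories to set the order used by ordered operations.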
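The accessor methods named in the second answer can be tried on a small Series. This is an illustrative sketch; the variable names are not part of the lesson.

```python
import pandas as pd

s = pd.Series(['M', 'S', 'M', 'L'], dtype='category')

# Inferred categories come back in lexical order.
inferred = list(s.cat.categories)

# reorder_categories returns a new Series; the same labels, a new order.
ranked = s.cat.reorder_categories(['S', 'M', 'L'], ordered=True)
ranked_codes = ranked.cat.codes.tolist()

print(inferred)
print(list(ranked.cat.categories), ranked_codes)
```

Note that reorder_categories requires exactly the existing labels; it changes their order, not their membership.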

Self-check

  1. Convert a column to category dtype.
  2. Create an ordered categorical for sizes.

Tip: Convert low-cardinality strings to category before large groupby operations.
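A small sketch of grouping on a categorical column follows. The qty column and its values are hypothetical. The observed argument is passed explicitly because its default has changed across pandas versions: observed=False keeps declared categories that have no rows, while observed=True drops them.

```python
import pandas as pd

df = pd.DataFrame({
    'size': pd.Categorical(['M', 'S', 'M'],
                           categories=['S', 'M', 'L'], ordered=True),
    'qty': [2, 1, 3],  # hypothetical quantities
})

# Keep every declared category, including 'L', which has no rows.
all_sizes = df.groupby('size', observed=False)['qty'].sum()
# Keep only the categories that actually appear in the data.
seen_sizes = df.groupby('size', observed=True)['qty'].sum()

print(all_sizes)
print(seen_sizes)
```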

Interview prep

When to use category?

Low-cardinality, repeated strings, for lower memory and faster groupby.

Why ordered?

It enables sorting and comparing by business order (S < M < L).

