Learn Netverks

Missing values intro

Last reviewed May 28, 2026 Content v20260528
Track mode: server_script (means: server runner)
Reading: ~1 min
Level: beginner

This lesson

An orientation to the Pandas track: Series, DataFrames, wrangling, groupby, merges, and links to SciPy/sklearn next.

You need labeled-table fluency before sklearn and production ETL; otherwise groupby, merge, and datetime bugs dominate every sprint.

You will apply missing-value handling in contexts such as CSV/Parquet analysis, ETL notebooks, and ad hoc reporting.

Read the narrative, run `import pandas as pd` snippets with in-memory DataFrames (install pandas and numpy with pip if needed), inspect `.head()` and `.dtypes`, and complete the MCQs. Also read the interview prep blocks, and print `df.shape`, `df.dtypes`, and `df.head()` after every transform.

Take this lesson after /python/intro and /numpy/intro, when you are ready for labeled tables and daily wrangling workflows.

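Pandas represents missing data with NaN (in float columns), NaT (in datetime columns), or pd.NA (in nullable extension dtypes); a None passed into a numeric column is converted to NaN. Detect missing values with isna(); handle them by dropping, filling, or forward-filling.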
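To make the difference between the markers concrete, here is a minimal sketch. The Series names are made up for illustration, and the marker stored in a string column depends on the pandas version and dtype, so the example only checks it through isna().

```python
import pandas as pd

# None in a numeric list is converted to NaN, which forces a float64 column.
floats = pd.Series([1, None, 3])
print(floats.dtype)    # float64

# The nullable "Int64" extension dtype keeps integers and stores pd.NA instead.
nullable = pd.Series([1, None, 3], dtype="Int64")
print(nullable.dtype)  # Int64

# A string column can hold a gap too; the stored marker varies by pandas version.
strings = pd.Series(["x", None, "z"])

# isna() flags the gap in all three, whatever the underlying marker is.
missing_counts = [int(s.isna().sum()) for s in (floats, nullable, strings)]
print(missing_counts)  # [1, 1, 1]
```

Checking dtypes first is a cheap way to see which marker a column will use before choosing a fill value.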

Detection

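import numpy as np
import pandas as pd

# One numeric column with NaN and one string column with None
df = pd.DataFrame({'a': [1, np.nan, 3], 'b': ['x', None, 'z']})
print(df.isna())        # boolean mask: True where a value is missing
print(df.isna().sum())  # missing count per column (a: 1, b: 1)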

Handling strategies

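  • df.dropna(): remove rows that contain any missing value (pass axis=1 to remove columns instead)
  • df.fillna(0) or df.fillna({'col': median}): fill with a constant, or per column with a value such as a precomputed median
  • df.interpolate(): fill numeric gaps by linear interpolation (the default method)
  • df.ffill() / df.bfill(): propagate the last / next valid value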
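The strategies above can be sketched side by side on one small frame. The column names and values are invented for illustration. Interpolation and forward/backward fill are applied to the numeric column only, since they are not meaningful for the string column here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],                        # complete column
    "temp": [10.0, np.nan, 14.0, np.nan, 20.0],   # numeric gaps
    "city": ["A", None, "C", "D", "E"],           # one string gap
})

dropped_rows = df.dropna()         # keep only rows with no missing value
dropped_cols = df.dropna(axis=1)   # keep only columns with no missing value

filled_zero = df["temp"].fillna(0)                          # constant fill
filled_median = df.fillna({"temp": df["temp"].median()})    # per-column fill

interpolated = df["temp"].interpolate()   # linear between valid neighbours
forward = df["temp"].ffill()              # carry the last valid value forward
backward = df["temp"].bfill()             # pull the next valid value backward

print(dropped_rows.shape, list(dropped_cols.columns))
print(interpolated.tolist())
```

Each call returns a new object and leaves df unchanged, which makes it easy to compare strategies before committing to one.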

Best practice

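Document your missing-data policy. Dropping every row that contains NaN can bias results when values are not missing at random; whether to impute with the mean, median, or mode depends on the domain. Never compare with == np.nan: NaN is not equal to itself, so the comparison is always False. Use isna() instead.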
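A short sketch of why the equality test fails, using an illustrative Series:

```python
import numpy as np
import pandas as pd

# NaN is defined as unequal to everything, including itself.
print(np.nan == np.nan)   # False

s = pd.Series([1.0, np.nan, 3.0])

wrong = s == np.nan   # element-wise comparison: never True
right = s.isna()      # dedicated missing-value check

print(wrong.tolist())  # [False, False, False]
print(right.tolist())  # [False, True, False]
```

The comparison fails silently rather than raising, so a filter built on it quietly matches nothing.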

Important interview questions and answers

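  1. Q: None vs NaN?
    A: None is a Python object; NaN is a floating-point missing marker. isna() detects both, and pandas converts None to NaN when it builds a numeric column.
  2. Q: dropna subset?
    A: Pass a list of column names as subset so that a row is dropped only when one of those key columns is missing.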
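Both answers can be illustrated with a small, made-up orders table. The column names are hypothetical and chosen only to show key versus optional columns.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, None, 4],              # None becomes NaN, column is float64
    "customer": ["ann", None, "cy", "dee"],
    "note": [None, "gift", None, None],       # optional free-text column
})

# Require only the key columns; gaps in "note" are tolerated.
kept = orders.dropna(subset=["order_id", "customer"])

# Without subset, every row here has at least one gap and is dropped.
strict = orders.dropna()

print(orders["order_id"].dtype)   # float64
print(kept.index.tolist())        # [0, 3]
print(len(strict))                # 0
```

Restricting the check to key columns avoids discarding rows over gaps in fields that the analysis never reads.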

Self-check

  1. Count missing values per column.
  2. Fill numeric NaN with column median.
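One possible approach to both exercises, sketched on an invented frame; try them yourself before reading it. Selecting numeric columns with select_dtypes is one choice among several and assumes the median makes sense for every numeric column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [30.0, np.nan, 40.0, 50.0],
    "score": [np.nan, 2.0, np.nan, 4.0],
    "name": ["a", None, "c", "d"],
})

# 1. Count missing values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# 2. Fill numeric NaN with each column's own median; leave other columns alone.
numeric_cols = df.select_dtypes(include="number").columns
filled = df.copy()
filled[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
print(filled)
```

median() skips missing values by default, so the fill value is computed from the observed entries only.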

Tip: Never test x == np.nan; always use isna().

Interview prep

isna?

It is the correct way to detect missing values; never compare with == np.nan.

Imputation policy?

Document whether you drop rows, fill with a constant, or impute with the median; this is a domain decision, not one-size-fits-all.

Discussion

Starter discussion topics

  • isna vs isnull?
  • dropna axis?
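As a starting point for both topics, a minimal sketch on an illustrative frame: isnull() is an alias of isna(), and the axis argument of dropna() selects whether rows or columns are removed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# isnull() is an alias of isna(), so the two masks are identical.
same_mask = df.isna().equals(df.isnull())
print(same_mask)            # True

rows_dropped = df.dropna(axis=0)   # default: drop rows that contain a gap
cols_dropped = df.dropna(axis=1)   # drop columns that contain a gap

print(rows_dropped.shape)           # (2, 2)
print(list(cols_dropped.columns))   # ['b']
```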
