Learn Netverks

Missing data basics

Last reviewed Jun 1, 2026 Content v20260601
Track mode: server_script (server runner)
Reading: ~2 min
Level: beginner

This lesson

This lesson teaches missing data basics: how to spot gaps in a dataset, reason about why they occur, and handle them without undermining evidence-based decisions.

The missingness mechanism (MCAR, MAR, or MNAR) determines whether imputation is safe. Filling values blindly creates false confidence in the results.

You will apply missing data basics to messy CSV exports, API logs, and survey data before any dashboard ships.

Read the narrative, run Python in the playground (stdlib snippets now; install Jupyter, pandas, and scikit-learn locally for full notebooks), and complete MCQs to lock in vocabulary.

Start this lesson when you can explain the previous lesson's ideas in your own words.

Missing values are gaps in your table: empty cells, None in Python, NULL in SQL, or sentinel codes like -1 meaning “unknown.” How you handle them changes model behavior and metrics.
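Before counting gaps, it helps to map every encoding of "unknown" to a single marker. The sketch below uses only the standard library; the `MISSING_MARKERS` set and the `normalize_missing` helper are hypothetical names chosen for illustration, and the real sentinel codes depend on your data source.

```python
# Hypothetical sentinel markers; confirm the real ones with the data owner.
MISSING_MARKERS = {"", "NULL", "NA", "-1"}

def normalize_missing(value):
    """Return None for any value that encodes 'unknown', else the value unchanged."""
    if value is None:
        return None
    if str(value).strip() in MISSING_MARKERS:
        return None
    return value

raw_ages = ["34", "", "NULL", "-1", "51", None]
clean_ages = [normalize_missing(v) for v in raw_ages]
print(clean_ages)  # ['34', None, None, None, '51', None]
```

Treat a code like -1 as missing only in columns where it cannot be a legitimate value; in a temperature column, for example, it may be real data.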

Types of missingness

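  • MCAR: missing completely at random; missingness is unrelated to any value, observed or not (rare in practice)
  • MAR: missing at random; missingness depends only on observed columns
  • MNAR: missing not at random; missingness depends on unobserved factors or on the missing value itself (hardest to handle)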

Example of MNAR: high earners skip income survey questions more often, so the incomes you do observe skew low. Dropping the incomplete rows biases the average downward.
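The income example can be checked with a few made-up numbers. This is a minimal sketch, assuming a deliberately extreme MNAR rule in which everyone earning over 100 (thousand) skips the question; the figures are invented for illustration.

```python
from statistics import mean

# Made-up incomes (in thousands), for illustration only.
true_incomes = [30, 35, 40, 45, 50, 120, 150, 200]

# Assumed MNAR rule: respondents earning over 100 skip the question.
reported = [x if x <= 100 else None for x in true_incomes]

# Dropping rows keeps only the answers that were given.
observed = [x for x in reported if x is not None]

true_mean = mean(true_incomes)   # 670 / 8 = 83.75
observed_mean = mean(observed)   # 200 / 5 = 40
print(true_mean, observed_mean)
```

The observed mean is less than half the true mean, and nothing in the observed column alone reveals the problem.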

Audit missingness first

  1. Count missing per column
  2. Cross-tab missing flags with target or segment
  3. Check if “missing” is informative (create indicator features)
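The three audit steps above can be sketched with the standard library. The rows, column names, and the `income_missing` flag are hypothetical; with pandas installed locally the same audit is usually a few method calls.

```python
from collections import Counter

# Hypothetical rows; None marks a missing value.
rows = [
    {"segment": "free", "age": 25, "income": None},
    {"segment": "free", "age": None, "income": None},
    {"segment": "paid", "age": 41, "income": 72},
    {"segment": "paid", "age": 38, "income": None},
    {"segment": "paid", "age": 52, "income": 95},
]

# Step 1: count missing values per column.
missing_counts = Counter()
for row in rows:
    for column, value in row.items():
        if value is None:
            missing_counts[column] += 1

# Step 2: cross-tab the "income is missing" flag with the segment.
crosstab = Counter((row["segment"], row["income"] is None) for row in rows)

# Step 3: keep the flag as an indicator feature in case missingness is informative.
for row in rows:
    row["income_missing"] = int(row["income"] is None)

print(dict(missing_counts))
print(dict(crosstab))
```

In this toy data every "free" row lacks income while only one of three "paid" rows does, which is the kind of pattern that argues against treating the gaps as MCAR.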

Common strategies (preview)

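  • Drop rows: only if few rows are affected and the missingness looks MCAR
  • Impute: fill with the median (numeric) or mode (categorical), or use a model-based method (advanced)
  • Separate category: add an “unknown” level for categorical columns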

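The cleaning lessons cover the imputation workflow. Never compute imputation values on the full dataset before splitting into train and test sets; otherwise information from the test rows leaks into training.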
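A minimal sketch of that rule, assuming the data is already split and using median imputation with a missing indicator. The `impute` helper and the sample values are hypothetical; scikit-learn offers equivalent tooling for full notebooks.

```python
from statistics import median

# Hypothetical, already-split data; None marks a missing value.
train_ages = [22, None, 35, 41, None, 58]
test_ages = [None, 30, 47]

# Fit: the fill value comes from observed training values only.
train_median = median([x for x in train_ages if x is not None])

def impute(values, fill):
    """Return (filled values, 0/1 missing flags) without mutating the input."""
    filled = [fill if v is None else v for v in values]
    flags = [int(v is None) for v in values]
    return filled, flags

# Apply the same training median to both splits.
train_filled, train_flags = impute(train_ages, train_median)
test_filled, test_flags = impute(test_ages, train_median)

print(train_median)             # 38.0
print(test_filled, test_flags)  # [38.0, 30, 47] [1, 0, 0]
```

The test set never contributes to `train_median`, and the flags preserve the fact that a value was filled in.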

Important interview questions and answers

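  1. Q: Why does MNAR matter?
    A: Under MNAR, missingness depends on the missing values themselves, so imputing without modeling why data are missing can bias conclusions.
  2. Q: What is a missing indicator feature?
    A: A binary column marking which values were missing before imputation. It sometimes improves models when missingness is informative.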

Self-check

  1. What does NULL mean in SQL?
  2. Name two strategies for missing numeric data.
  3. Why audit missingness before imputing?

Tip: Ask why data is missing before filling—MNAR is common.

Interview prep

MCAR?

Missing completely at random: missingness is unrelated to any observed or unobserved value. Rare in practice.

Impute blindly?

No. Understand why values are missing before filling them.

