What is Pandas?

Last reviewed May 28, 2026 Content v20260528

Track mode

server_script

Means

Server runner

Reading

~2 min

Level

beginner

This lesson

This lesson teaches What is Pandas?: Pandas tabular manipulation—indexing, dtypes, reshaping, and analysis habits for real-world tables.

This track orients workflow; NumPy/Pandas tracks teach the tools you will use daily in notebooks.

You will apply What is Pandas? in contexts like: CSV/Parquet analysis, ETL notebooks, and ad hoc reporting.

Read the narrative, run `import pandas as pd` snippets with in-memory DataFrames (install pandas and numpy with pip if needed), inspect `.head()`, `.dtypes`, and complete MCQs.

After /python/intro and /numpy/intro—when you are ready for labeled tables and daily wrangling workflows.

Pandas (Panel Data) provides high-performance, easy-to-use data structures for structured data. The two workhorses are Series (1D labeled array) and DataFrame (2D labeled table).

Core concepts

Series — one column with an index (row labels)
DataFrame — table of Series sharing the same index
Index — row labels; can be integers, strings, or datetimes
Alignment — operations match on labels, not just position
Vectorization — column math delegates to NumPy under the hood

Typical use cases

Exploratory data analysis (EDA) on CSV/Parquet/SQL results
Cleaning, filtering, and aggregating business metrics
Feature engineering before machine learning
Time-series analysis (sales, sensors, logs)

Import convention

import pandas as pd
import numpy as np

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
df = pd.DataFrame({'x': s, 'y': [1, 2, 3]})
print(s)
print(df)

Important interview questions and answers

Q: Series vs DataFrame?
A: Series is 1D with one index; DataFrame is 2D with shared row index and multiple named columns.
Q: Why import as pd?
A: Universal convention across documentation, Stack Overflow, and production codebases.

Self-check

Name the two primary Pandas data structures.
Give one real-world use case for a DataFrame.

Tip: Remember: one bracket df['col'] → Series; two brackets → DataFrame.

Interview prep

Series vs DataFrame?: Series is one column with index; DataFrame is multiple aligned columns.
Alignment?: Operations match on index labels—can introduce NaN where labels differ.

Playground

Runs on the configured server runner (dev: npm run runner with LEARNING_RUNNER_ENABLED=true). Output appears below the editor.

Code runner not available

Server runner is disabled. Set LEARNING_RUNNER_ENABLED=true and LEARNING_RUNNER_URL in .env (see .env.example).

Discussion

Past discussion is visible to everyone. Only logged-in users can post comments and replies.

Starter discussion topics

Labeled axes?
Homogeneous columns?

No discussion yet. Be the first to ask a question.