A repeatable SciPy workflow: define the problem (statistical test? fit? solve?) → prepare NumPy arrays (units, NaNs removed) → call the right submodule → interpret outputs (p-values, coefficients, residuals) → document assumptions.
Problem → submodule map
- Compare two groups → stats.ttest_ind or nonparametric equivalents
- Fit a curve → optimize.curve_fit
- Solve Ax = b → linalg.solve or sparse solvers
- Integrate ODE → integrate.solve_ivp
- Filter noise → signal design + filtfilt
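The map above can be sketched with one minimal call per row. The data, the `line` model, the 2×2 system, the ODE dy/dt = -y, and the Butterworth settings are all made up for illustration; they are assumptions, not part of the lesson material. The two-group comparison is left out here because the lesson's own snippet covers it.

```python
import numpy as np
from scipy import integrate, linalg, optimize, signal

# Fit a curve: recover slope and intercept from noiseless, made-up data.
def line(x, slope, intercept):  # hypothetical model function
    return slope * x + intercept

x = np.linspace(0.0, 1.0, 20)
params, _ = optimize.curve_fit(line, x, 3.0 * x + 1.0)

# Solve Ax = b for a small dense system.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

# Integrate an ODE: dy/dt = -y with y(0) = 1 over t in [0, 1].
ode = integrate.solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0],
                          rtol=1e-8, atol=1e-10)

# Filter noise: design a low-pass Butterworth filter, then apply it
# forward and backward with filtfilt so the output has zero phase shift.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
clean = np.sin(2 * np.pi * 2 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)
b_coef, a_coef = signal.butter(4, 0.1)  # order and cutoff are arbitrary choices
smooth = signal.filtfilt(b_coef, a_coef, noisy)

print('fit params:', params)
print('linear solution:', solution)
print('y(1):', ode.y[0, -1])
print('filtered shape:', smooth.shape)
```

Each call returns a different kind of object (a parameter array plus covariance, a solution vector, a result bundle with `t` and `y`, a filtered array), which is why the "interpret outputs" step depends on the submodule.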
Prepare data first
Remove or impute NaNs in Pandas before export. Check sample sizes, independence assumptions, and measurement units. Wrong inputs produce valid-looking but meaningless p-values.
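As a minimal sketch of the NaN step, the snippet below uses hypothetical measurements with missing readings and filters them in NumPy. In Pandas the equivalent would typically be `dropna()` before converting to an array. The array values are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements; NaN marks a missing reading.
raw_a = np.array([2.1, np.nan, 2.3, 2.0, 2.4])
raw_b = np.array([2.8, 2.9, np.nan, 3.1, 2.7])

# Drop NaNs explicitly so the remaining sample sizes are visible.
clean_a = raw_a[~np.isnan(raw_a)]
clean_b = raw_b[~np.isnan(raw_b)]
print('n_a:', clean_a.size, 'n_b:', clean_b.size)

# With the default nan_policy='propagate', a NaN input gives a NaN result.
dirty = stats.ttest_ind(raw_a, raw_b)
clean = stats.ttest_ind(clean_a, clean_b)
print('with NaNs:', dirty.pvalue)
print('cleaned:', clean.pvalue)
```

`ttest_ind` also accepts `nan_policy='omit'`, but dropping values yourself keeps the sample sizes explicit, which matters when you document assumptions.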
Inspect results
import numpy as np
from scipy import stats
group_a = np.array([2.1, 2.3, 2.0, 2.4])
group_b = np.array([2.8, 2.9, 3.1, 2.7])
result = stats.ttest_ind(group_a, group_b)
print('statistic:', result.statistic)
print('pvalue:', result.pvalue)
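One way to carry the interpretation step further is to compare the p-value against a threshold and cross-check with tests that assume less. The sketch below reuses the same two groups; the threshold `alpha = 0.05` and the choice of Welch and Mann-Whitney as cross-checks are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

group_a = np.array([2.1, 2.3, 2.0, 2.4])
group_b = np.array([2.8, 2.9, 3.1, 2.7])

alpha = 0.05  # significance threshold assumed for this sketch

# Welch's t-test drops the equal-variance assumption of the default test.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# Mann-Whitney U is a rank-based alternative that does not assume normality.
mwu = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')

for name, res in [('welch', welch), ('mann-whitney', mwu)]:
    verdict = 'reject' if res.pvalue < alpha else 'fail to reject'
    print(f'{name}: statistic={res.statistic:.3f}, '
          f'p={res.pvalue:.4f} -> {verdict} the null')
```

With only four observations per group, agreement between the tests is reassuring but not proof; the small sample size belongs in the documented assumptions.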
Next steps in this track
Modules 02–05 cover stats, optimization, linear algebra, and signal/integration. Module 06 previews Pandas/sklearn/engineering handoffs; module 07 prepares interviews and production habits before DSA and AI.
Important interview questions and answers
- Q: Why document assumptions?
A: Tests and optimizers assume conditions (normality, convexity); violations can invalidate conclusions.
- Q: p-value interpretation?
A: Probability of observing data at least this extreme if the null hypothesis is true, not P(null is true).
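The p-value definition can be checked with a small simulation: if both groups come from the same distribution, the null is true by construction, and roughly 5% of tests should still give p < 0.05. The seed, trial count, and sample size below are arbitrary choices for this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_trials = 2000

# Draw both groups from the same normal distribution, so the null is true.
a = rng.normal(loc=0.0, scale=1.0, size=(n_trials, 20))
b = rng.normal(loc=0.0, scale=1.0, size=(n_trials, 20))

# axis=1 runs one independent two-sample t-test per row.
pvalues = stats.ttest_ind(a, b, axis=1).pvalue

false_positive_rate = np.mean(pvalues < 0.05)
print('fraction of p < 0.05 under a true null:', false_positive_rate)
```

A small p-value therefore says the data would be unusual under the null; it does not give the probability that the null itself is true.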
Self-check
- List the five steps in the SciPy workflow.
- Which submodule handles two-sample t-tests?
Challenge
Trace one SciPy call
- Run the workflow lesson code.
- Write which submodule answers your question (stats, optimize, linalg).
Done when: you can map a problem statement to the right SciPy submodule.
Interview prep
- Workflow steps?
Define problem → prepare ndarray → call submodule → interpret → document assumptions.
- p-value?
Probability of data at least this extreme if the null is true, not P(null is true).