Hypothesis tests answer one question: how likely is a pattern at least this strong under a null model? SciPy's test functions return a test statistic and a p-value. Always pair these with an effect size and with the domain context covered in Data Science ethics.
Null and alternative
- Null (H₀) — default claim (e.g. two groups have equal means)
- Alternative (H₁) — what you seek evidence for
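- α — significance threshold (often 0.05); a convention, not a magic line
- p-value — probability, assuming H₀ and the model assumptions hold, of a test statistic at least as extreme as the one observed; small values count as evidence against H₀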
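To make the roles of α and the p-value concrete, here is a minimal simulation sketch. It assumes normally distributed data, and the group size and number of repeats are arbitrary illustrative choices. Both groups are drawn from the same distribution, so H₀ is true by construction, and the share of p-values below α should land close to α.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
alpha = 0.05
n_sims, n_per_group = 2000, 20  # illustrative sizes, not from the text

# Both groups come from the same normal distribution, so H0 is true.
x = rng.normal(loc=0.0, scale=1.0, size=(n_sims, n_per_group))
y = rng.normal(loc=0.0, scale=1.0, size=(n_sims, n_per_group))

# One Welch t-test per row: axis=1 compares each simulated pair of samples.
res = stats.ttest_ind(x, y, axis=1, equal_var=False)

# Fraction of simulated experiments that would be called "significant".
false_positive_rate = np.mean(res.pvalue < alpha)
print(f"share of p < {alpha} under H0: {false_positive_rate:.3f}")
```

The printed share is the false positive rate: even when nothing is going on, roughly α of the tests cross the threshold.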
Common tests
- ttest_ind — two independent samples (Welch variant for unequal variance)
- ttest_rel — paired measurements
- mannwhitneyu — nonparametric two-sample
- chi2_contingency — categorical association
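The other three tests in the list can be called in much the same way as ttest_ind. The sketch below is illustrative only: every number in it is made up, and the variable names are placeholders rather than anything from the lesson.

```python
import numpy as np
from scipy import stats

# Paired measurements: the same five subjects before and after (made-up values).
before = np.array([72.0, 68.5, 75.1, 70.3, 69.8])
after = np.array([70.2, 67.9, 73.0, 69.5, 68.1])
paired = stats.ttest_rel(before, after)

# Nonparametric two-sample comparison (made-up values, one outlier in g2).
g1 = np.array([3.1, 2.8, 3.5, 3.0, 2.9])
g2 = np.array([3.8, 4.2, 3.9, 9.5, 4.0])
mw = stats.mannwhitneyu(g1, g2, alternative="two-sided")

# Categorical association: rows are groups, columns are outcome counts (made up).
table = np.array([[30, 10],
                  [18, 22]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(f"ttest_rel:        statistic={paired.statistic:.3f}, p={paired.pvalue:.4f}")
print(f"mannwhitneyu:     statistic={mw.statistic:.1f}, p={mw.pvalue:.4f}")
print(f"chi2_contingency: chi2={chi2:.3f}, p={p_chi2:.4f}, dof={dof}")
```

chi2_contingency also returns the expected counts under independence, which is worth inspecting because the chi-square approximation becomes unreliable when those counts are small.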
Two-sample example
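import numpy as np
from scipy import stats

# Two small independent samples
a = np.array([1.2, 1.5, 1.1, 1.4])
b = np.array([1.8, 1.9, 2.0, 1.7])

# equal_var=False selects Welch's t-test (no equal-variance assumption)
t = stats.ttest_ind(a, b, equal_var=False)
print(t)  # result object carrying the statistic and the p-value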
Important interview questions and answers
- Q: What does p-value = 0.03 mean?
A: If H₀ were true, about 3% of repeated experiments would show a statistic at least this extreme. It is not a 3% chance that H₀ is true.
- Q: When should you use a nonparametric test?
A: With skewed data, outliers, or ordinal scales; use Mann-Whitney instead of a t-test.
Self-check
- Define null hypothesis in one sentence.
- When should you use Welch's t-test (equal_var=False)?
Tip: Always report the effect size and the sample size n, not only the p-value from ttest_ind.
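One common effect size for a two-group comparison is Cohen's d, the mean difference divided by the pooled standard deviation. The sketch below applies it to the two arrays from the two-sample example. The helper name cohens_d is hypothetical (it is not a SciPy function), and Cohen's d is only one reasonable choice of effect size.

```python
import numpy as np

a = np.array([1.2, 1.5, 1.1, 1.4])
b = np.array([1.8, 1.9, 2.0, 1.7])


def cohens_d(x, y):
    """Standardized mean difference (y minus x) using the pooled SD.

    Hypothetical helper written for illustration; not part of SciPy.
    """
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)
    ) / (nx + ny - 2)
    return (np.mean(y) - np.mean(x)) / np.sqrt(pooled_var)


d = cohens_d(a, b)
print(f"n_a={len(a)}, n_b={len(b)}, Cohen's d={d:.2f}")
```

With only four observations per group the estimate of d is itself very noisy, which is one reason to report n alongside it.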
Interview prep
- Welch t-test?
Use ttest_ind with equal_var=False when the two groups may have unequal variances.
- Paired data?
Use ttest_rel when the same subject is measured twice.
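For the Welch question, a short comparison can show why the equal_var flag matters. This is a sketch under assumed conditions: simulated normal data, with a small high-variance group and a large low-variance group. The sizes, scales, and seed are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable

# Same true mean, but very different spread and sample size (illustrative).
small_noisy = rng.normal(loc=10.0, scale=5.0, size=8)
large_tight = rng.normal(loc=10.0, scale=1.0, size=40)

# Student's t-test pools the variances; Welch's t-test does not.
pooled = stats.ttest_ind(small_noisy, large_tight, equal_var=True)
welch = stats.ttest_ind(small_noisy, large_tight, equal_var=False)

print(f"pooled: statistic={pooled.statistic:.3f}, p={pooled.pvalue:.4f}")
print(f"Welch:  statistic={welch.statistic:.3f}, p={welch.pvalue:.4f}")
```

When the smaller group is also the noisier one, the pooled standard error understates the uncertainty, so the two tests can disagree noticeably; Welch's version is the safer default when the variances are not known to be equal.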