Beyond np.mean, stats.describe returns the observation count, min/max, mean, variance, skewness, and kurtosis in one call, which is useful for quick EDA on ndarray exports from Pandas.
Key functions
- stats.describe: nobs, minmax, mean, variance, skewness, kurtosis
- stats.tstd, stats.tvar: sample std/var (ddof=1 default)
- stats.sem: standard error of the mean
- stats.iqr: robust spread measure
- stats.zscore: standardize for comparison
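As a rough illustration of the functions listed above, the sketch below applies each one to a small made-up array (the values are illustrative, not real data). It assumes default arguments throughout; note that stats.zscore defaults to ddof=0, unlike stats.tstd and stats.tvar.

```python
import numpy as np
from scipy import stats

# Illustrative values only, not real data
x = np.array([10, 12, 11, 15, 9, 14, 13])

sample_std = stats.tstd(x)  # sample standard deviation (ddof=1 by default)
sample_var = stats.tvar(x)  # sample variance (ddof=1 by default)
std_err = stats.sem(x)      # standard error of the mean: sample_std / sqrt(n)
spread = stats.iqr(x)       # interquartile range: 75th minus 25th percentile
z = stats.zscore(x)         # standardized values (uses ddof=0 by default)

print("std:", sample_std, "var:", sample_var)
print("sem:", std_err, "iqr:", spread)
print("z-scores:", z)
```

After standardizing, z has mean 0 and (population) standard deviation 1, which is what makes values from different scales comparable.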
Pandas describe vs stats.describe
Pandas df.describe() is tabular and column-wise. stats.describe targets one numeric array—ideal after series.to_numpy().
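To make the contrast concrete, here is a minimal sketch that runs both calls on the same data. The DataFrame, its column names, and its values are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical two-column table, for illustration only
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "weight": [50.0, 60.0, 65.0, 85.0],
})

# Pandas: one summary table covering every numeric column
table = df.describe()
print(table)

# SciPy: summary of a single array taken from one column
heights = df["height"].to_numpy()
d = stats.describe(heights)
print(d.nobs, d.minmax, d.mean, d.variance, d.skewness, d.kurtosis)
```

Both summaries use ddof=1 by default, so the square of the "std" row from df.describe() should match d.variance for the same column.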
Example
import numpy as np
from scipy import stats
# Small sample array to summarize
x = np.array([10, 12, 11, 15, 9, 14, 13])
# describe returns a result object with named fields
d = stats.describe(x)
print('nobs:', d.nobs, 'mean:', d.mean, 'var:', d.variance)
Important interview questions and answers
- Q: skewness meaning?
A: Asymmetry: positive skew = long right tail; affects choice of mean vs median.
- Q: zscore use?
A: Compare values on different scales; watch outliers—zscore is sensitive.
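Both answers can be checked on a toy example. The sketch below uses two made-up arrays: one symmetric, and one with a single large value forming a right tail. The array contents are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

# Made-up data: symmetric values vs. the same values with one large outlier
symmetric = np.array([1, 2, 3, 4, 5])
right_tailed = np.array([1, 2, 3, 4, 50])

skew_sym = stats.skew(symmetric)       # zero for perfectly symmetric data
skew_right = stats.skew(right_tailed)  # positive: long right tail

# The outlier drags the mean upward while the median stays put
mean_right = right_tailed.mean()
median_right = np.median(right_tailed)

# The outlier also inflates the std, squeezing the other z-scores together
z = stats.zscore(right_tailed)

print("skew:", skew_sym, skew_right)
print("mean:", mean_right, "median:", median_right)
print("z-scores:", z)
```

In the right-tailed array the mean lands well above four of the five values, which is why the median is often the safer summary for skewed data. The z-scores of the four small values end up close together because the outlier dominates the standard deviation.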
Self-check
- What six fields does stats.describe return?
- When would you prefer median and IQR over mean and std?
Tip: Pair stats.describe with a histogram, at least mentally; a skewness value without a plot can mislead.
Interview prep
- describe?
nobs, minmax, mean, variance, skewness, kurtosis in one call.
- zscore?
Standardize for comparison—sensitive to outliers.