STATISTICS CHEATSHEET

Comprehensive review guide with Python code examples

Topics: Shapiro-Wilk · Levene · T-Test · Z-Score · ANOVA · Chi-Square · A/B Test · Regression · Bayesian · Power Analysis

πŸ“‘ Table of Contents

πŸ“ 1. Descriptive Statistics

Measures of Central Tendency

Measure | Formula               | When?
Mean    | x̄ = Σxᵢ / n           | Symmetric distributions
Median  | Middle of sorted data | Skewed data, outliers
Mode    | Most frequent value   | Categorical data

Spread & Shape

Sample VariancesΒ² = Ξ£(xα΅’ - xΜ„)Β² / (n - 1)
Standard Deviations = √s²
Measure= 0> 0< 0
SkewnessSymmetricRight-skewedLeft-skewed
KurtosisNormalHeavy-tailedLight-tailed
Python:
import numpy as np
from scipy import stats

data = [23, 45, 12, 67, 34, 89, 56, 78, 90, 43]
print(f"Mean:     {np.mean(data):.2f}")
print(f"Median:   {np.median(data):.2f}")
print(f"Std Dev:  {np.std(data, ddof=1):.2f}")
print(f"Skewness: {stats.skew(data):.4f}")
print(f"Kurtosis: {stats.kurtosis(data):.4f}")

πŸ”” 2. Probability Distributions

Normal Distribution

Central Limit Theorem: For n ≥ 30, the distribution of sample means is approximately normal, regardless of the population's shape.

68-95-99.7 Rule: 68% → μ ± 1σ    95% → μ ± 2σ    99.7% → μ ± 3σ
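These percentages can be verified numerically with scipy's normal CDF (a quick sketch):

```python
from scipy.stats import norm

# Probability within k standard deviations of the mean
for k in (1, 2, 3):
    p = norm.cdf(k) - norm.cdf(-k)
    print(f"μ ± {k}σ → {p:.4f}")   # 0.6827, 0.9545, 0.9973
```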

Binomial Distribution

PMF: P(X = k) = C(n,k) · p^k · (1-p)^(n-k)

Poisson Distribution

PMF: P(X = k) = (λ^k · e^(-λ)) / k!
Python:
from scipy.stats import norm, binom, poisson

print(f"P(Z < 1.96) = {norm.cdf(1.96):.4f}")
print(f"Binom(10,0.5) P(X=6) = {binom.pmf(6, 10, 0.5):.4f}")
print(f"Poisson(3) P(X=5) = {poisson.pmf(5, 3):.4f}")

πŸ“Š 3. Z-Score

How many standard deviations a value is from the mean.

Single Value: z = (x - μ) / σ
Sample Mean:  z = (x̄ - μ₀) / (σ / √n)

z     | One-tailed P | Two-tailed P
1.645 | 0.050        | 0.100
1.960 | 0.025        | 0.050
2.576 | 0.005        | 0.010
Python:
from scipy.stats import norm

x, mu, sigma = 85, 70, 10
z = (x - mu) / sigma
print(f"z = {z:.2f}")
print(f"P(Z < {z})  = {norm.cdf(z):.4f}")
print(f"P(Z > {z})  = {1 - norm.cdf(z):.4f}")
print(f"Two-tailed  = {2*(1 - norm.cdf(abs(z))):.4f}")
print(f"Critical z (Ξ±=0.05) = {norm.ppf(0.975):.4f}")

🎯 4. Confidence Intervals

Οƒ known or n β‰₯ 30CI = xΜ„ Β± z* Β· (Οƒ / √n)
Οƒ unknown and n < 30CI = xΜ„ Β± t* Β· (s / √n)
Levelz*
90%1.645
95%1.960
99%2.576
Python:
import numpy as np
from scipy import stats

data = np.random.normal(100, 15, size=50)
se = stats.sem(data)
ci = stats.t.interval(0.95, df=len(data)-1, loc=np.mean(data), scale=se)
print(f"Mean: {np.mean(data):.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
Interpretation: If we repeated this procedure many times, about 95% of the resulting intervals would contain the true parameter. A single computed interval either does or does not contain it.
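This interpretation can be checked by simulation: draw many samples from a known population and count how often the 95% interval captures the true mean (a sketch):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mu, covered, trials = 100, 0, 1000
for _ in range(trials):
    sample = rng.normal(true_mu, 15, size=50)
    ci = stats.t.interval(0.95, df=len(sample) - 1,
                          loc=np.mean(sample), scale=stats.sem(sample))
    covered += ci[0] <= true_mu <= ci[1]   # did this interval capture μ?
print(f"Coverage: {covered / trials:.1%}")   # close to 95%
```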

βš–οΈ 5. Hypothesis Testing

Steps

  1. Hβ‚€: No effect / no difference
  2. H₁: Effect exists
  3. Set Ξ±: Usually 0.05
  4. Compute test statistic
  5. Find p-value
  6. Decision: p < Ξ± β†’ Reject Hβ‚€
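The six steps above, run end to end with a one-sample t-test on simulated data (a sketch; the numbers are illustrative):

```python
import numpy as np
from scipy.stats import ttest_1samp

# 1-2. H0: μ = 100  vs  H1: μ ≠ 100
# 3.   α = 0.05
alpha = 0.05
rng = np.random.default_rng(0)
sample = rng.normal(104, 10, size=40)   # population actually centered at 104

# 4-5. Compute test statistic and p-value
t, p = ttest_1samp(sample, popmean=100)

# 6. Decision
print(f"t={t:.3f}, p={p:.4f}")
print("Reject H0" if p < alpha else "Fail to reject H0")
```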

Error Types

Hβ‚€ TrueHβ‚€ False
Reject Hβ‚€βŒ Type I (Ξ±)βœ… Correct (Power)
Fail to Rejectβœ… Correct❌ Type II (Ξ²)

πŸ“ˆ 6. Normality Tests

Shapiro-Wilk

Most reliable normality test (n < 5000). Hβ‚€: Data is normal. p > 0.05 β†’ Normal βœ…

Python:
from scipy.stats import shapiro
import numpy as np

normal_data = np.random.normal(50, 10, 100)
skewed_data = np.random.exponential(5, 100)

for name, d in [("normal", normal_data), ("skewed", skewed_data)]:
    stat, p = shapiro(d)
    verdict = "Normal ✅" if p > 0.05 else "Not normal ❌"
    print(f"Shapiro ({name}) → W={stat:.4f}, p={p:.4f} → {verdict}")
Practical: For n > 30, the CLT makes parametric tests fairly robust to mild non-normality.

βš–οΈ 7. Variance Homogeneity β€” Levene

Tests if groups have equal variances. Prerequisite for t-test and ANOVA.

Hβ‚€: σ₁² = Οƒβ‚‚Β²  |  p > 0.05 β†’ Homogeneous βœ…

Test     | Advantage                | Disadvantage
Levene   | Doesn't assume normality | Slightly less powerful
Bartlett | Powerful under normality | Sensitive to normality violations
Python:
from scipy.stats import levene
import numpy as np

group_a = np.random.normal(50, 10, 50)
group_c = np.random.normal(48, 25, 50)

stat, p = levene(group_a, group_c)
print(f"Levene W={stat:.4f}, p={p:.4f}")
print("Homogeneous βœ…" if p > 0.05 else "Heterogeneous ❌")
If not homogeneous: use Welch's t-test (equal_var=False); for 3+ groups, use Welch's ANOVA or Kruskal-Wallis instead of classic ANOVA.

πŸ”¬ 8. T-Test

One-Sample T-Test

Formula: t = (x̄ - μ₀) / (s / √n),   df = n - 1
Python:
from scipy.stats import ttest_1samp
scores = [78, 82, 85, 90, 74, 88, 92, 79, 83, 87]
t, p = ttest_1samp(scores, popmean=80)
print(f"t={t:.4f}, p={p:.4f} β†’ {'Reject' if p<0.05 else 'Fail to reject'}")

Independent Two-Sample

Prerequisites: β‘  Normality β‘‘ Variance homogeneity β‘’ Independence

Python:
from scipy.stats import ttest_ind, levene
import numpy as np

drug = np.random.normal(120, 15, 30)
placebo = np.random.normal(130, 15, 30)

_, p_lev = levene(drug, placebo)
t, p = ttest_ind(drug, placebo, equal_var=(p_lev > 0.05))
print(f"t={t:.4f}, p={p:.4f}")
print("Significant βœ…" if p < 0.05 else "Not significant ❌")

Paired T-Test

Python:
from scipy.stats import ttest_rel
before  = [120, 135, 128, 140, 132, 145, 138, 130, 142, 136]
after   = [115, 125, 122, 130, 128, 135, 130, 120, 132, 128]
t, p = ttest_rel(before, after)
print(f"t={t:.4f}, p={p:.4f} β†’ {'Effective βœ…' if p<0.05 else 'Ineffective ❌'}")

πŸ“ 9. Z-Test

Large-sample (n ≥ 30) alternative to the t-test, used when σ is known.

Formula: z = (x̄ - μ₀) / (σ / √n)
Python:
import numpy as np
from scipy.stats import norm

measurements = np.random.normal(503, 10, 50)
z = (np.mean(measurements) - 500) / (10 / np.sqrt(50))
p = 2 * (1 - norm.cdf(abs(z)))
print(f"z={z:.4f}, p={p:.4f}")
Feature      | Z-Test       | T-Test
σ known?     | Yes          | No
Sample size  | n ≥ 30       | Any
Distribution | Normal (CLT) | t distribution

πŸ“Š 10. ANOVA

One-Way ANOVA

Compares 3+ group means. Hβ‚€: μ₁ = ΞΌβ‚‚ = ... = ΞΌβ‚–

F Statistic: F = MSB / MSW = (Between-group variance) / (Within-group variance)
Python:
from scipy.stats import f_oneway
import numpy as np

a = np.random.normal(75, 8, 30)
b = np.random.normal(80, 8, 30)
c = np.random.normal(78, 8, 30)

F, p = f_oneway(a, b, c)
print(f"F={F:.4f}, p={p:.4f}")

# Post-hoc (if ANOVA is significant): which specific pairs differ?
from statsmodels.stats.multicomp import pairwise_tukeyhsd
data = np.concatenate([a, b, c])
groups = ['A']*30 + ['B']*30 + ['C']*30
print(pairwise_tukeyhsd(data, groups, alpha=0.05))

🎲 11. Chi-Square Test

Test of Independence

Is there a relationship between two categorical variables?

Python:
from scipy.stats import chi2_contingency, chisquare
import numpy as np

table = np.array([[50, 30], [20, 100]])
chi2, p, df, expected = chi2_contingency(table)
print(f"χ²={chi2:.4f}, p={p:.6f} β†’ {'Related' if p<0.05 else 'Independent'}")

# Goodness of fit
observed = [18, 22, 20, 25, 15]
chi2, p = chisquare(observed, f_exp=[20]*5)
print(f"χ²={chi2:.4f}, p={p:.4f} β†’ {'Biased' if p<0.05 else 'Fair'}")

πŸ“‰ 12. Correlation & Regression

|r|         | Interpretation
0.00 – 0.29 | Weak
0.30 – 0.69 | Moderate
0.70 – 1.00 | Strong

Simple Linear Regression: ŷ = β₀ + β₁·x    R² = Explained / Total variance
Python:
from scipy.stats import pearsonr, spearmanr
from sklearn.linear_model import LinearRegression
import numpy as np

x = np.random.uniform(1, 10, 50)
y = 50 + 4*x + np.random.normal(0, 5, 50)

r, p = pearsonr(x, y)
rho, p2 = spearmanr(x, y)
print(f"Pearson r={r:.4f}, Spearman ρ={rho:.4f}")

model = LinearRegression().fit(x.reshape(-1,1), y)
print(f"Ε· = {model.intercept_:.2f} + {model.coef_[0]:.2f}Β·x")
print(f"RΒ² = {model.score(x.reshape(-1,1), y):.4f}")

πŸ”„ 13. Non-Parametric Tests

Parametric    | Non-Parametric | Scenario
Independent t | Mann-Whitney U | 2 independent groups
Paired t      | Wilcoxon       | 2 dependent groups
One-way ANOVA | Kruskal-Wallis | 3+ independent groups
Python:
from scipy.stats import mannwhitneyu, wilcoxon
import numpy as np

g1 = np.random.exponential(5, 30)
g2 = np.random.exponential(8, 30)

U, p = mannwhitneyu(g1, g2, alternative='two-sided')
print(f"Mann-Whitney U={U:.0f}, p={p:.4f}")

before = [85, 90, 78, 92, 88, 76, 95, 80, 83, 89]
after  = [90, 95, 82, 96, 92, 82, 98, 86, 88, 93]
W, p2 = wilcoxon(before, after)
print(f"Wilcoxon W={W:.0f}, p={p2:.4f}")
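Kruskal-Wallis, the non-parametric ANOVA analog, follows the same pattern for 3+ independent groups (a sketch on skewed data):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(7)
g1 = rng.exponential(5, 30)   # skewed groups, so ANOVA's normality
g2 = rng.exponential(8, 30)   # assumption would be questionable
g3 = rng.exponential(5, 30)

H, p = kruskal(g1, g2, g3)
print(f"Kruskal-Wallis H={H:.4f}, p={p:.4f}")
```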

πŸ“ 14. Effect Size

p-value: "Is there a difference?" β†’ Effect size: "How big?"

Cohen's d: d = (x̄₁ - x̄₂) / s_pooled    (0.2 = Small, 0.5 = Medium, 0.8 = Large)
Eta-squared (ANOVA): η² = SS_between / SS_total    (0.01 = Small, 0.06 = Medium, 0.14 = Large)
Python:
import numpy as np
def cohens_d(g1, g2):
    n1, n2 = len(g1), len(g2)
    pooled = np.sqrt(((n1-1)*np.var(g1,ddof=1)+(n2-1)*np.var(g2,ddof=1))/(n1+n2-2))
    return (np.mean(g1) - np.mean(g2)) / pooled

d = cohens_d(np.random.normal(120,15,30), np.random.normal(130,15,30))
print(f"Cohen's d = {d:.4f}")
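η² can likewise be computed directly from the ANOVA sums of squares; a minimal sketch using a hypothetical eta_squared helper:

```python
import numpy as np

def eta_squared(*groups):
    """SS_between / SS_total, computed from raw group data."""
    all_data = np.concatenate(groups)
    grand_mean = all_data.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = ((all_data - grand_mean) ** 2).sum()
    return ss_between / ss_total

rng = np.random.default_rng(3)
a, b, c = (rng.normal(m, 8, 30) for m in (75, 80, 78))
print(f"eta^2 = {eta_squared(a, b, c):.4f}")
```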

⚑ 15. Power Analysis

Done BEFORE collecting data. Four components; fix any three and solve for the fourth:

  1. Effect size (d)
  2. Ξ± (0.05)
  3. Power (0.80)
  4. Sample size (n)
Python:
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in [0.2, 0.5, 0.8]:
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"d={d} β†’ n={n:.0f} (per group)")

πŸ§ͺ 16. A/B Testing

Workflow

  1. State hypothesis
  2. Define metric (conversion, CTR, revenue)
  3. Calculate sample size
  4. Run experiment
  5. Evaluate results
Python:
from statsmodels.stats.proportion import proportions_ztest, proportion_effectsize
from statsmodels.stats.power import NormalIndPower

# Control: 120/1000 conversions, Test: 145/1000
# H1: control rate < test rate (one-sided)
z, p = proportions_ztest([120, 145], [1000, 1000], alternative='smaller')
print(f"z={z:.4f}, p={p:.4f}")
if p < 0.05:
    lift = (145/1000 - 120/1000) / (120/1000) * 100
    print(f"βœ… Lift: +{lift:.1f}%")

effect = proportion_effectsize(0.10, 0.12)
n = NormalIndPower().solve_power(effect, alpha=0.05, power=0.80)
print(f"MDE 10%β†’12%: n = {n:.0f} per group")

Common Pitfalls

Pitfall           | Solution
Peeking           | Pre-determine n, wait
Multiple testing  | Bonferroni: α_new = α/k
Simpson's paradox | Segment analysis
Novelty effect    | Wait 2+ weeks
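The Bonferroni fix for multiple testing can be applied with statsmodels' multipletests (a sketch with made-up p-values):

```python
from statsmodels.stats.multitest import multipletests

pvals = [0.01, 0.04, 0.03, 0.20]   # hypothetical raw p-values from k = 4 tests
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method='bonferroni')
for p, pa, r in zip(pvals, p_adj, reject):
    print(f"raw p={p:.2f} → adjusted p={pa:.2f} → {'reject H0' if r else 'keep H0'}")
```

With k = 4, only the first test survives the correction (0.01 · 4 = 0.04 < 0.05).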

🧠 17. Bayesian Basics

Bayes' Theorem: P(A|B) = P(B|A) · P(A) / P(B)  →  Posterior = Likelihood × Prior / Evidence
Python:
# Medical test: 99% sensitivity and 99% specificity, 1% prevalence
p_sick = 0.01
p_pos = 0.99 * p_sick + 0.01 * (1 - p_sick)  # total probability of a positive test
posterior = (0.99 * p_sick) / p_pos          # Bayes: P(sick | positive)
print(f"P(Sick | Positive) = {posterior:.2%}")
print("β†’ Even 99% accurate test misleads with rare diseases!")

Frequentist

  • Probability = long-run frequency
  • Fixed parameter
  • p-value, confidence interval

Bayesian

  • Probability = degree of belief
  • Random parameter
  • Posterior, credible interval
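A minimal Bayesian example with a conjugate prior: a uniform Beta(1, 1) prior on a conversion rate, updated with 120 successes in 1000 trials, gives a Beta posterior and a credible interval (a sketch):

```python
from scipy.stats import beta

successes, n = 120, 1000
a, b = 1 + successes, 1 + (n - successes)   # Beta(1,1) prior + Binomial likelihood
posterior = beta(a, b)
lo, hi = posterior.interval(0.95)
print(f"Posterior mean: {posterior.mean():.4f}")
print(f"95% credible interval: ({lo:.4f}, {hi:.4f})")
```

Unlike a confidence interval, this credible interval can be read directly as "the rate lies here with 95% probability (given the prior)."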

πŸ—ΊοΈ 18. Which Test to Use?

WHAT IS YOUR DATA TYPE?
│
├── Numerical (Continuous)
│   ├── 1 Group → One-sample t-test
│   ├── 2 Groups
│   │   ├── Independent → Normal? → Yes: t-test | No: Mann-Whitney U
│   │   └── Dependent   → Normal? → Yes: Paired t | No: Wilcoxon
│   └── 3+ Groups
│       ├── Independent → Normal? → Yes: ANOVA | No: Kruskal-Wallis
│       └── Dependent   → Repeated Measures ANOVA / Friedman
│
├── Categorical
│   ├── One variable  → Chi-Square Goodness of Fit
│   └── Two variables → Chi-Square Independence
│
└── Relationship
    ├── Linear?     → Normal? → Pearson r | Spearman ρ
    └── Prediction? → Regression (Simple / Multiple)

Quick Checklist

# | Step                 | Tool
1 | Identify data type   | df.dtypes
2 | Explore distribution | Histogram, QQ-plot
3 | Test normality       | shapiro()
4 | Check variance       | levene()
5 | Apply test           | Decision tree
6 | Effect size          | Cohen's d, η²
7 | Report               | p + effect + CI
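The checklist, run end to end on two simulated groups (a sketch; assumes the data is already known to be numeric):

```python
import numpy as np
from scipy.stats import shapiro, levene, ttest_ind, mannwhitneyu

rng = np.random.default_rng(5)
a = rng.normal(50, 10, 40)
b = rng.normal(55, 10, 40)

# 2-3. Normality check on each group
normal = all(shapiro(g).pvalue > 0.05 for g in (a, b))
# 4. Variance homogeneity
equal_var = levene(a, b).pvalue > 0.05

# 5. Pick the test from the decision tree
if normal:
    stat, p = ttest_ind(a, b, equal_var=equal_var)
else:
    stat, p = mannwhitneyu(a, b, alternative='two-sided')

# 6. Effect size (Cohen's d with pooled SD)
pooled = np.sqrt(((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                 / (len(a) + len(b) - 2))
d = (a.mean() - b.mean()) / pooled

# 7. Report p-value and effect size together
print(f"p={p:.4f}, Cohen's d={d:.2f}")
```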
Golden Rule: p-value ALONE is not enough. Always report with effect size and confidence intervals!