STATISTICS FORMULA REFERENCE
A comprehensive review guide – theory and formulas only

Topics: Shapiro-Wilk · Levene · t-test · z-score · ANOVA · Chi-square · A/B testing · Regression · Bayesian · Power analysis
1. Descriptive Statistics

Measures of Central Tendency

| Measure | Formula | When? |
|---|---|---|
| Mean | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ | Symmetric distributions |
| Median | Middle of sorted data | Skewed data / outliers |
| Mode | Most frequent value | Categorical data |
Measures of Spread

Sample Variance
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

Standard Deviation
$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

💡 Bessel's correction ($n-1$): since the sample mean is estimated from the data, one degree of freedom is lost.
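A minimal sketch of the formulas above using only Python's standard library; the data values are invented for illustration.

```python
# Sample mean, Bessel-corrected variance, and standard deviation.
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(data)      # x-bar
var = statistics.variance(data)   # divides by n-1 (Bessel's correction)
std = statistics.stdev(data)      # square root of the sample variance

print(mean, var, std)
```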
Shape Measures

| Measure | = 0 | > 0 | < 0 |
|---|---|---|---|
| Skewness | Symmetric | Right-skewed | Left-skewed |
| Kurtosis (excess) | Normal (mesokurtic) | Heavy-tailed (leptokurtic) | Light-tailed (platykurtic) |
2. Probability Distributions

Normal Distribution (Gaussian)

Probability Density Function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
| Interval | Coverage |
|---|---|
| $\mu \pm 1\sigma$ | 68% |
| $\mu \pm 2\sigma$ | 95% |
| $\mu \pm 3\sigma$ | 99.7% |
💡 Central Limit Theorem: when $n \geq 30$, sample means are approximately $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$ regardless of the population distribution.
Binomial Distribution

$n$ independent trials, each with success probability $p$.

PMF
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

$E[X] = np \qquad Var(X) = np(1-p)$
Poisson Distribution

Counts of rare events per unit time/area ($\lambda$ = average rate).

PMF
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

$E[X] = \lambda \qquad Var(X) = \lambda$
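A sketch cross-checking the two PMFs above against `scipy.stats` (assumed available); the parameters $n=10$, $p=0.3$, $\lambda=4$ are arbitrary.

```python
# Manual PMF evaluation vs scipy.stats, plus the mean/variance identities.
from math import comb, exp, factorial
from scipy.stats import binom, poisson

n, p, k = 10, 0.3, 3
manual_binom = comb(n, k) * p**k * (1 - p)**(n - k)
assert abs(manual_binom - binom.pmf(k, n, p)) < 1e-12

lam = 4
manual_pois = lam**k * exp(-lam) / factorial(k)
assert abs(manual_pois - poisson.pmf(k, lam)) < 1e-12

# E[X] = np and Var(X) = np(1-p); for Poisson both equal lambda
assert abs(binom.mean(n, p) - n * p) < 1e-12
assert abs(poisson.var(lam) - lam) < 1e-12
print(manual_binom, manual_pois)
```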
3. Z-Score

Measures how many standard deviations a value lies from the mean, enabling comparison across different scales.

Single Value
$$z = \frac{x - \mu}{\sigma}$$

Sample Mean
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
Critical Z Values

| $z$ | One-tailed $P$ | Two-tailed $P$ |
|---|---|---|
| $1.645$ | $0.050$ | $0.100$ |
| $1.960$ | $0.025$ | $0.050$ |
| $2.576$ | $0.005$ | $0.010$ |
Z to probability: $P(Z < z) = \Phi(z)$. Probability to Z: $z = \Phi^{-1}(p)$.
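The critical-value table can be reproduced with the standard normal CDF $\Phi$ and its inverse; a minimal sketch assuming scipy is available.

```python
# Phi and Phi^{-1} via scipy's standard normal distribution.
from scipy.stats import norm

# Probability -> z (inverse CDF): one-tailed alpha = 0.05
z_05 = norm.ppf(1 - 0.05)
print(round(z_05, 3))     # ~1.645

# z -> probability (CDF): two-tailed P for z = 1.96
p_two = 2 * (1 - norm.cdf(1.96))
print(round(p_two, 3))    # ~0.05
```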
4. Confidence Intervals

$\sigma$ known or $n \geq 30$
$$\text{CI} = \bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$$

$\sigma$ unknown and $n < 30$
$$\text{CI} = \bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}}$$
| Confidence Level | $z^*$ |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
Proportion CI
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
💡 Interpretation: a "95% CI" means that if we repeated this procedure many times, 95% of the constructed intervals would contain the true parameter.
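A sketch of the z-interval and the proportion interval above; the sample figures ($\bar{x}=50$, $\sigma=10$, $n=100$; $\hat{p}=0.4$, $n=500$) are invented.

```python
# 95% z-interval for a mean and for a proportion.
from math import sqrt
from scipy.stats import norm

xbar, sigma, n = 50.0, 10.0, 100
zstar = norm.ppf(0.975)                  # ~1.96 for 95% confidence
half = zstar * sigma / sqrt(n)
ci = (xbar - half, xbar + half)

phat, m = 0.4, 500
half_p = zstar * sqrt(phat * (1 - phat) / m)
prop_ci = (phat - half_p, phat + half_p)
print(ci, prop_ci)
```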
5. Hypothesis Testing

Steps

- $H_0$ (Null): no effect / no difference
- $H_1$ (Alternative): an effect or difference exists
- Set the significance level: $\alpha = 0.05$
- Compute the test statistic ($z$, $t$, $\chi^2$, $F$, ...)
- Find the p-value
- Decision: $p < \alpha \Rightarrow$ reject $H_0$; $p \geq \alpha \Rightarrow$ fail to reject
Error Types

| | $H_0$ True | $H_0$ False |
|---|---|---|
| Reject $H_0$ | ❌ Type I ($\alpha$) | ✅ Correct (Power $= 1-\beta$) |
| Fail to Reject | ✅ Correct | ❌ Type II ($\beta$) |
Test Directions

| Direction | $H_1$ | When? |
|---|---|---|
| Two-tailed | $\mu \neq \mu_0$ | Direction doesn't matter |
| Right-tailed | $\mu > \mu_0$ | Expecting an increase |
| Left-tailed | $\mu < \mu_0$ | Expecting a decrease |
6. Normality Tests

Shapiro-Wilk Test

Among the most reliable normality tests; recommended for $n < 5000$.

Test Statistic
$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

- $H_0$: Data comes from a normal distribution
- $H_1$: Data is not normally distributed
- $p > 0.05 \Rightarrow$ assume normality ✅
D'Agostino K² Test

Tests skewness and kurtosis jointly. More suitable for larger samples.

QQ-Plot (Visual)

Compares data quantiles against theoretical normal quantiles. Points on the line ⇒ normal.

💡 In practice: when $n > 30$, parametric tests are generally robust to mild normality violations (CLT).
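Both tests above are available in `scipy.stats`; a minimal sketch on synthetic data drawn from a true normal (seed and sample size are arbitrary choices).

```python
# Shapiro-Wilk and D'Agostino K^2 on a synthetic normal sample.
import numpy as np
from scipy.stats import shapiro, normaltest

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)

w, p_sw = shapiro(x)        # H0: data is normal
k2, p_k2 = normaltest(x)    # D'Agostino: skewness + kurtosis jointly
print(p_sw, p_k2)           # p > 0.05 -> fail to reject normality
```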
7. Homogeneity of Variance – Levene's Test

Tests whether groups have equal variances. Prerequisite for the classical t-test and ANOVA.

Levene Statistic
$$W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^{k} n_i (\bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot})^2}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(Z_{ij} - \bar{Z}_{i\cdot})^2}$$

$Z_{ij} = |x_{ij} - \tilde{x}_i|$ (absolute deviation from the group median)

- $H_0$: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$
- $p > 0.05 \Rightarrow$ variances are homogeneous ✅
| Test | Advantage | Disadvantage |
|---|---|---|
| Levene | Doesn't assume normality | Slightly less powerful |
| Bartlett | More powerful under normality | Sensitive to violations |

⚠️ If variances are not homogeneous: use Welch's t-test (equal_var=False) for two groups, or Kruskal-Wallis instead of ANOVA.
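A sketch comparing Levene's and Bartlett's tests via scipy; the two groups are invented, with group `b` deliberately more spread out.

```python
# Levene (median-centered) vs Bartlett on two invented groups.
from scipy.stats import levene, bartlett

a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
b = [11.0, 13.2, 10.5, 13.8, 12.6, 10.9]

w, p_lev = levene(a, b, center='median')  # robust to non-normality
t, p_bar = bartlett(a, b)                 # assumes normality
print(p_lev, p_bar)  # p < 0.05 -> variances differ -> use Welch's t-test
```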
8. T-Test

8.1 One-Sample T-Test

Compares a group's mean against a known value.

Formula
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \qquad df = n - 1$$

8.2 Independent Two-Sample T-Test

Prerequisites: ① normality ② homogeneity of variance ③ independence

Equal Variance
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \qquad s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$

Welch's T-Test (unequal variances)
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$

8.3 Paired T-Test

Before-after comparison of the same group, using differences $d_i = x_{1i} - x_{2i}$.

Formula
$$t = \frac{\bar{d}}{s_d / \sqrt{n}} \qquad df = n - 1$$
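The three t-test flavours map directly onto scipy functions; a minimal sketch with invented before/after scores.

```python
# One-sample, Welch's two-sample, and paired t-tests via scipy.
from scipy.stats import ttest_1samp, ttest_ind, ttest_rel

before = [85, 90, 78, 92, 88, 76, 95, 89]
after  = [88, 94, 80, 95, 91, 79, 97, 90]

t1, p1 = ttest_1samp(before, popmean=80)            # mean vs known value
t2, p2 = ttest_ind(before, after, equal_var=False)  # Welch's (unequal var)
t3, p3 = ttest_rel(before, after)                   # paired (same subjects)
print(t1, t2, t3)
```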
9. Z-Test

Used when $\sigma$ is known; with large samples ($n \geq 30$) it serves as the large-sample counterpart of the t-test.

One-Sample
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Two-Proportion
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
| Feature | Z-Test | T-Test |
|---|---|---|
| $\sigma$ known? | Yes | No |
| Sample size | $n \geq 30$ | Any |
| Distribution | $N(0,1)$ | $t(df)$, heavier tails |
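A hand-rolled one-sample z-test following the formula above (scipy has no dedicated z-test function, so the p-value comes from the normal CDF); the numbers are invented.

```python
# Manual one-sample z-test: sigma assumed known.
from math import sqrt
from scipy.stats import norm

xbar, mu0, sigma, n = 103.0, 100.0, 15.0, 100
z = (xbar - mu0) / (sigma / sqrt(n))
p_two_tailed = 2 * (1 - norm.cdf(abs(z)))
print(z, p_two_tailed)   # z = 2.0, p ~ 0.0455 -> reject H0 at alpha = 0.05
```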
10. ANOVA

One-Way ANOVA

Compares the means of $k$ independent groups.

- $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
- $H_1:$ at least one mean differs

F Statistic
$$F = \frac{MSB}{MSW} = \frac{SS_B / (k-1)}{SS_W / (N-k)}$$

Sums of Squares
$$SS_B = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 \qquad SS_W = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$$
Post-hoc Tests

| Test | Use Case |
|---|---|
| Tukey HSD | All pairwise comparisons, equal samples |
| Bonferroni | Conservative, few comparisons |
| Scheffé | Flexible, unequal samples |
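One-way ANOVA is a single scipy call; a sketch with three invented groups where the second clearly differs.

```python
# One-way ANOVA across three invented groups.
from scipy.stats import f_oneway

g1 = [23, 25, 21, 24, 26]
g2 = [30, 31, 29, 32, 28]
g3 = [24, 23, 25, 22, 26]

F, p = f_oneway(g1, g2, g3)
print(F, p)  # small p -> at least one mean differs; follow up with Tukey HSD
```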
11. Chi-Square Test

Test of Independence

Is there a relationship between two categorical variables?

Test Statistic
$$\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad E_{ij} = \frac{R_i \cdot C_j}{N}$$

$df = (r-1)(c-1)$

Goodness of Fit

Does the observed distribution match the expected one?

Test Statistic
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \qquad df = k - 1$$
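The independence test, including the expected-count table $E_{ij} = R_i C_j / N$, is handled by `chi2_contingency`; the 2x2 table below is invented.

```python
# Chi-square test of independence on an invented 2x2 contingency table.
from scipy.stats import chi2_contingency

observed = [[30, 10],
            [20, 40]]
chi2, p, dof, expected = chi2_contingency(observed)
print(chi2, p, dof)      # dof = (2-1)(2-1) = 1
print(expected)          # E_ij = row_total * col_total / N
```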
12. Correlation & Regression

Pearson Correlation Coefficient

Formula
$$r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i-\bar{x})^2 \cdot \sum(y_i-\bar{y})^2}}$$

| $\lvert r \rvert$ | Interpretation |
|---|---|
| 0.00 – 0.29 | Weak |
| 0.30 – 0.69 | Moderate |
| 0.70 – 1.00 | Strong |
Simple Linear Regression

Model
$$\hat{y} = \beta_0 + \beta_1 x$$

OLS Coefficients
$$\beta_1 = \frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2} \qquad \beta_0 = \bar{y} - \beta_1\bar{x}$$

Coefficient of Determination
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}$$

Spearman Rank Correlation

Formula
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)} \qquad d_i = \text{rank}(x_i) - \text{rank}(y_i)$$
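A sketch of Pearson, Spearman, and OLS via scipy; the toy data is perfectly linear ($y = 2x$), so both correlations come out at 1 and the fitted slope at 2.

```python
# Correlation coefficients and simple OLS on perfectly linear toy data.
from scipy.stats import pearsonr, spearmanr, linregress

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # y = 2x exactly

r, p_r = pearsonr(x, y)
rho, p_s = spearmanr(x, y)
fit = linregress(x, y)  # slope = beta_1, intercept = beta_0
print(r, rho, fit.slope, fit.intercept)
```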
13. Non-Parametric Tests

Used when normality is violated or the data is ordinal.

| Parametric | Non-Parametric | Scenario |
|---|---|---|
| Independent t-test | Mann-Whitney U | 2 independent groups |
| Paired t-test | Wilcoxon Signed-Rank | 2 dependent groups |
| One-way ANOVA | Kruskal-Wallis | 3+ independent groups |
Mann-Whitney U

Test Statistic
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$$

Kruskal-Wallis

H Statistic
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$
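The mapping in the table above translates directly to scipy calls; a sketch with invented samples where every value in `a` is below every value in `b`.

```python
# Non-parametric counterparts of the t-tests and ANOVA.
from scipy.stats import mannwhitneyu, wilcoxon, kruskal

a = [1.1, 2.3, 1.9, 3.0, 2.2]
b = [3.5, 4.1, 3.9, 5.0, 4.4]
c = [2.0, 2.5, 3.1, 2.8, 2.6]

u, p_u = mannwhitneyu(a, b, alternative='two-sided')  # indep. t alternative
p_w = wilcoxon(a, b).pvalue                           # paired t alternative
h, p_h = kruskal(a, b, c)                             # ANOVA alternative
print(p_u, p_w, p_h)
```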
14. Effect Size

The p-value answers "is there a difference?"; effect size answers "how big is the difference?"

Cohen's d (T-Test)

Formula
$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p} \qquad s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$

| $\lvert d \rvert$ | Interpretation |
|---|---|
| 0.2 | Small |
| 0.5 | Medium |
| 0.8 | Large |
Eta-Squared (ANOVA)

Formula
$$\eta^2 = \frac{SS_{between}}{SS_{total}}$$

| $\eta^2$ | Interpretation |
|---|---|
| 0.01 | Small |
| 0.06 | Medium |
| 0.14 | Large |
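Scipy has no built-in Cohen's d, so a small helper following the pooled-SD formula above is the usual approach; a minimal sketch with invented samples.

```python
# Hand-rolled Cohen's d with pooled standard deviation.
from math import sqrt
from statistics import mean, variance

def cohens_d(x1, x2):
    n1, n2 = len(x1), len(x2)
    # Pooled SD from the Bessel-corrected sample variances
    sp = sqrt(((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2))
              / (n1 + n2 - 2))
    return (mean(x1) - mean(x2)) / sp

d = cohens_d([5, 6, 7, 8, 9], [3, 4, 5, 6, 7])
print(d)   # > 0.8, so a "large" effect by the table above
```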
15. Power Analysis

Done BEFORE the test. Determines the required sample size to detect the target effect.

4 Components (specify 3, compute the 4th)

| Component | Symbol | Typical |
|---|---|---|
| Effect size | $d$ | 0.2 / 0.5 / 0.8 |
| Significance | $\alpha$ | 0.05 |
| Power | $1-\beta$ | 0.80 |
| Sample size | $n$ | Computed |

Power
$$\text{Power} = 1 - \beta = P(\text{detect a true effect})$$

Power increases when $n$ ↑, effect size $d$ ↑, or $\alpha$ ↑.
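A sample-size sketch using the normal approximation $n \approx 2(z_{\alpha/2} + z_\beta)^2 / d^2$ per group for a two-sample t-test. This is an approximation of my choosing, not exact power analysis; statsmodels' `TTestIndPower` gives exact values.

```python
# Approximate per-group sample size for a two-sample t-test.
from math import ceil
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)   # two-tailed critical value
    z_b = norm.ppf(power)           # power quantile
    return ceil(2 * (z_a + z_b) ** 2 / d ** 2)

print(n_per_group(0.5))   # medium effect: ~63 per group by this approximation
```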
16. A/B Testing

Workflow

- State the hypothesis: $H_0: p_A = p_B$
- Define the success metric (conversion, CTR, revenue, ...)
- Calculate the minimum detectable effect (MDE) and required sample size
- Run the experiment and collect data
- Apply the statistical test and evaluate

Two-Proportion Z-Test

Test Statistic
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$$

Lift
$$\text{Lift} = \frac{\hat{p}_{test} - \hat{p}_{control}}{\hat{p}_{control}} \times 100\%$$

Sample Size (approx.)
$$n \approx \frac{(z_{\alpha/2} + z_\beta)^2 \cdot [p_1(1-p_1) + p_2(1-p_2)]}{(p_1 - p_2)^2}$$
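The two-proportion z-test and lift formulas above in code; the conversion counts are invented for illustration.

```python
# Two-proportion z-test and lift for an invented A/B experiment.
from math import sqrt
from scipy.stats import norm

x_c, n_c = 200, 4000    # control: conversions / visitors
x_t, n_t = 260, 4000    # treatment

p_c, p_t = x_c / n_c, x_t / n_t
p_pool = (x_c + x_t) / (n_c + n_t)                     # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
z = (p_t - p_c) / se
p_value = 2 * (1 - norm.cdf(abs(z)))
lift = (p_t - p_c) / p_c * 100
print(z, p_value, lift)
```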
Common Pitfalls

| Pitfall | Solution |
|---|---|
| Peeking | Pre-determine $n$, wait until completion |
| Multiple testing | Bonferroni: $\alpha_{adj} = \alpha / k$ |
| Simpson's paradox | Segment analysis |
| Novelty effect | Run for 2+ weeks |
| Selection bias | Proper randomization |
17. Bayesian Basics

Bayes' Theorem
$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$

General Form
$$\underbrace{P(\theta|X)}_{\text{Posterior}} = \frac{\overbrace{P(X|\theta)}^{\text{Likelihood}} \cdot \overbrace{P(\theta)}^{\text{Prior}}}{\underbrace{P(X)}_{\text{Evidence}}}$$

Example: Medical Test Paradox

With test accuracy 99% (both sensitivity and specificity) and disease prevalence 1%:

$$P(D|+) = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = \textbf{50\%}$$

⚠️ Even a 99% accurate test can be misleading for rare conditions!
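The medical-test computation spelled out as Bayes' theorem, using the numbers from the example above.

```python
# Posterior probability of disease given a positive test.
sensitivity = 0.99   # P(+ | disease)
false_pos = 0.01     # P(+ | no disease) = 1 - specificity
prevalence = 0.01    # P(disease)

numerator = sensitivity * prevalence              # P(+ | D) * P(D)
evidence = numerator + false_pos * (1 - prevalence)  # P(+)
posterior = numerator / evidence                  # P(D | +)
print(posterior)   # 0.5 despite the 99% "accuracy"
```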
Frequentist vs Bayesian
Frequentist
- Probability = long-run frequency
- The parameter is fixed but unknown
- Result: p-value, confidence interval
- No prior information used
Bayesian
- Probability = degree of belief
- The parameter is a random variable
- Result: posterior, credible interval
- Incorporates prior knowledge
18. Which Test to Use? – Decision Tree

```
WHAT IS YOUR DATA TYPE?
│
├── Numerical (Continuous)
│   ├── 1 Group → One-sample t-test
│   ├── 2 Groups
│   │   ├── Independent → Normal? → Yes: Independent t | No: Mann-Whitney U
│   │   └── Dependent → Normal? → Yes: Paired t | No: Wilcoxon
│   └── 3+ Groups
│       ├── Independent → Normal? → Yes: ANOVA + Tukey | No: Kruskal-Wallis
│       └── Dependent → Repeated Measures ANOVA / Friedman
│
├── Categorical (Counts)
│   ├── One variable → Chi-Square Goodness of Fit
│   └── Two variables → Chi-Square Independence
│
└── Relationship
    ├── Linear? → Normal? → Pearson r | Spearman rho
    └── Prediction? → Regression (Simple / Multiple)
```
Quick Checklist

| # | Step | Method |
|---|---|---|
| 1 | Identify data type | Continuous / Categorical / Ordinal |
| 2 | Explore distribution | Histogram, QQ-Plot |
| 3 | Test normality | Shapiro-Wilk |
| 4 | Test variance homogeneity | Levene's test |
| 5 | Apply the right test | Decision tree above |
| 6 | Calculate effect size | Cohen's $d$, $\eta^2$ |
| 7 | Report results | $p$ + effect + CI |
⚠️ Golden Rule: a p-value alone is NOT enough. Always report it alongside effect size and confidence intervals!