
📊 Statistics for Data Scientists

Comprehensive statistics cheatsheet & formula reference
Available in Turkish 🇹🇷 and English 🇬🇧

📖 What’s Inside?

A set of interactive HTML cheatsheets and Kaggle notebooks covering the essential statistics concepts for data science, from descriptive statistics to Bayesian inference and ML evaluation metrics. Each HTML page is a single self-contained file that can be viewed directly in your browser.

🗂️ Resources

📐 Statistics Fundamentals

Resource	Language	Description	Link
Formula Reference	🇬🇧 English	LaTeX-rendered formulas, theory only — no code	View →
Cheatsheet	🇬🇧 English	Formulas with Python code examples	View →
Formül Rehberi	🇹🇷 Türkçe	LaTeX formulas, theory only, no code	View →
Cheatsheet	🇹🇷 Türkçe	Formulas with Python code examples	View →
Kaggle Notebook — Stats Cheatsheet	🇬🇧 English	Statistics cheatsheet with Python examples	Open in Kaggle →

📊 ML Evaluation Metrics (NEW)

Resource	Language	Description	Link
ML Metrics Guide	🇬🇧 English	Precision, Recall, F1, ROC-AUC, Regression metrics	View →
ML Metrik Rehberi	🇹🇷 Türkçe	Precision, Recall, F1, ROC-AUC, Regression metrics	View →
Kaggle Notebook — ML Metrics	🇬🇧 English	Interactive notebook with visualizations & runnable code	Open in Kaggle →

Formula Reference — Pure theory with beautifully typeset LaTeX math (powered by KaTeX)

Cheatsheet — Same topics + ready-to-use Python code snippets you can copy-paste

ML Metrics Guide — Covers classification & regression evaluation metrics with formulas, examples, and a decision guide

Kaggle Notebook — Hands-on notebook with scikit-learn, matplotlib & seaborn visualizations

📋 Topics Covered

📐 Statistics Fundamentals (18 Sections)

#	Topic	Key Concepts
01	Descriptive Statistics	Mean, Median, Mode, Variance, Std Dev, Skewness, Kurtosis
02	Probability Distributions	Normal, Binomial, Poisson, CLT
03	Z-Score	Standardization, Critical Values, CDF
04	Confidence Intervals	z-interval, t-interval, Proportion CI
05	Hypothesis Testing	H₀/H₁, Type I & II Errors, p-value
06	Normality Tests	Shapiro-Wilk, D’Agostino, QQ-Plot
07	Variance Homogeneity	Levene, Bartlett
08	T-Test	One-sample, Independent, Paired, Welch
09	Z-Test	Large-sample, Two-proportion
10	ANOVA	One-way, F-statistic, Tukey HSD
11	Chi-Square Test	Independence, Goodness of Fit
12	Correlation & Regression	Pearson, Spearman, OLS, R²
13	Non-Parametric Tests	Mann-Whitney U, Wilcoxon, Kruskal-Wallis
14	Effect Size	Cohen’s d, Eta-squared
15	Power Analysis	Sample size calculation
16	A/B Testing	Proportion tests, MDE, Lift, Pitfalls
17	Bayesian Basics	Bayes’ Theorem, Prior/Posterior, Medical Test Paradox
18	Decision Tree	Which test to choose? + Quick checklist
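
To give a flavor of the topics listed above, here is a minimal sketch of two of them: the Medical Test Paradox (section 17) and a z-interval (section 04). It is not taken from the cheatsheets. The function names and all input numbers are hypothetical, and it deliberately uses only the Python standard library instead of NumPy or SciPy so that it runs without extra dependencies.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def posterior_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prevalence                # P(+ and disease)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(+ and healthy)
    return true_pos / (true_pos + false_pos)


def z_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean.

    For small samples a t-interval is usually the better choice;
    the z-interval is used here only to keep the sketch stdlib-only.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # about 1.96 for 95%
    centre = mean(sample)
    margin = z * stdev(sample) / sqrt(len(sample))
    return centre - margin, centre + margin


# Hypothetical test: 1% prevalence, 99% sensitivity, 95% specificity.
p = posterior_positive(prevalence=0.01, sensitivity=0.99, specificity=0.95)
print(f"P(disease | positive) = {p:.3f}")  # 0.167

# Hypothetical measurements.
low, high = z_interval([4.8, 5.1, 5.0, 4.9, 5.2])
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Even with a 99% sensitive test, only about one in six positive results corresponds to a true case here, because false positives from the large healthy group outnumber the true positives.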

📊 ML Evaluation Metrics (11 Sections)

#	Topic	Key Concepts
01	Confusion Matrix	TP, FP, FN, TN, Type I & II Errors
02	Accuracy	Overall correctness, Accuracy Paradox
03	Precision	Positive predictive value, when FP is costly
04	Recall (Sensitivity)	True positive rate, when FN is costly
05	F1 Score	Harmonic mean, F-Beta variants (F0.5, F2)
06	Specificity	True negative rate, Sensitivity vs Specificity
07	ROC Curve & AUC	Threshold-independent model comparison
08	Log Loss	Probability calibration, cross-entropy
09	Regression Metrics	MAE, MSE, RMSE, R²
10	Metric Selection Guide	Which metric for which scenario
11	Python Implementation	scikit-learn code with visualizations
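
As a rough illustration of sections 01 to 06, the sketch below derives the main classification metrics directly from confusion-matrix counts. It is an assumption-laden example rather than the guide's own code: the function name `classification_metrics` and the counts are hypothetical, and plain Python is used instead of scikit-learn so the arithmetic stays visible.

```python
def classification_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute common metrics from confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)

    # F-beta: beta > 1 weights recall more, beta < 1 weights precision more.
    b2 = beta ** 2
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_beta": f_beta,
    }


# Hypothetical imbalanced dataset: 10 positives, 990 negatives.
m = classification_metrics(tp=2, fp=5, fn=8, tn=985)
for name, value in m.items():
    print(f"{name:12s} {value:.3f}")
```

With these counts the accuracy is 0.987 while the F1 score is only about 0.235, which is the Accuracy Paradox from section 02: on imbalanced data a model can look accurate while missing most positives.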

🖨️ Save as PDF

1. Open any of the links above in your browser
2. Press Ctrl + P (or Cmd + P on Mac)
3. Select “Save as PDF”
⚠️ Enable “Background graphics” for the dark theme colors to render properly

✨ Features

🌙 Dark premium theme — Easy on the eyes, great for late-night study sessions
📐 LaTeX formulas — Beautifully rendered with KaTeX
🐍 Python code — Copy-paste ready examples using NumPy, SciPy, statsmodels, scikit-learn
📊 Interactive visualizations — matplotlib & seaborn charts in the Kaggle notebook
🗺️ Decision trees — “Which test/metric should I use?” visual guides
🖨️ Print-optimized — Clean light theme when printing or saving as PDF
🌍 Bilingual — Full Turkish & English versions
🏆 Kaggle-ready — Upload the .ipynb directly to Kaggle

📄 License

This project is open source and available for educational purposes.

Made with ❤️ for data science learners
