🤖 Data Sciences Repository

ANLY 530 - Machine Learning | Harrisburg University

Welcome to ANLY 530 Machine Learning

This course introduces the theory and practice of machine learning using Python with industry-standard libraries. Topics progress from classical supervised and unsupervised learning through deep neural networks, covering the end-to-end ML project workflow, model evaluation, hyperparameter tuning, and modern architectures.

Primary Textbook: Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly Media.

Author: Roozbeh Sadeghian | Course: Machine Learning | Institution: Harrisburg University

View Repository on GitHub | 📄 Course Syllabus (PDF)

📚 Weekly Tutorials

Week 01: Introduction to Machine Learning

Overview of the ML landscape: supervised vs. unsupervised learning, batch vs. online learning, and instance-based vs. model-based learning. Covers the end-to-end ML project workflow and key challenges.
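As a taste of the workflow this week introduces, here is a minimal fit-and-evaluate sketch. The iris dataset and a k-nearest-neighbors classifier (an instance-based learner) are stand-ins chosen for illustration, not necessarily the tutorial's own examples:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load data and hold out a test set -- the core of the end-to-end workflow.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# k-NN is instance-based: it memorizes training points and predicts
# from the nearest neighbors rather than fitting explicit parameters.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {acc:.2f}")
```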

Week 02: Decision Trees

Deep dive into the CART algorithm: Gini impurity, entropy, class probabilities, decision boundaries, regularization hyperparameters (rpart's cp and minsplit; Scikit-Learn's max_depth), instability, regression trees, feature importance, and partial dependence plots. Includes IEEE-cited references to Géron (2019) and Boehmke & Greenwell (2020).
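The two impurity measures CART chooses between can be sketched directly from their definitions (a small illustration, not the tutorial's code):

```python
import numpy as np

def gini(class_counts):
    """Gini impurity: 1 - sum of squared class proportions."""
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(class_counts):
    """Shannon entropy: -sum(p_k * log2(p_k)), skipping empty classes."""
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A pure node has impurity 0; a 50/50 node maximizes both measures.
print(gini([10, 0]), gini([5, 5]))   # 0.0 0.5
print(entropy([5, 5]))               # 1.0
```

CART greedily picks the split that most reduces the weighted impurity of the child nodes; Gini is the default in Scikit-Learn because it avoids the logarithm and behaves almost identically to entropy in practice.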

Week 03: Ensemble Learning & Random Forests

From bagging and voting classifiers to Random Forests, Extra-Trees, and Gradient Boosting. Covers OOB evaluation, feature importance, hyperparameter tuning (R randomForest's mtry and nodesize; Scikit-Learn's max_features and min_samples_leaf), regression forests, and stacking. Includes IEEE-cited references to Géron (2019) and Boehmke & Greenwell (2020).
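OOB evaluation and feature importance fall out of a Random Forest almost for free; a minimal sketch on a synthetic dataset (stand-in data, not the tutorial's):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=42)

# oob_score=True scores each tree on the samples left out of its
# bootstrap, giving a validation estimate without a held-out set.
forest = RandomForestClassifier(n_estimators=200, oob_score=True,
                                random_state=42)
forest.fit(X, y)

print(f"OOB accuracy: {forest.oob_score_:.3f}")
print("feature importances:", forest.feature_importances_.round(3))
```

The importances sum to 1 and reflect how much each feature reduces impurity across all trees.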

Week 04: Regression Models

Linear Regression (Normal Equation, Gradient Descent, MSE cost surface, learning rate effect, confidence & prediction bands, residual diagnostics), SVR (linear vs. RBF kernel), Decision Tree Regression (depth vs. overfitting), and Random Forest Regression (OOB convergence, feature importance). 8 visualizations + model comparison.
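The Normal Equation and batch Gradient Descent should land on the same coefficients; a NumPy sketch on synthetic data (true intercept 4, slope 3, made up for this illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X[:, 0] + rng.standard_normal(100)  # noisy line

Xb = np.c_[np.ones(len(X)), X]  # prepend a bias column of ones

# Normal Equation: theta = (X^T X)^{-1} X^T y  (closed-form MSE minimizer)
theta_ne = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y

# Batch gradient descent on the MSE cost; eta is the learning rate.
theta_gd = np.zeros(2)
eta = 0.1
for _ in range(1000):
    grad = (2 / len(y)) * Xb.T @ (Xb @ theta_gd - y)
    theta_gd -= eta * grad

print(theta_ne.round(2), theta_gd.round(2))  # both close to [4, 3]
```

Too large a learning rate makes the descent diverge, too small makes it crawl; the tutorial's MSE cost surface and learning-rate plots visualize exactly this trade-off.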

Week 05: Support Vector Machines

Maximum-margin classification, hyperplanes, support vectors, hard vs. soft margin (C parameter), the kernel trick (linear, polynomial, RBF), gamma tuning, feature scaling, SVR for regression, multi-class SVM, and model comparison with Decision Trees & Random Forests. 20+ visualizations with plain English explanations.
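Because the RBF kernel is distance-based, feature scaling belongs in the same pipeline as the SVM; a short sketch on the two-moons toy dataset (a stand-in, not necessarily the tutorial's data):

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=42)

# C controls margin softness (small C = wider, more tolerant margin);
# gamma controls the reach of each training point's influence.
clf = make_pipeline(StandardScaler(),
                    SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X, y)

print(f"training accuracy: {clf.score(X, y):.2f}")
print("support vectors per class:", clf.named_steps["svc"].n_support_)
```

Only the support vectors determine the decision boundary, which is why the fitted model can be much smaller than the training set.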

Week 06: Naive Bayes Classifiers NEW

Bayes' Theorem and probabilistic classification, the naive independence assumption, Gaussian / Multinomial / Bernoulli NB variants, Laplace smoothing, spam filtering from scratch, decision boundary comparison with DT, RF & SVM, learning curves, and training speed benchmarks. 20+ visualizations with plain English explanations.
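Laplace smoothing is what keeps an unseen word from zeroing out a class's probability; a tiny spam-filter sketch (the six-document corpus is invented for illustration, not the course dataset):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win free prize now", "free money win", "meeting at noon",
         "lunch meeting today", "claim your free prize",
         "project meeting notes"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = spam, 0 = ham

vec = CountVectorizer()
X = vec.fit_transform(texts)

# alpha=1.0 is Laplace (add-one) smoothing: every word in the vocabulary
# gets a nonzero conditional probability in every class.
clf = MultinomialNB(alpha=1.0).fit(X, labels)

pred = clf.predict(vec.transform(["free prize inside",
                                  "team meeting tomorrow"]))
print(pred)  # [1 0] -> spam, ham
```

The "naive" independence assumption lets the classifier multiply per-word likelihoods instead of modeling word interactions, which is why training is so fast in the speed benchmarks.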

Weeks 07–15 Coming Soon

Classification metrics, unsupervised learning, preprocessing, dimensionality reduction, neural networks, model evaluation, hyperparameter tuning, and production deployment strategies.

🎯 Key Features

  • Code-First Approach: All examples are executable and reproducible in Python
  • Theory-Grounded: Concepts tied to Géron (2019) and peer-reviewed sources
  • Hands-On: Real datasets with end-to-end pipelines
  • Modern Toolchain: Scikit-Learn, Keras, TensorFlow 2.x

📖 References

  • Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly Media.
  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning (2nd ed.). Springer.
  • Boehmke, B. C., & Greenwell, B. M. (2020). Hands-On Machine Learning with R. CRC Press. https://bradleyboehmke.github.io/HOML/DT.html