Machine Learning

Course Overview

This course provides a mathematically rigorous introduction to the core ideas of statistical machine learning for advanced undergraduate and master’s students. It emphasizes foundational principles rather than a broad survey of algorithms, using a small number of representative models to illustrate the central ideas behind modern machine learning.

The course integrates the following key aspects:

  • mathematical foundations including empirical risk minimization, VC-dimension, convex optimization, regularization techniques, and Bayesian perspectives on learning
  • core machine learning paradigms including supervised learning, unsupervised learning, and generative models

The goal is to help students understand why machine learning methods work, what assumptions they rely on, and how they connect to modern applications.

The course emphasizes the balance between theoretical understanding and practical implementation, training students to analyze, implement, and critically evaluate machine learning methods. By the end of the course, students should be equipped to independently learn and evaluate new ML advances and emerging AI techniques.


Learning Goals

  • Build a unified understanding of the main paradigms of machine learning
  • Understand the mathematical principles underlying core ML methods
  • Implement and analyze representative machine learning models
  • Connect foundational ideas to modern applications such as representation learning and language models

Course Structure

The course is organized around four main parts:

1. Foundations of Machine Learning

  • What it means to learn from data
  • Supervised, unsupervised, and online learning
  • Empirical risk minimization and generalization
  • Evaluation metrics and validation

2. Learning with Labels

  • Classification and regression
  • Logistic regression as a probabilistic model
  • Convex optimization and gradient-based methods
  • Regularization and the bias–variance trade-off
  • Support vector machines and kernel methods

3. Learning without Labels

  • Unsupervised learning objectives
  • Clustering and dimensionality reduction
  • Matrix factorization and representation learning
  • Generative models and probabilistic modeling
  • Basic ideas behind language models

4. Learning in the Loop

  • Interactive and sequential learning
  • Online learning and modern feedback-driven ML systems
  • Reinforcement-learning-inspired ideas
  • Fine-tuning language models with preference data

The full syllabus and course materials are available on Canvas. For visitors outside UNC, course slides are available upon request.


Assignments and Evaluation

  • Homework (20%): Technical and computational analysis
  • Exams (50%): One mock midterm exam and one final exam
  • Final Project (30%): Group presentation and final report

Course Design

This is a theory-oriented machine learning course with a substantial practical component. Students are expected to engage in:

  • Mathematical analysis of machine learning models
  • Implementation-based homework assignments
  • Lab sessions on real datasets
  • A final team project connecting theory to modern ML practice

The course is designed to help students move beyond using ML tools mechanically and instead understand the principles that unify a wide range of methods across modern machine learning.


Tools and Format

  • Programming: Python
  • Writing: LaTeX (Overleaf recommended)
  • Materials: Open-source readings, lecture slides/notes, and lab sessions
