Fan Yao

Machine Learning

Course Overview

This course provides a mathematically rigorous introduction to the core ideas of statistical machine learning for advanced undergraduate and master’s students. It emphasizes foundational principles rather than a broad survey of algorithms, using a small number of representative models to illustrate the central ideas behind modern machine learning.

The course integrates the following key aspects:

mathematical foundations including empirical risk minimization, VC-dimension, convex optimization, regularization techniques, and Bayesian perspectives on learning
core machine learning paradigms including supervised learning, unsupervised learning, and generative models

The goal is to help students understand why machine learning methods work, what assumptions they rely on, and how they connect to modern applications.

The course emphasizes the balance between theoretical understanding and practical implementation, training students to analyze, implement, and critically evaluate machine learning methods. By the end of the course, students should be equipped to independently learn and evaluate new ML advances and emerging AI techniques.

Learning Goals

Build a unified understanding of the main paradigms of machine learning
Understand the mathematical principles underlying core ML methods
Implement and analyze representative machine learning models
Connect foundational ideas to modern applications such as representation learning and language models

Course Structure

The course is organized around four main parts:

1. Foundations of Machine Learning

What it means to learn from data
Supervised, unsupervised, and online learning
Empirical risk minimization and generalization
Evaluation metrics and validation

2. Learning with Labels

Classification and regression
Logistic regression as a probabilistic model
Convex optimization and gradient-based methods
Regularization and the bias–variance trade-off
Support vector machines and kernel methods

3. Learning without Labels

Unsupervised learning objectives
Clustering and dimensionality reduction
Matrix factorization and representation learning
Generative models and probabilistic modeling
Basic ideas behind language models

4. Learning in the Loop

Interactive and sequential learning
Online learning and modern feedback-driven ML systems
Reinforcement-learning-inspired ideas
Fine-tuning language models with preference data

The full syllabus and course materials are available on Canvas. For visitors outside UNC, course slides are available upon request.

Assignments and Evaluation

Homework (20%): Technical and computational analysis
Exams (50%): One mock midterm exam and one final exam
Final Project (30%): Group presentation and final report

Course Design

This is a theory-oriented machine learning course with a substantial practical component. Students are expected to engage in:

Mathematical analysis of machine learning models
Implementation-based homework assignments
Lab sessions on real datasets
A final team project connecting theory to modern ML practice

The course is designed to help students move beyond using ML tools mechanically and instead understand the principles that unify a wide range of methods across modern machine learning.

Tools and Format

Programming: Python
Writing: LaTeX (Overleaf recommended)
Materials: Open-source readings, lecture slides/notes, and lab sessions

Last updated on Mar 17, 2026
