STA 250 - Theoretical Foundations for Machine Learning, Spring 2025

Course Information

Instructor: Dogyoon Song (dgsong [at] ucdavis [dot] edu)

Lectures: Mondays and Wednesdays, 2:10 PM - 3:50 PM, Wellman Hall 109

Office hours: Mondays, 4:00 PM - 5:00 PM, or by appointment (MSB 4220)

Syllabus: link.

Canvas: link.

Piazza: link.

Texts and resources: We will have no official course textbooks, but you may find the following resources useful.

Francis Bach, “Learning Theory from First Principles,” MIT Press
Tengyu Ma, Machine Learning Theory, Stanford Lecture Notes
Percy Liang, Statistical Learning Theory, Stanford Lecture Notes
Stephen Boyd, “Convex Optimization,” Cambridge University Press
Roman Vershynin, “High-Dimensional Probability,” Cambridge University Press (also available here)
Martin Wainwright, “High-Dimensional Statistics,” Cambridge University Press
Sanjeev Arora, “Theory of Deep Learning,” Book Draft
Matus Telgarsky, Deep Learning Theory, Lecture Notes – new version and old version

Topics

We will cover the following subjects, which may change due to time constraints or student interests.

Intro to supervised learning and generalization theory
Optimization theory and methods
Deep learning theory
Brief survey of additional topics (if time permits)
- Causal machine learning
- Deep generative models
- …

Course Structure and Evaluation

The first ~7 weeks of the course will consist of lectures, in-class quizzes, and in-class reading group discussions. The following ~2 weeks will survey additional topics, and the last week of the course will be for presenting final projects.

The students’ performance in this course will be evaluated based on the following:

Homework: 30%
Reading group discussions: 25%
Term project: 45%
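As a rough illustration of how these weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the final score is a plain weighted average; the syllabus does not state either, and the function name `course_score` and the example scores are hypothetical.

```python
# Weights are taken from the list above; everything else is illustrative.
WEIGHTS = {"homework": 0.30, "reading_group": 0.25, "project": 0.45}


def course_score(scores):
    """Return the weighted average of component scores (assumed 0-100 each)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError(f"expected scores for exactly: {sorted(WEIGHTS)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student.
example = {"homework": 90.0, "reading_group": 80.0, "project": 85.0}
print(round(course_score(example), 2))  # 27 + 20 + 38.25 = 85.25
```

Because the term project carries 45% of the weight, a 10-point change in the project score moves the total by 4.5 points under this assumed scheme, compared with 3 points for the same change in homework.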

Homework

There will be three homework assignments (not including “Homework 0,” which you need not submit). “Homework 0” is meant for self-assessment; no solutions will be provided. If you struggle to complete any of the non-programming parts of “Homework 0,” please note that the course may require significant extra effort on your part.

Each of the three graded homework assignments corresponds to one of the first three core units, offering essential practice for internalizing the concepts and methodologies covered in class and for exploring material beyond the lectures. A random subset of problems will be graded, and clarity of writing counts alongside correctness.

You are welcome to collaborate with other students on your homework, but you must list the names of any collaborators at the top of your homework assignment. All final write-ups must be done individually, and submissions must be LaTeX-produced PDFs uploaded via Gradescope (accessible through Canvas).

If you are unfamiliar with software packages like PyTorch/Jax/TensorFlow, you may find the following tutorials helpful:

All of the coding you will need to do for the course (and tutorials above) can be done on your personal laptop or in Google Colab. A GPU will not be needed, although using the Google Colab GPU may help if you want to do an experiment-heavy project.

Reading group discussions

We will hold approximately three sessions to read and discuss an influential paper related to the course. The format follows a role-based discussion inspired by Colin Raffel and Alec Jacobson’s role-playing student seminars, which is also used in Aditi Raghunathan’s course. For more details on the reading group, please see this page.

Project

Students will read two or more papers on a topic of their choice, then identify and formulate an interesting, concrete follow-up question for future research. Students are expected to make initial progress on this question, via theory or experiments. For more details about the project, please see this page.

Tentative Class Schedule

Lecture Day	Topics	Lecture notes	Additional references	HW
Mon, Mar 31	Introduction	Lecture 1	Bach, Ch 2 & Ma, Ch 1	“Homework 0”
Wed, Apr 2	Empirical risk minimization	Lecture 2	Ma, Ch 2, 4.1-4.3 & Bach, Ch 4.1-4.4
Mon, Apr 7	Rademacher complexity	Lecture 3	Ma, Ch 4.4-4.6, 5.1-5.2 & Bach, Ch 4.5	Homework 1 released
Wed, Apr 9	Kernel methods	Lecture 4	Liang, Ch 4 & Bach, Ch 7
Mon, Apr 14	Convex optimization		Boyd, Ch 2-4 & 9.1-9.3
Wed, Apr 16	Descent methods, pre-conditioning, acceleration
Mon, Apr 21	Paper discussion 1			Homework 2 released
Wed, Apr 23
Mon, Apr 28
Wed, Apr 30	Paper discussion 2
Mon, May 5
Wed, May 7
Mon, May 12
Wed, May 14
Mon, May 19
Wed, May 21	Paper discussion 3
Mon, May 26	Memorial Day, no class
Wed, May 28
Mon, Jun 2	(if needed) Final project presentation
Wed, Jun 4	Final project presentation