Statistical learning theory 2025

General Information

Lectures: on Tuesdays, 13h00 -- 14h20, in room S834 and in Zoom, by Bruno Bauwens.

Seminars: on Tuesdays, 14h20 -- 16h00, online in Zoom, by Nikita Lukianenko.

Please join the Telegram group. The course is similar to last year's.


Homeworks

Deadlines are every 2 weeks, before the lecture. The tasks are at the end of each problem list. (Problem lists will be updated; check the year.)

Before the 3rd lecture, submit homework from problem lists 1 and 2; before the 5th lecture, from lists 3 and 4; and so on.

Use --this link-- to submit homeworks. You may submit in English or Russian, as LaTeX or as pictures. Results are here.

Late policy: 1 homework can be submitted at most 24 hours late without explanation.

Course materials

{| class="wikitable"
! Video !! Summary !! Slides !! Lecture notes !! Problem list !! Solutions
|-
| colspan="6" | ''Part 1. Online learning''
|-
| 16 Sep || Philosophy. The online mistake bound model. The halving and weighted majority algorithms. || sl01 || ch00 ch01 || prob01 || [https://www.dropbox.com/scl/fi/kswtqmyxw3pv336g1vdd6/01sol.pdf?rlkey=bpwnrcsj6ru3nbo4xwq2lp6g0&st=hftnu87m&dl=0 sol01]
|-
| [https://www.youtube.com/watch?v=gQm1G3Ep-5s 23 Sep] || The standard optimal algorithm. The perceptron algorithm. || [https://www.dropbox.com/s/sy959ee81mov5cr/02slides.pdf?dl=0 sl02] || ch02 ch03 || prob02 ||
|-
| [https://www.youtube.com/watch?v=Fk1-QI9PRAI 30 Sep] || Kernel perceptron algorithm. Prediction with expert advice. Recap of probability theory (seminar). || [https://www.dropbox.com/s/a60p9b76cxusgqy/03slides.pdf?dl=0 sl03] || ch04 ch05 || prob03 ||
|-
| colspan="6" | ''Part 2. Distribution independent risk bounds''
|-
| [https://www.youtube.com/watch?v=ycfYXvmKF0I 07 Oct] || Necessity of a hypothesis class. Sample complexity in the realizable setting, examples: threshold functions and finite classes. || [https://www.dropbox.com/s/pi0f3wab1xna6d7/04slides.pdf?dl=0 sl04] || ch06 || prob04 ||
|-
| [https://www.youtube.com/watch?v=8J5B9CCy-ws 14 Oct] || Growth functions, VC-dimension, and the characterization of sample complexity with VC-dimensions. || [https://www.dropbox.com/s/rpnh6288rdb3j8m/05slides.pdf?dl=0 sl05] || ch07 ch08 || prob05 ||
|-
| [https://www.youtube.com/watch?v=zHau8Br_UFQ 21 Oct] || Risk decomposition and the fundamental theorem of statistical learning theory (the previous [https://www.youtube.com/watch?v=zHau8Br_UFQ recording] covers more). || [https://www.dropbox.com/s/0p8r5wgjy1hlku2/06slides.pdf?dl=0 sl06] || ch09 || prob06 ||
|-
| [https://youtube.com/live/G5fglRAaXMo 04 Nov] || Bounded differences inequality, Rademacher complexity, symmetrization, contraction lemma. || [https://www.dropbox.com/s/kfithyq0dgcq6h8/07slides.pdf?dl=0 sl07] || ch10 ch11 || prob07 ||
|-
| colspan="6" | ''Part 3. Margin risk bounds with applications''
|-
| [https://www.youtube.com/watch?v=oU2AzubDXeo 11 Nov] || Simple regression, support vector machines, margin risk bounds, and neural nets with dropout regularization. || [https://www.dropbox.com/s/oo1qny9busp3axn/08slides.pdf?dl=0 sl08] || ch12 ch13 || prob08 ||
|-
| [https://youtube.com/live/77-rZFzX2O8 18 Nov] || Kernels: RKHS, representer theorem, risk bounds. || [https://www.dropbox.com/s/jst60ww8ev4ypie/09slides.pdf?dl=0 sl09] || ch14 || prob09 ||
|-
| [https://www.youtube.com/watch?v=OgiaWrWh_WA 25 Nov] || AdaBoost and the margin hypothesis. || [https://www.dropbox.com/s/umum3kd9439dt42/10slides.pdf?dl=0 sl10] || ch15 || prob10 ||
|-
| [https://youtube.com/live/DUgksR6gOQ8 02 Dec] || Losses of neural nets are not locally convex. Gradient descent with stable gradients. ([https://www.youtube.com/watch?v=ygVHVW3y3wM Old recording] about Hessians.) || || ch16 || prob11 ||
|-
| [https://youtube.com/live/URjcCXEMPv4 09 Dec] || Lazy training and the neural tangent kernel. || || ch17 || ||
|-
| 16 Dec || Colloquium. [https://www.dropbox.com/scl/fi/e2692ns95pg0kj0m4e0wo/colloqQuest.pdf?rlkey=peey4u0dxz0vohv39a3oc67ft&st=c87t9kqu&dl=0 Rules and questions] from the previous year. || || || ||
|}


The lectures in October and November are based on the book: Foundations of Machine Learning, 2nd edition, by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar, 2018.

A gentle introduction to the material of the first 3 lectures, and an overview of probability theory, can be found in chapters 1-6 and 11-12 of the following book: Sanjeev Kulkarni and Gilbert Harman, An Elementary Introduction to Statistical Learning Theory, 2012.

Grading formula

Final grade = 0.35 * [homework score] + 0.35 * [colloquium score] + 0.3 * [exam score] + bonus from quizzes.

All homework questions have the same weight. Each solved extra homework task increases the exam score by 1 point. At the end of each lecture there is a short quiz in which you may earn 0.1 bonus points on the final (non-rounded) grade.

There is no rounding, except when transforming the final grade into the official grade; arithmetic rounding is then used.

Autogrades: if you need only 6/10 on the exam to reach the maximal 10/10 for the course, this grade will be given automatically. This may happen because of extra homework questions and bonuses from quizzes.
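As an illustration of how these rules combine, here is a minimal, unofficial sketch in Python. The function name, the 0-10 scales, and the cap on the exam score are assumptions made only for this example, not part of the official rules.

<syntaxhighlight lang="python">
import math

def final_grade(homework, colloquium, exam, extra_hw_tasks=0, quiz_bonus=0.0):
    """Unofficial sketch of the grading formula above; all scores assumed on a 0-10 scale."""
    # Each solved extra homework task adds 1 point to the exam score (capping at 10 is an assumption).
    exam_score = min(10.0, exam + extra_hw_tasks)
    raw = 0.35 * homework + 0.35 * colloquium + 0.3 * exam_score + quiz_bonus
    # Arithmetic rounding (half rounds up) to obtain the official grade, capped at 10.
    official = math.floor(min(10.0, raw) + 0.5)
    return raw, official

# Example of the "autograde" situation: with full homework and colloquium scores,
# two extra tasks and a 0.3 quiz bonus, 6/10 on the exam already rounds to 10/10.
print(final_grade(homework=10, colloquium=10, exam=6, extra_hw_tasks=2, quiz_bonus=0.3))
# -> (9.7..., 10)
</syntaxhighlight>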

Colloquium

Rules and questions from last year.

Date: TBA

Problems exam

Date: TBA
-- You may use handwritten notes, lecture materials from this wiki (either printed or on your PC), and Mohri's book.
-- You may not search the internet or interact with other humans (e.g., by phone, forums, etc.).

About the questions
-- 4 questions, of the same difficulty as the homework. (Many homework questions were taken from former exams.)
-- I always ask you to calculate a VC-dimension and to state/prove some risk bound using Rademacher complexity.


Office hours

Bruno Bauwens: TBA. It is better to send me an email in advance.

Nikita Lukianenko: Write in Telegram; the time is flexible.