RL 2023: difference between revisions

Latest revision as of 19:19, 25 November 2023

Contents

1 Lecturers and Seminarists
2 About the course
3 Grading
4 Course materials
5 Homeworks
6 Recommended literature
7 Homeworks
8 Projects

Lecturers and Seminarists

Lecturer	Alexey Naumov	[anaumov@hse.ru]	T924
Seminarist	Ilya Levin	[tg: @levensons]	T926

About the course

This page contains materials for the 2023 course Mathematical Foundations of Reinforcement Learning, an elective for 2nd-year Master's students of the Math of Machine Learning program (HSE and Skoltech).

Grading

The final grade consists of 2 components (each a real number from 0 to 10, with no intermediate rounding):

O_HW for the homework
O_Exam for the exam

The formula for the final grade is

O_Final = 0.6*O_HW + 0.4*O_Exam

with the usual (arithmetical) rounding rule.
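
As an illustration of the formula above, here is a minimal Python sketch. It assumes that "usual (arithmetical) rounding" means rounding halves up to the nearest integer on the 10-point scale; the function name `final_grade` and the range check are hypothetical and not part of the course rules. Python's built-in `round` rounds halves to the nearest even number, so the sketch uses the `decimal` module instead.

```python
from decimal import Decimal, ROUND_HALF_UP

# Weights taken from the grading formula on this page.
HW_WEIGHT = Decimal("0.6")
EXAM_WEIGHT = Decimal("0.4")


def final_grade(o_hw: float, o_exam: float) -> int:
    """Return O_Final = 0.6*O_HW + 0.4*O_Exam, rounded half up.

    Assumption: the result is rounded to an integer, with .5 going up.
    """
    for name, value in (("O_HW", o_hw), ("O_Exam", o_exam)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must lie in [0, 10], got {value}")

    # Convert through str so that 4.25 becomes Decimal("4.25") exactly,
    # keeping the components unrounded until the final step.
    raw = HW_WEIGHT * Decimal(str(o_hw)) + EXAM_WEIGHT * Decimal(str(o_exam))
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(final_grade(8, 4.25))  # 0.6*8 + 0.4*4.25 = 6.5
```

Under this assumption a raw score of exactly 6.5 becomes 7, whereas `round(6.5)` in Python would give 6.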

Course materials

Homeworks

HW 1

Recommended literature

Sébastien Bubeck, Nicolò Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Chapter 2. http://sbubeck.com/SurveyBCB12.pdf
Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. Chapter 2. http://incompleteideas.net/book/the-book-2nd.html
Botao Hao et al. Bootstrapping Upper Confidence Bound. https://arxiv.org/abs/1906.05247
Aleksandrs Slivkins. Introduction to Multi-Armed Bandits. Chapter 1. https://arxiv.org/abs/1904.07272

@@ Line 10: / Line 10: @@
 == About the course ==
-This page contains materials for Mathematical Foundations of Reinforcement learning course in 2022/2023 year, optional one for 2nd year Master students of the Math of Machine Learning program (HSE and Skoltech).
+This page contains materials for Mathematical Foundations of Reinforcement learning course in 2023 year, optional one for 2nd year Master students of the Math of Machine Learning program (HSE and Skoltech).
 == Grading ==
 The final grade consists of 2 components (each is non-negative real number from 0 to 10, without any intermediate rounding) :
 * O<sub>HW</sub> for the hometasks
-* O<sub>Project</sub> for the course project
+* O<sub>Exam</sub> for the exam
 The formula for the final grade is
-* O<sub>Final</sub> = 0.6*O<sub>HW</sub> + 0.4*O<sub>Project</sub>
+* O<sub>Final</sub> = 0.6*O<sub>HW</sub> + 0.4*O<sub>Exam</sub>
 with the usual (arithmetical) rounding rule.
@@ Line 23: / Line 23: @@
 *[https://www.overleaf.com/read/kbzmvxdzbrxq '''Lectures and seminars notes''']
 *[https://colab.research.google.com/drive/10qBq7Ot_1ZpnTeD11P5AnE8jFVj0OLXl?usp=sharing '''Notebook for the first seminar''']
+== Homeworks ==
+*[https://disk.yandex.ru/i/C8hwvvS5us09sA '''HW 1''']
 == Recommended literature ==
