RL 2023: differences between revisions
From Wiki - Faculty of Computer Science
Ivlevin (talk | contribs) |
(4 intermediate revisions by the same user not shown) | |||
Line 5: | Line 5: | ||
|| Lecturer || [https://www.hse.ru/staff/anaumov Alexey Naumov] || [anaumov@hse.ru] || T924 | || Lecturer || [https://www.hse.ru/staff/anaumov Alexey Naumov] || [anaumov@hse.ru] || T924 | ||
|- | |- | ||
− | || Seminarist || | + | || Seminarist || Ilya Levin || [tg: @levensons] || T926 |
|- | |- | ||
|} | |} | ||
== About the course == | == About the course == | ||
− | This page contains materials for Mathematical Foundations of Reinforcement learning course in | + | This page contains materials for Mathematical Foundations of Reinforcement learning course in 2023 year, optional one for 2nd year Master students of the Math of Machine Learning program (HSE and Skoltech). |
== Grading == | == Grading == | ||
The final grade consists of 2 components (each is non-negative real number from 0 to 10, without any intermediate rounding) : | The final grade consists of 2 components (each is non-negative real number from 0 to 10, without any intermediate rounding) : | ||
* O<sub>HW</sub> for the hometasks | * O<sub>HW</sub> for the hometasks | ||
− | * O<sub> | + | * O<sub>Exam</sub> for the exam |
The formula for the final grade is | The formula for the final grade is | ||
− | * O<sub>Final</sub> = 0.6*O<sub>HW</sub> + 0.4*O<sub> | + | * O<sub>Final</sub> = 0.6*O<sub>HW</sub> + 0.4*O<sub>Exam</sub> |
with the usual (arithmetical) rounding rule. | with the usual (arithmetical) rounding rule. | ||
Line 23: | Line 23: | ||
*[https://www.overleaf.com/read/kbzmvxdzbrxq '''Lectures and seminars notes'''] | *[https://www.overleaf.com/read/kbzmvxdzbrxq '''Lectures and seminars notes'''] | ||
*[https://colab.research.google.com/drive/10qBq7Ot_1ZpnTeD11P5AnE8jFVj0OLXl?usp=sharing '''Notebook for the first seminar'''] | *[https://colab.research.google.com/drive/10qBq7Ot_1ZpnTeD11P5AnE8jFVj0OLXl?usp=sharing '''Notebook for the first seminar'''] | ||
+ | |||
+ | == Homeworks == | ||
+ | *[https://disk.yandex.ru/i/C8hwvvS5us09sA '''HW 1'''] | ||
+ | |||
== Recommended literature == | == Recommended literature == |
Current revision as of 19:19, 25 November 2023
Lecturers and Seminarists
Lecturer | Alexey Naumov | [anaumov@hse.ru] | T924 |
Seminarist | Ilya Levin | [tg: @levensons] | T926 |
About the course
This page contains materials for the 2023 Mathematical Foundations of Reinforcement Learning course, an elective for 2nd-year Master's students of the Math of Machine Learning program (HSE and Skoltech).
Grading
The final grade consists of 2 components (each is a real number from 0 to 10, with no intermediate rounding):
- O_HW for the homework assignments
- O_Exam for the exam
The formula for the final grade is
- O_Final = 0.6*O_HW + 0.4*O_Exam
with the usual (arithmetic) rounding rule.
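As a worked illustration of the grading formula, here is a minimal Python sketch. It assumes that the "usual (arithmetical) rounding rule" means rounding half up to the nearest integer; the page does not spell this out. The function name `final_grade` and its argument names are hypothetical and not part of the course materials.

```python
from decimal import Decimal, ROUND_HALF_UP


def final_grade(o_hw: float, o_exam: float) -> int:
    """Combine the homework and exam components into the final grade.

    Assumption: "arithmetical rounding" is read as round-half-up to an integer.
    """
    for name, value in (("o_hw", o_hw), ("o_exam", o_exam)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must lie in [0, 10], got {value}")
    # Decimal(str(...)) avoids binary floating-point artifacts near x.5 values.
    raw = Decimal("0.6") * Decimal(str(o_hw)) + Decimal("0.4") * Decimal(str(o_exam))
    # The components are not rounded beforehand; only the weighted sum is rounded.
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(final_grade(8, 7))      # 0.6*8 + 0.4*7 = 7.6
print(final_grade(6.5, 6.5))  # weighted sum is exactly 6.5
```

The sketch uses `decimal` rather than the built-in `round`, because `round` rounds halves to the nearest even integer, which would differ from round-half-up on sums such as 6.5.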
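Course materials
- Lecture and seminar notes: https://www.overleaf.com/read/kbzmvxdzbrxq
- Notebook for the first seminar: https://colab.research.google.com/drive/10qBq7Ot_1ZpnTeD11P5AnE8jFVj0OLXl?usp=sharing
Homeworks
- HW 1: https://disk.yandex.ru/i/C8hwvvS5us09sA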
Recommended literature
- Sébastien Bubeck, Nicolò Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Chapter 2. http://sbubeck.com/SurveyBCB12.pdf
- Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. Chapter 2. http://incompleteideas.net/book/the-book-2nd.html
- Botao Hao et al. Bootstrapping Upper Confidence Bound. https://arxiv.org/abs/1906.05247
- Aleksandrs Slivkins. Introduction to Multi-Armed Bandits. Chapter 1. https://arxiv.org/abs/1904.07272