RL 2023
From Wiki - Faculty of Computer Science
Revision as of 12:35, 13 November 2023
Lecturers and Seminarists
Lecturer | Alexey Naumov | [anaumov@hse.ru] | T924 |
Seminarist | Ilya Levin | [tg: @levensons] | T926 |
About the course
This page contains materials for the Mathematical Foundations of Reinforcement Learning course in the 2022/2023 academic year. It is an optional course for 2nd-year Master's students of the Math of Machine Learning program (HSE and Skoltech).
Grading
The final grade consists of two components, each a real number from 0 to 10 with no intermediate rounding:
- OHW for the homework assignments
- OProject for the course project
The formula for the final grade is
- OFinal = 0.6*OHW + 0.4*OProject
with the usual (arithmetical) rounding rule.
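As an illustration of the formula above, here is a minimal Python sketch. It assumes that "usual (arithmetical) rounding" means rounding to the nearest integer with halves rounded up, which the page does not state explicitly. The function name `final_grade` is hypothetical and not part of the course materials.

```python
from decimal import Decimal, ROUND_HALF_UP


def final_grade(o_hw: float, o_project: float) -> int:
    """Compute OFinal = 0.6*OHW + 0.4*OProject, rounded once at the end.

    Assumption: "arithmetical rounding" is round-half-up to an integer.
    """
    for name, value in (("OHW", o_hw), ("OProject", o_project)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must be in [0, 10], got {value}")

    # Decimal avoids binary floating-point artifacts near .5 boundaries.
    raw = (Decimal("0.6") * Decimal(str(o_hw))
           + Decimal("0.4") * Decimal(str(o_project)))

    # Python's built-in round() rounds halves to the nearest even integer,
    # so ROUND_HALF_UP is requested explicitly.
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(final_grade(7, 8.5))    # raw score 7.6
    print(final_grade(4.5, 4.5))  # raw score 4.5, a half-way case
```

The components enter the formula unrounded, and rounding is applied only to the weighted sum, matching the "without any intermediate rounding" requirement.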
Course materials
Recommended literature
- Sébastien Bubeck, Nicolò Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Chapter 2. http://sbubeck.com/SurveyBCB12.pdf
- Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. Chapter 2. http://incompleteideas.net/book/the-book-2nd.html
- Botao Hao et al. Bootstrapping Upper Confidence Bound. https://arxiv.org/abs/1906.05247
- Aleksandrs Slivkins. Introduction to Multi-Armed Bandits. Chapter 1. https://arxiv.org/abs/1904.07272