Reinforcement learning 2021 2022: Difference between revisions

From Wiki - Faculty of Computer Science
Line 35:
  == Projects ==
- == Recommended literature (1st term) ==
+ == Recommended literature ==
- *http://www.statslab.cam.ac.uk/~james/Markov/ - Cambridge lecture notes on discrete-time Markov chains
+ '''Lecture and seminar 09.11'''
- *https://link.springer.com/book/10.1007%2F978-3-319-97704-1 - book by E. Moulines et al.; chapters 1, 2, 7 and 9 are the most relevant (the book can be downloaded through the HSE network)
+ * Sebastien Bubeck, Nicolo Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Chapter 2. http://sbubeck.com/SurveyBCB12.pdf
- *https://link.springer.com/book/10.1007%2F978-3-319-62226-2 - Stochastic Calculus by P. Baldi, a good overview of conditional probabilities and expectations (part 4, also accessible through the HSE network)
+ * Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. Chapter 2. http://incompleteideas.net/book/the-book-2nd.html
- *https://link.springer.com/book/10.1007%2F978-1-4419-9634-3 - Probability for Statistics and Machine Learning by A. Dasgupta, chapter 19 (MCMC), also accessible through the HSE network

Revision as of 23:19, 9 November 2021

Lecturers and Seminarists

Lecturer Alexey Naumov [anaumov@hse.ru] T924
Lecturer Denis Belomestny [dbelomestny@hse.ru] T924
Seminarist Sergey Samsonov [svsamsonov@hse.ru] T926
Seminarist Maxim Kaledin [mkaledin@hse.ru] T926

About the course

This page contains materials for the Mathematical Foundations of Reinforcement Learning course in the 2021/2022 academic year, an elective for 2nd-year Master's students of the Math of Machine Learning program (HSE and Skoltech).

Grading

The final grade consists of two components (each a real number from 0 to 10, with no intermediate rounding):

  • OHW for the homework assignments
  • OProject for the course project

The formula for the final grade is

  • OFinal = 0.5*OHW + 0.5*OProject

with the usual (arithmetic) rounding rule applied only to the final result.
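
The grading formula above can be checked with a short Python sketch. The function name `final_grade` and its parameters `o_hw` and `o_project` are hypothetical, and the sketch assumes that "arithmetic rounding" means rounding halves up (7.5 becomes 8); the page itself does not spell this out. Python's built-in `round` rounds halves to the nearest even integer, so the `decimal` module is used instead.

```python
from decimal import Decimal, ROUND_HALF_UP


def final_grade(o_hw: float, o_project: float) -> int:
    """Compute OFinal = 0.5*OHW + 0.5*OProject, rounded once at the end.

    Assumption: "arithmetic rounding" is read as round-half-up.
    """
    for name, value in (("o_hw", o_hw), ("o_project", o_project)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must lie in [0, 10], got {value}")

    # Convert through str() so that inputs like 6.4 are treated as exact
    # decimals rather than as binary floating-point approximations.
    half = Decimal("0.5")
    raw = half * Decimal(str(o_hw)) + half * Decimal(str(o_project))

    # No intermediate rounding: only the weighted sum is rounded.
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(final_grade(7, 8))      # weighted sum 7.5
    print(final_grade(6.4, 8.5))  # weighted sum 7.45
```

Under the round-half-up assumption, a weighted sum of exactly 7.5 yields 8, while 7.45 yields 7, because the components are never rounded before they are combined.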

Table with grades

Lectures

Seminars

Homeworks

Projects

Recommended literature

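Lecture and seminar 09.11

  • Sebastien Bubeck, Nicolo Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Chapter 2. http://sbubeck.com/SurveyBCB12.pdf
  • Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction. Chapter 2. http://incompleteideas.net/book/the-book-2nd.html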