Dse 2023-24: differences between revisions

Current revision as of 16:12, 31 October 2023

General course info

16 lectures plus 16 classes

Fall grade = 0.2 Small HAs + 0.2 Group project + 0.3 Midterm + 0.3 Final

Each small HA consists of approximately 4 or 5 problems. The group project may be written by a group of 1-3 students. The midterm and final are offline and handwritten.

Lecturer: Boris Demeshev

Class teachers: Yana Khassan, Shuana Pirbudagova

Telegram group: https://t.me/+wieDS2ZjAvVjYTJi

Lectures: Monday, 18:10 - 19:30 Moscow time, zoom

Classes:

Thursday, 13:00 - 14:20 Moscow time, D208, Shuana

GitHub repository of the class: https://github.com/Shuaynat/DSE-23-24/tree/main

Log Book or Tentative Plan

🔗Lecture playlist

🔗Consultations playlist

Week 1. 2023-09-04: Entropy, pdf

Guessing game, conditional entropy, joint entropy.

Class: data manipulation, data visualization

More:

Christopher Olah, Visual Information Theory https://colah.github.io/posts/2015-09-Visual-Information/

Grant Sanderson, Solving Wordle using information theory https://www.youtube.com/watch?v=v68zYyaEmEA

Notes from a similar lecture at the Faculty of Computer Science (in Russian): https://exuberant-arthropod-be8.notion.site/1-02-09-5e107ea1c4054594b8f37d955db8a2b0
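
As a small illustration of the Week 1 quantities, the sketch below computes the joint entropy H(X, Y), the marginal entropy H(X), and the conditional entropy H(Y|X) = H(X, Y) - H(X), all in bits. The joint distribution `joint` and the helper names are made up for this example and are not taken from the lecture notes.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def marginal_x(joint):
    """Sum the joint probabilities over y to get P(X = x)."""
    out = {}
    for (x, _y), p in joint.items():
        out[x] = out.get(x, 0.0) + p
    return out


# Hypothetical joint distribution P(X = x, Y = y), chosen only for illustration.
joint = {
    ("a", 0): 0.25, ("a", 1): 0.25,
    ("b", 0): 0.50, ("b", 1): 0.00,
}

h_xy = entropy(joint.values())              # joint entropy H(X, Y)
h_x = entropy(marginal_x(joint).values())   # marginal entropy H(X)
h_y_given_x = h_xy - h_x                    # chain rule: H(Y|X) = H(X, Y) - H(X)

print(h_xy, h_x, h_y_given_x)
```

For this toy distribution Y is a fair coin when X = a and is fixed when X = b, so on average half a bit of uncertainty about Y remains once X is known.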

Week 2.: Kelly criterion, pdf

How to calculate expected values using cross-entropy ideas; H(X) - H(X|Q) as the long-term interest rate.

More:

Kelly criterion

Class: group by, reshape and join
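
The link between information and the long-term interest rate can be checked numerically in the simplest setting. The sketch below assumes (this setup is an illustration, not taken from the lecture pdf) that X is a fair coin, Q is a signal that names the outcome correctly with probability p, and bets pay even money. Then H(X) = 1 bit and H(X|Q) is the binary entropy of p, and the best achievable growth rate per bet should equal H(X) - H(X|Q).

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def growth_rate(f, p):
    """Expected log2 growth of wealth per even-money bet when staking fraction f."""
    return p * log2(1 + f) + (1 - p) * log2(1 - f)


p = 0.75                                   # assumed accuracy of the signal Q
grid = [i / 1000 for i in range(1000)]     # candidate fractions 0.000 .. 0.999
best_f = max(grid, key=lambda f: growth_rate(f, p))
best_growth = growth_rate(best_f, p)
info_gain = 1.0 - binary_entropy(p)        # H(X) - H(X|Q) under the assumptions above

print(best_f, best_growth, info_gain)
```

The grid search recovers the Kelly fraction 2p - 1 = 0.5, and the growth rate at that fraction matches the information gain.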

Week 3.: Trees, pdf

Class: Trees (regression + classification) + tree visualization

More:

tree visualization by r2d3

accuracy, recall, roc and all of that

Week 4.: Random forest, pdf

More:

bias-variance trade-off visualization for trees

Tim Hesterberg, What Teachers Should Know about the Bootstrap. A very well written text; for permutation tests see sections 2.1 and 7.

Class: Random forest, cross-validation in sklearn, feature importance

Week 5.: Gradient boosting, Data splitting strategies

Class: XGBoost vs LightGBM, dummy variables, categorical variables and CatBoost

Week 6.: Naive bootstrap, t-stat bootstrap, permutation tests

Class: Hypothesis testing

More:

https://arch.readthedocs.io/en/latest/bootstrap/bootstrap.html
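
A minimal sketch of two of the Week 6 ideas, using only the standard library rather than the `arch` package linked above: a two-sample permutation test for a difference in means and a naive (percentile) bootstrap interval for the same difference. The samples `x` and `y`, the function names, and the default number of resamples are all assumptions made for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = mean(pooled[:len(x)]) - mean(pooled[len(x):])
        if abs(diff) >= observed - 1e-12:
            hits += 1
    # Add one to both counts so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def bootstrap_ci(x, y, n_boot=5000, level=0.95, seed=0):
    """Naive percentile bootstrap interval for mean(x) - mean(y)."""
    rng = random.Random(seed)
    diffs = sorted(
        mean(rng.choices(x, k=len(x))) - mean(rng.choices(y, k=len(y)))
        for _ in range(n_boot)
    )
    lower = diffs[int((1 - level) / 2 * n_boot)]
    upper = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical samples, chosen so that the two groups are clearly separated.
x = [12.1, 11.8, 12.5, 12.9, 12.3, 12.7]
y = [10.2, 10.9, 10.4, 11.0, 10.6, 10.1]

p_value = permutation_p_value(x, y)
ci_low, ci_high = bootstrap_ci(x, y)
print(p_value, (ci_low, ci_high))
```

The t-stat bootstrap mentioned in the lecture title resamples a studentized statistic instead of the raw difference and is not shown here.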

Week 7.: Matrices in regression

Class: (by hand) Differential in matrix form, derivation of formulas for beta.

Here will be ~~dragons~~ midterm!

Week 8.: SVD = PCA

Class: (by hand) Covariance matrices

Week 9.: James-Stein paradox

Class: Matrices in numpy, PCA in sklearn, SVD
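
The James-Stein paradox can be seen in a short simulation. The sketch below assumes a single observation X ~ N(theta, I) in d = 10 dimensions and compares the average squared error of the plain estimate X with that of the James-Stein estimate (1 - (d - 2) / ||X||^2) X. The true vector `theta`, the number of replications, and the function name are arbitrary choices for this example.

```python
import random


def simulate_risks(theta, n_rep=2000, seed=0):
    """Monte Carlo estimates of the squared-error risk of X and of James-Stein."""
    rng = random.Random(seed)
    d = len(theta)
    mle_loss = 0.0
    js_loss = 0.0
    for _ in range(n_rep):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]   # one draw of X ~ N(theta, I)
        norm_sq = sum(v * v for v in x)
        shrink = 1.0 - (d - 2) / norm_sq               # shrinkage towards the origin
        js = [shrink * v for v in x]
        mle_loss += sum((v - t) ** 2 for v, t in zip(x, theta))
        js_loss += sum((v - t) ** 2 for v, t in zip(js, theta))
    return mle_loss / n_rep, js_loss / n_rep


theta = [1.0] * 10   # hypothetical true mean vector
mle_risk, js_risk = simulate_risks(theta)
print(mle_risk, js_risk)
```

The risk of X itself is d for any theta, so the first number should be close to 10, while the shrunken estimate has a smaller average loss.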

Week 10.: L1, L2 regularization

Class: Regression in sklearn, different types of regularization

Week 11.: Logistic regression + L1/L2

Class: Logistic regression (sklearn/statsmodels) + L1/L2

Week 12.: Hierarchical clustering + k-means

Class: Hierarchical clustering + k-means

Week 13.: ETS (Exponential Smoothing)

Class: Plotting time series, ETS (sktime)

More:

https://www.sktime.net/en/stable/examples/01_forecasting.html
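
The simplest member of the ETS family, simple exponential smoothing, fits in a few lines. The sketch below is a hand-rolled illustration and not the sktime API: the smoothing parameter `alpha` is fixed instead of estimated, and the level is initialised with the first observation, which is only one of several common choices.

```python
def simple_exp_smoothing(y, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    level = y[0]  # assumed initialisation: start from the first observation
    for obs in y[1:]:
        # New level is a weighted average of the new observation and the old level.
        level = alpha * obs + (1 - alpha) * level
    return level


series = [10.0, 12.0, 11.0, 13.0]   # hypothetical toy series
forecast = simple_exp_smoothing(series, alpha=0.5)
print(forecast)
```

With these numbers the level moves 10 → 11 → 11 → 12, so the forecast for the next period is 12.0.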

Week 14.: Bayesian approach

Class: Time series forecasting with gradient boosting

Week 15.: Mention of MCMC + DLT

Class: DLT in python

More:

MCMC visualization: https://chi-feng.github.io/mcmc-demo/app.html?algorithm=SVGD&target=banana&delay=0

https://www.uber.com/blog/orbit/

Week 16.: QA

Class: QA

Here will be ~~dragons~~ final!

@@ Line 2: / Line 2: @@
 lectures plus 16 classes
-* Boring official web page
 Fall grade = 0.2 Small HAs + 0.2 Group project + 0.3 Midterm + 0.3 Final
-Each small HA consists of approximately 4 or 5 problems.
+Each small HA consists of approximately 4 or 5 problems. Group project may be written by a group of 1-3 students. Midterm and Final are offline and handwritten.
 Lecturer: Boris Demeshev
-Class teachers: Yana Khassan, Shuana Pirbudagova
+Class teachers: [https://www.hse.ru/org/persons/190922066 Yana Khassan], Shuana Pirbudagova
+[https://t.me/+wieDS2ZjAvVjYTJi tg group],
+Lectures: Monday, 18:10 - 19:30 Moscow time, [https://zoom.us/j/8126338383 zoom]
+Classes:
+* Thursday, 13:00 - 14:20 Moscow time, D208, Shuana
+Github repository of the class: [https://github.com/Shuaynat/DSE-23-24/tree/main]
 ==Log Book or Tentative Plan ==
+[https://www.youtube.com/playlist?list=PLyjahhN4Wdd9rpesSCbxg5YdtF_cJ1SZC 🔗Lecture playlist]
-'''Week 1. 2023-09-04''' Entropy
+[https://www.youtube.com/playlist?list=PLyjahhN4Wdd99xaZOCZ99hNcLiM66QR9E 🔗Consultations playlist]
+'''Week 1. 2023-09-04''': Entropy, [https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L01.pdf pdf]
 Guessing game, conditional entropy, joint entropy.
@@ Line 24: / Line 34: @@
 More:
-(rus) https://exuberant-arthropod-be8.notion.site/1-02-09-5e107ea1c4054594b8f37d955db8a2b0
+Christopher Olah, Visual Information Theory
+https://colah.github.io/posts/2015-09-Visual-Information/
-Week 2. Kelly criterion
+Grant Sanderson, Solving Wordle using information theory
+https://www.youtube.com/watch?v=v68zYyaEmEA
+notes from a similar lecture at the Faculty of Computer Science (in Russian):
+https://exuberant-arthropod-be8.notion.site/1-02-09-5e107ea1c4054594b8f37d955db8a2b0
+'''Week 2.''': Kelly criterion, [https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L02.pdf pdf]
+How to calculate expected values using cross-entropy ideas, H(X) - H(X|Q) as long term interest rate.
+More:
+[https://en.wikipedia.org/wiki/Kelly_criterion Kelly criterion]
 Class: group by, reshape and join
-Week 3. Trees
+'''Week 3.''': Trees, [https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L03.pdf pdf]
 Class: Trees (regression + classification) + tree visualization
-Week 4. Random forest + Data splitting strategies
+More:
+[http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ tree visualization by r2d3]
+[https://en.wikipedia.org/wiki/Receiver_operating_characteristic accuracy, recall, roc and all of that]
+'''Week 4.''' Random forest [https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L04.pdf pdf]
+More:
+[http://www.r2d3.us/visual-intro-to-machine-learning-part-2/ bias-variance trade-off visualization for trees]
+[https://arxiv.org/pdf/1411.5279.pdf Tim Hesterberg, What teachers should know about bootstrap?] Very well written text, for permutation test see sections 2.1 and 7.
 Class: Random forest, cross-validation in sklearn, feature importance,
-Week 5. Gradient boosting
+'''Week 5.''': Gradient boosting, Data splitting strategies
 Class: XGBoost vs LightGBM, Dummy variables, categorical variables and Catboost
-Week 6. Naive bootstrap, t-stat bootstrap, permutation tests
+'''Week 6.''': Naive bootstrap, t-stat bootstrap, permutation tests
 Class: Hypothesis testing
@@ Line 50: / Line 85: @@
 https://arch.readthedocs.io/en/latest/bootstrap/bootstrap.html
-Week 7. Matrices in regression
+'''Week 7.''': Matrices in regression
 Class: (by hand) Differential in matrix form, derivation of formulas for beta.
-Here will be ~~dragons~~ midterm!
+'''Here will be <del>dragons</del> midterm!'''
-Week 8. SVD = PCA
+'''Week 8.''': SVD = PCA
 Class: (by hand) Covariance matrices,
-Week 9. James Stein paradox
+'''Week 9.''': James Stein paradox
 Class: Matrices in numpy, PCA in sklearn, SVD
-Week 10. L1, L2 regularization
+'''Week 10.''': L1, L2 regularization
 Class: Regression in sklearn, different type of regularisation
-Week 11. Log regression + L1/L2
+'''Week 11.''': Log regression + L1/L2
 Class: Log regression (sklearn/statsmodels) + L1/L2
-Week 12. Hierarchical clustering + k-means
+'''Week 12.''': Hierarchical clustering + k-means
 Class: Hierarchical clustering + k-means
-Week 13. ETS (Exponential Smoothing)
+'''Week 13.''': ETS (Exponential Smoothing)
 Class: Plotting time series, ETS (sktime)
@@ Line 86: / Line 121: @@
 https://www.sktime.net/en/stable/examples/01_forecasting.html
-Week 14. Bayesian approach
+'''Week 14.''': Bayesian approach
 Class: TS forecasting with grad boosting
-Week 15. Mention of MCMC + DLT
+'''Week 15.''': Mention of MCMC + DLT
 Class: DLT in python
@@ Line 100: / Line 135: @@
 https://www.uber.com/blog/orbit/
-Week 16.QA
+'''Week 16.''': QA
 Class: QA
-Here will be ~~dragons~~ final!
+Here will be <del>dragons</del> final!
