Dse 2023-24
Current version as of 16:12, 31 October 2023
General course info
16 lectures plus 16 classes
Fall grade = 0.2 Small HAs + 0.2 Group project + 0.3 Midterm + 0.3 Final
Each small HA consists of approximately 4 or 5 problems. The group project may be written by a group of 1-3 students. The midterm and final are offline and handwritten.
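The grading formula above can be sketched as a small function. The function name, the argument names and the sample scores are hypothetical, and the sketch assumes all four components are on the same scale with no rounding rule applied (the page does not state one).

```python
def fall_grade(small_has, group_project, midterm, final):
    """Weighted fall grade with the weights stated on this page.

    All four inputs are assumed to be on the same scale (for example 0-10).
    """
    return 0.2 * small_has + 0.2 * group_project + 0.3 * midterm + 0.3 * final


# Hypothetical scores, for illustration only.
print(fall_grade(small_has=8, group_project=9, midterm=6, final=7))
```

Because the weights sum to 1, the result stays on the same scale as the inputs.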
Lecturer: Boris Demeshev
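Class teachers: Yana Khassan (https://www.hse.ru/org/persons/190922066), Shuana Pirbudagova
Telegram group: https://t.me/+wieDS2ZjAvVjYTJi
Lectures: Monday, 18:10 - 19:30 Moscow time, Zoom: https://zoom.us/j/8126338383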
Classes:
- Thursday, 13:00 - 14:20 Moscow time, D208, Shuana
GitHub repository of the class: https://github.com/Shuaynat/DSE-23-24/tree/main
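Log Book or Tentative Plan
Lecture playlist: https://www.youtube.com/playlist?list=PLyjahhN4Wdd9rpesSCbxg5YdtF_cJ1SZC
Consultations playlist: https://www.youtube.com/playlist?list=PLyjahhN4Wdd99xaZOCZ99hNcLiM66QR9E
Week 1. 2023-09-04: Entropy, pdf: https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L01.pdf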
Guessing game, conditional entropy, joint entropy.
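As a minimal sketch of the quantities named above, the following computes entropy, joint entropy and conditional entropy for a small discrete distribution. The joint distribution and all function names are hypothetical and chosen only for illustration; conditional entropy is obtained from the chain rule H(Y|X) = H(X, Y) - H(X).

```python
from collections import defaultdict
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def marginal(joint, axis):
    """Marginal distribution of one coordinate of a joint dict {(x, y): p}."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)


def conditional_entropy(joint):
    """H(Y|X) via the chain rule: H(X, Y) - H(X)."""
    return entropy(joint.values()) - entropy(marginal(joint, 0).values())


# Hypothetical joint distribution of (weather, transport).
joint = {
    ("sun", "walk"): 0.4,
    ("sun", "bus"): 0.1,
    ("rain", "walk"): 0.1,
    ("rain", "bus"): 0.4,
}

h_xy = entropy(joint.values())                 # joint entropy H(X, Y)
h_x = entropy(marginal(joint, 0).values())     # marginal entropy H(X)
h_y_given_x = conditional_entropy(joint)       # conditional entropy H(Y|X)
print(h_x, h_xy, h_y_given_x)
```

In the guessing-game reading, H(Y|X) is the average number of yes/no questions still needed to pin down Y once X is known.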
Class: data manipulation, data visualization
More:
Christopher Olah, Visual Information Theory https://colah.github.io/posts/2015-09-Visual-Information/
Grant Sanderson, Solving Wordle using information theory https://www.youtube.com/watch?v=v68zYyaEmEA
Notes from a similar lecture at the Faculty of Computer Science: https://exuberant-arthropod-be8.notion.site/1-02-09-5e107ea1c4054594b8f37d955db8a2b0
Week 2: Kelly criterion, pdf: https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L02.pdf
How to calculate expected values using cross-entropy ideas, H(X) - H(X|Q) as long term interest rate.
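The link between entropy and a long-term growth rate can be illustrated on the simplest case, which is an assumption of this sketch and not necessarily the setting used in the lecture: a repeated even-money bet won with probability p, where 0.5 <= p < 1. The Kelly fraction is then f* = 2p - 1, and the resulting expected log2 growth per bet equals 1 - H(p), an entropy difference in the same spirit as H(X) - H(X|Q). All function names are hypothetical.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a coin that lands heads with probability p."""
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)


def growth_rate(f, p):
    """Expected log2 growth of wealth per even-money bet when staking fraction f.

    Assumes 0 <= f < 1 so that wealth stays positive after a loss.
    """
    return p * log2(1 + f) + (1 - p) * log2(1 - f)


def kelly_fraction(p):
    """Kelly stake for an even-money bet: f* = 2p - 1, or 0 when there is no edge."""
    return max(0.0, 2 * p - 1)


p = 0.6  # hypothetical win probability
f_star = kelly_fraction(p)
best = growth_rate(f_star, p)
print(f_star, best, 1 - binary_entropy(p))
```

Staking more or less than f* lowers the expected log growth, which is why the Kelly stake is described as growth-optimal.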
More:
Kelly criterion: https://en.wikipedia.org/wiki/Kelly_criterion
Class: group by, reshape and join
Week 3: Trees, pdf: https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L03.pdf
Class: Trees (regression + classification) + tree visualization
More:
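tree visualization by r2d3: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
accuracy, recall, ROC and all of that: https://en.wikipedia.org/wiki/Receiver_operating_characteristic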
Week 4: Random forest, pdf: https://github.com/Shuaynat/DSE-23-24/raw/main/03-lectures/Dse2023-L04.pdf
More:
bias-variance trade-off visualization for trees: http://www.r2d3.us/visual-intro-to-machine-learning-part-2/
Tim Hesterberg, What Teachers Should Know About the Bootstrap: https://arxiv.org/pdf/1411.5279.pdf Very well written text; for permutation tests see sections 2.1 and 7.
Class: Random forest, cross-validation in sklearn, feature importance
Week 5: Gradient boosting, data splitting strategies
Class: XGBoost vs LightGBM, dummy variables, categorical variables and CatBoost
Week 6: Naive bootstrap, t-stat bootstrap, permutation tests
Class: Hypothesis testing
More:
https://arch.readthedocs.io/en/latest/bootstrap/bootstrap.html
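Two of the Week 6 topics can be sketched with the standard library alone. The sketch assumes that "naive bootstrap" refers to the percentile interval, and it uses a two-sided permutation test for a difference in means; the data, function names and default parameters are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(sample, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        hits += diff >= observed
    # The +1 terms count the observed labelling itself as one permutation.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data, for illustration only.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]
group_a = [11, 12, 13, 14, 15]
group_b = [1, 2, 3, 4, 5]

ci = bootstrap_ci(sample)
p_value = permutation_pvalue(group_a, group_b)
print(ci, p_value)
```

The t-stat bootstrap from the same lecture would resample a studentized statistic instead of the raw mean; it is not shown here.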
Week 7: Matrices in regression
Class: (by hand) Differential in matrix form, derivation of formulas for beta.
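A short sketch of the derivation mentioned above, assuming the standard linear model $y = X\beta + \varepsilon$ with $X$ of full column rank. The notation ($y$, $X$, $\beta$, $\mathrm{RSS}$) is an assumption of this sketch and may differ from the notation used in class.

```latex
\begin{align*}
\mathrm{RSS}(\beta) &= (y - X\beta)^\top (y - X\beta), \\
d\,\mathrm{RSS} &= -(X\,d\beta)^\top (y - X\beta) - (y - X\beta)^\top X\,d\beta
                 = -2\,(y - X\beta)^\top X\,d\beta, \\
d\,\mathrm{RSS} = 0 \ \text{for all } d\beta
  &\iff X^\top (y - X\hat\beta) = 0
  \iff X^\top X\,\hat\beta = X^\top y, \\
\hat\beta &= (X^\top X)^{-1} X^\top y .
\end{align*}
```

The second line uses the fact that a scalar equals its own transpose, so the two terms of the differential coincide.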
Here will be the midterm!
Week 8: SVD = PCA
Class: (by hand) Covariance matrices
Week 9: James-Stein paradox
Class: Matrices in numpy, PCA in sklearn, SVD
Week 10: L1, L2 regularization
Class: Regression in sklearn, different types of regularization
Week 11: Logistic regression + L1/L2
Class: Logistic regression (sklearn/statsmodels) + L1/L2
Week 12: Hierarchical clustering + k-means
Class: Hierarchical clustering + k-means
Week 13: ETS (Exponential Smoothing)
Class: Plotting time series, ETS (sktime)
More:
https://www.sktime.net/en/stable/examples/01_forecasting.html
Week 14: Bayesian approach
Class: TS forecasting with gradient boosting
Week 15: Mention of MCMC + DLT
Class: DLT in python
More:
MCMC visualization: https://chi-feng.github.io/mcmc-demo/app.html?algorithm=SVGD&target=banana&delay=0
https://www.uber.com/blog/orbit/
Week 16: QA
Class: QA
Here will be the final!