Data analysis (Software Engineering) 2019: difference between revisions

From Wiki - Faculty of Computer Science
(Final Exam)
 
(48 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
'''[https://docs.google.com/spreadsheets/d/1qKJtHeqXeTrDMlzxWXORiaTbUBM1QMd2DJjpA1eshm8/edit?usp=sharing Scores]''' <br />
 
'''[https://docs.google.com/spreadsheets/d/1qKJtHeqXeTrDMlzxWXORiaTbUBM1QMd2DJjpA1eshm8/edit?usp=sharing Scores]''' <br />
'''[https://join.slack.com/t/hse-se-ml/shared_invite/enQtNTMxODU4MTMwNjQyLTFlM2FkMmViNGI2YTg2ZmRjMjU5ZTEyMmFlYmU2NGVjN2U2YTAzNjAwZjhlYmUwNGFjNjJmYmY5MWVjNTNmZmY Slack Invite Link]  <br />
+
'''[https://join.slack.com/t/hse-se-ml/shared_invite/enQtNTc4NzUzODIwMjI0LTlhYTQxYmQxZmI5NTE4NDY0MjdlMWNjZTJhMzdlZDUzNmJhZWYyZmRkOTY0Zjc3NDE1OWMwOWEzOTdmNTI3YmE Slack Invite Link]  <br />
 
'''Anonymous feedback form:''' [https://goo.gl/forms/xTfnM328m8ulT4FF2 here]<br />
 
'''Anonymous feedback form:''' [https://goo.gl/forms/xTfnM328m8ulT4FF2 here]<br />
 
'''[[ Data_analysis_(Software_Engineering)_2018 | Previous Course Page ]]''' <br />
 
'''[[ Data_analysis_(Software_Engineering)_2018 | Previous Course Page ]]''' <br />
Line 22: Line 22:
 
# Final exam
 
# Final exam
  
== Course Schedule (3rd module)==
+
== Final Exam ==
 +
The final exam will be held on the '''25th of June'''
 +
* 10:30 - 15:00 in the room 509
 +
* 15:10 - 18:00 in the room 317
 +
Please put your name in the time slot most convenient for you [https://doodle.com/poll/42g4def3m69r88fh here].
 +
 
 +
The question list is available [https://github.com/shestakoff/hse_se_ml/blob/master/2019/exam/exam.pdf here].
 +
 
 +
== Kaggle ==
 +
The link to the competition is in Slack.
 +
 
 +
You should send reports before June 14, 23:59 (the competition ends on the 13th of June). Reports should be sent [https://www.dropbox.com/request/YTUb8bJyVhMoTl6jVnGT here]. Try to follow the format of the report template: https://github.com/shestakoff/hse_se_ml/blob/master/2019/kaggle/kaggle-report-template.ipynb
 +
 
 +
== Colloquium ==
 +
The colloquium will be held on the '''1st and 2nd of April''' during the seminars and the lecture.
 +
 
 +
You may not use any materials during the colloquium, except a single A4 sheet prepared before the exam and handwritten by you personally (both sides may be used). You will get 2 questions from the [https://cloud.mail.ru/public/Mmar/uHxTPtWnQ '''question list'''] with 20 minutes for preparation, and you may receive additional questions or tasks.
 +
 
 +
Time is very limited, so please come to your own seminar or to an earlier one.
 +
 
 +
 
 +
== Course Schedule (4th module)==
 
===Seminars===
 
===Seminars===
'''Dates: Mondays (21.01, 28.01, 04.02, 11.02, 18.02, 25.02, 04.03, 11.03)'''
+
'''Dates: Mondays (01.04, 08.04, 15.04, 22.04, 13.05, 20.05, 27.05, 03.06, 10.06)'''
* Group BPI-161, 10:30-11:50, Room 507
+
* Group BPI-161, 9:00-10:30, Room 501
* Group BPI-162, 12:10-13:30, Room 311
+
* Group BPI-162, 10:30-11:50, Room 311
* Group BPI-163, 13:40-15:00, Room 435
+
* Group BPI-163, 12:10-13:30, Room 311
  
 
===Lectures===
 
===Lectures===
'''Dates: Tuesdays (15.01, 22.01, 29.01, 05.02, 12.02, 19.02, 12.03, 19.03)'''
+
'''Dates: Tuesdays (02.04, 09.04, 16.04, 23.04, 14.05, 21.05, 28.05, 04.06)'''
 
* 9:00-10:20, Room 317
 
* 9:00-10:20, Room 317
 +
04.06 - Room 402
  
[https://docs.google.com/spreadsheets/d/1pLN757-mq19G58qTs6wkdxNL9ZA_rik-eNi5M7AEFh4/edit#gid=1685685766 Complete Schedule of Software Engineering]
+
[https://docs.google.com/spreadsheets/d/1pLN757-mq19G58qTs6wkdxNL9ZA_rik-eNi5M7AEFh4/edit#gid=2055150791 Complete Schedule of Software Engineering]
  
 
== Lecture materials ==
 
== Lecture materials ==
Line 51: Line 73:
 
'''Lecture 5. Regularization, Linear Classification ''' <br/>
 
'''Lecture 5. Regularization, Linear Classification ''' <br/>
 
[https://shestakoff.github.io/hse_se_ml/2019/l5-linclass/lecture-linclass.slides#/ Slides] <br/>
 
[https://shestakoff.github.io/hse_se_ml/2019/l5-linclass/lecture-linclass.slides#/ Slides] <br/>
 +
 +
'''Lecture 6. Supervised Quality Measures ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l6-quality/lecture-metrics.slides#/ Slides] <br/>
 +
 +
'''Lecture 7. Support Vector Machines. Kernel Trick ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l7-svm/lecture-svm.slides#/ Slides] <br/>
 +
 +
'''Lecture 8. Feature Selection. Dimension Reduction. PCA ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l8-pca/lecture-pca.slides#/ Slides] <br/>
 +
 +
'''Lecture 9. Ensembles ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l9-ensembles/lecture-ensemble.slides#/ Slides] <br/>
 +
 +
'''Lecture 10. Boosting ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l10-boosting/lecture-boosting.slides#/ Slides] <br/>
 +
 +
'''Lecture 11. Neural Networks 1 ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l11-nn1/lecture-nn1.slides#/ Slides] <br/>
 +
 +
'''Lecture 12. Neural Networks 2 ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l12-nn2/lecture-nn2.slides#/ Slides] <br/>
 +
 +
'''Lecture 13. Introduction to NLP ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l13-nlp-intro/lecture-nlp-intro.slides#/ Slides] <br/>
 +
 +
'''Lecture 14. Clustering ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l14-clust/lecture-clust.slides#/ Slides] <br/>
 +
 +
'''Lecture 15. Recsys ''' <br/>
 +
[https://shestakoff.github.io/hse_se_ml/2019/l15-recsys-intro/lecture-recsys.slides#/ Slides] <br/>
  
 
== Seminars ==
 
== Seminars ==
Line 61: Line 113:
 
'''Seminar 2. Metric-based methods '''<br/>
 
'''Seminar 2. Metric-based methods '''<br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s2-metric/seminar2-knn.ipynb Practice in class] <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s2-metric/seminar2-knn.ipynb Practice in class] <br/>
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s2-metric/knn_theory.pdf Theoretical task 1] '''Extended Answer Due Date: 06.02.2019 23:59''' <br/>
+
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s2-metric/knn_theory.pdf Theoretical task 1] <br/>
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s2-metric/seminar2-homework.ipynb Practical task 2], [https://www.dropbox.com/request/hauCnxFgVCUtjnV4VXdY upload link], '''Due Date: 05.02.2019 23:59''' <br/>
+
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s2-metric/seminar2-homework.ipynb Practical task 2] <br/>
  
 
'''Seminar 3. Decision Trees '''<br/>
 
'''Seminar 3. Decision Trees '''<br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s3-trees/seminar3-trees.ipynb Practice in class], [https://github.com/shestakoff/hse_se_ml/tree/master/2019/s3-trees titanic.csv] <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s3-trees/seminar3-trees.ipynb Practice in class], [https://github.com/shestakoff/hse_se_ml/tree/master/2019/s3-trees titanic.csv] <br/>
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s3-trees/trees_theory.pdf Theoretical task 2], [https://www.dropbox.com/request/aMh1jZb1Xm3ovefyMY7h upload link] '''Due Date: 12.02.2019 23:59''' <br/>
+
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s3-trees/trees_theory.pdf Theoretical task 2] <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s3-trees/seminar3-homework.ipynb Practical task 3], [https://github.com/shestakoff/hse_se_ml/tree/master/2019/s3-trees data.csv], [https://www.dropbox.com/request/KYxV6H91zVqWfb73SqW4 upload link] '''Due Date: 19.02.2019 23:59''' <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s3-trees/seminar3-homework.ipynb Practical task 3], [https://github.com/shestakoff/hse_se_ml/tree/master/2019/s3-trees data.csv], [https://www.dropbox.com/request/KYxV6H91zVqWfb73SqW4 upload link] '''Due Date: 19.02.2019 23:59''' <br/>
  
Line 72: Line 124:
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s4-linreg/seminar4-linreg.ipynb Practice in class], [https://github.com/shestakoff/hse_se_ml/blob/master/2019/s4-linreg/dataset.csv dataset.csv] <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s4-linreg/seminar4-linreg.ipynb Practice in class], [https://github.com/shestakoff/hse_se_ml/blob/master/2019/s4-linreg/dataset.csv dataset.csv] <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s4-linreg/linreg_theory.pdf Theoretical task 3], [https://www.dropbox.com/request/zo7nHHiPt3qQSmxbwPZ4 upload link] '''Due Date: 24.02.2019 23:59''' <br/>
 
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s4-linreg/linreg_theory.pdf Theoretical task 3], [https://www.dropbox.com/request/zo7nHHiPt3qQSmxbwPZ4 upload link] '''Due Date: 24.02.2019 23:59''' <br/>
<br/>
 
  
 +
'''Seminar 5. Linear Classification '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s5-logreg/seminar5-logreg.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s5-logreg/seminar5-homework.ipynb Practical task 4], [https://github.com/shestakoff/hse_se_ml/blob/master/2019/s5-logreg/audit_data/audit_risk.csv audit_risk.csv], [https://www.dropbox.com/request/ZKvZFQm4RUnCLr4vxV2k upload link] '''Due Date: 10.03.2019 23:59''' <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s5-logreg/linclass_theory.pdf Theoretical task 4], [https://www.dropbox.com/request/BPCi4lU8DLGGmtHClXhp upload link] '''Due Date: 04.03.2019 23:59''' <br/>
 +
 +
'''Seminar 6. Supervised quality measures '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s6-quality/seminar6-quality.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s6-quality/metrics_svm.pdf Theoretical task 5], [https://www.dropbox.com/request/H9N6AjI13sowOWubZTdp upload link] '''Due Date: 25.03.2019 23:59''' <br/>
 +
 +
'''Seminar 8. Feature Selection. Dimension Reduction. PCA '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s8-pca/seminar8-pca.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s8-pca/seminar8-homework.ipynb Practical task 5], [https://github.com/shestakoff/hse_se_ml/blob/master/2019/s8-pca/data/voice.csv voice.csv] [https://www.dropbox.com/request/cobhQeuESjkVtzYbg4Yl upload link] '''Due Date: 21.04.2019 23:59''' <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s8-pca/pca_theory.pdf Theoretical task 6], [https://www.dropbox.com/request/ZaSfptNsUuWyB27fvsXG upload link] '''Due Date: 15.04.2019 23:59''' <br/>
 +
 +
'''Seminar 9. Ensembles  '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s9-ensembles/seminar9-ensembles.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s9-ensembles/ensembles_theory.pdf Theoretical task 7] [https://www.dropbox.com/request/WxpJZAZGx9eIo5HVMbBR upload link] '''Due Date: 25.04.2019 23:59''' <br/>
 +
 +
'''Seminar 10. Boosting  '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s10-boosting/seminar.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s10-boosting/boosting-theory.pdf Theoretical task 8] [https://www.dropbox.com/request/yDTwmOcDu3jD14CAtlx7 upload link] '''Due Date: 30.04.2019 23:59''' <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s10-boosting/seminar10-homework.ipynb Practical task 6], [https://www.dropbox.com/request/o1JBiKFMGj2Bv6vOqtJ8 upload link] '''Extended Due Date: 17.05.2019 23:59''' <br/>
 +
 +
'''Seminar 11. Neural Networks 1 '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s11-nn1/seminar11-nn1.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s11-nn1/TT9.pdf Theoretical task 9] [https://www.dropbox.com/request/jAwIMNTOaCKdN07q5qTm upload link] '''Due Date: 21.05.2019 23:59''' <br/>
 +
 +
'''Seminar 12. Neural Networks 2 '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s12-nn2/seminar12-nn2.ipynb Practice in class] <br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s12-nn2/seminar12-homework.ipynb Practical task 7] [https://www.dropbox.com/request/iehLVStpn30RchYr5nMC upload link] '''Due Date: 07.06.2019 23:59''' <br/>
 +
 +
'''Seminar 13. Intro to Kaggle and NLP '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s13-nlp-intro/seminar13-nlp-intro.ipynb Practice in class]<br/>
 +
 +
'''Seminar 14. Clustering '''<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s14-clustering/seminar14-clustering.ipynb Practice in class]<br/>
 +
[https://github.com/shestakoff/hse_se_ml/blob/master/2019/s14-clustering/clustering-theory.pdf Theoretical task 10] [https://www.dropbox.com/request/E367aSvyvHvM8vvZAzFt upload link] '''Due Date: 11.06.2019 23:59'''
 +
<br/>
  
 
'''To make checking easier for our course assistants, please put your subgroup number at the beginning of solution filenames''' <br/>
 
'''To make checking easier for our course assistants, please put your subgroup number at the beginning of solution filenames''' <br/>
Line 116: Line 205:
  
 
== Useful links ==
 
== Useful links ==
=== Machine learning ===
+
=== Machine learning, Stats, Maths ===
 
* [https://github.com/esokolov/ml-course-hse Machine learning course from Evgeny Sokolov on Github]
 
* [https://github.com/esokolov/ml-course-hse Machine learning course from Evgeny Sokolov on Github]
 
* [http://www.machinelearning.ru/wiki/index.php?title=Заглавная_страница machinelearning.ru]
 
* [http://www.machinelearning.ru/wiki/index.php?title=Заглавная_страница machinelearning.ru]
Line 123: Line 212:
 
* [https://anvaka.github.io/greview/hands-on-ml/1/ Some books for ML2]
 
* [https://anvaka.github.io/greview/hands-on-ml/1/ Some books for ML2]
 
* [https://mml-book.github.io/ Math for ML]
 
* [https://mml-book.github.io/ Math for ML]
* One of classic ML books. [http://web.stanford.edu/~hastie/local.ftp/Springer/ESLII_print10.pdf Elements of Statistical Learning (Trevor Hastie, Robert Tibshirani, Jerome Friedman)]
+
* One of classic ML books. [https://web.stanford.edu/~hastie/Papers/ESLII.pdf Elements of Statistical Learning (Trevor Hastie, Robert Tibshirani, Jerome Friedman)]
 +
* [http://immersivemath.com/ila/learnmore.html Linear Algebra Immersive book]
  
 
=== Python ===
 
=== Python ===

Current revision as of 18:16, 8 June 2019

Scores
Slack Invite Link
Anonymous feedback form: here
Previous Course Page
Course repo



Course description

In this class we consider the main problems of data mining and machine learning: classification, clustering, regression, dimensionality reduction, ranking, collaborative filtering. We will also study the mathematical methods and concepts on which data analysis is based, as well as the formal assumptions behind them and various aspects of their implementation.

Significant attention is given to practical data analysis skills, which will be developed in seminars by studying the Python programming language and relevant libraries for scientific computing.

The knowledge of linear algebra, real analysis and probability theory is required.

The class consists of:

  1. Lectures and seminars
  2. Practical and theoretical homework assignments
  3. A machine learning competition (more information will be available later)
  4. Midterm theoretical colloquium
  5. Final exam

Final Exam

The final exam will be held on the 25th of June

  • 10:30 - 15:00 in the room 509
  • 15:10 - 18:00 in the room 317

Please put your name in the time slot most convenient for you here.

The question list is available here.

Kaggle

The link to the competition is in Slack.

You should send reports before June 14, 23:59 (the competition ends on the 13th of June). Reports should be sent here. Try to follow the format of the report template: https://github.com/shestakoff/hse_se_ml/blob/master/2019/kaggle/kaggle-report-template.ipynb

Colloquium

The colloquium will be held on the 1st and 2nd of April during the seminars and the lecture.

You may not use any materials during the colloquium, except a single A4 sheet prepared before the exam and handwritten by you personally (both sides may be used). You will get 2 questions from the question list with 20 minutes for preparation, and you may receive additional questions or tasks.

Time is very limited, so please come to your own seminar or to an earlier one.


Course Schedule (4th module)

Seminars

Dates: Mondays (01.04, 08.04, 15.04, 22.04, 13.05, 20.05, 27.05, 03.06, 10.06)

  • Group BPI-161, 9:00-10:30, Room 501
  • Group BPI-162, 10:30-11:50, Room 311
  • Group BPI-163, 12:10-13:30, Room 311

Lectures

Dates: Tuesdays (02.04, 09.04, 16.04, 23.04, 14.05, 21.05, 28.05, 04.06)

  • 9:00-10:20, Room 317

04.06 - Room 402

Complete Schedule of Software Engineering

Lecture materials

Lecture 1. Introduction to data science and machine learning
Slides

Lecture 2. Cross-validation. Metric-based models. KNN
Slides

Lecture 3. Decision Trees
Slides

Lecture 4. Linear Regression, Gradient-based optimization
Slides

Lecture 5. Regularization, Linear Classification
Slides

Lecture 6. Supervised Quality Measures
Slides

Lecture 7. Support Vector Machines. Kernel Trick
Slides

Lecture 8. Feature Selection. Dimension Reduction. PCA
Slides

Lecture 9. Ensembles
Slides

Lecture 10. Boosting
Slides

Lecture 11. Neural Networks 1
Slides

Lecture 12. Neural Networks 2
Slides

Lecture 13. Introduction to NLP
Slides

Lecture 14. Clustering
Slides

Lecture 15. Recsys
Slides

Seminars

Seminar 1. Introduction to Data Analysis in Python
Practice in class
Practical task 1, upload link, Due Date: 29.01.2019 23:59
Additional materials: 1, 2

Seminar 2. Metric-based methods
Practice in class
Theoretical task 1
Practical task 2

Seminar 3. Decision Trees
Practice in class, titanic.csv
Theoretical task 2
Practical task 3, data.csv, upload link Due Date: 19.02.2019 23:59

Seminar 4. Linear Regression
Practice in class, dataset.csv
Theoretical task 3, upload link Due Date: 24.02.2019 23:59

Seminar 5. Linear Classification
Practice in class
Practical task 4, audit_risk.csv, upload link Due Date: 10.03.2019 23:59
Theoretical task 4, upload link Due Date: 04.03.2019 23:59

Seminar 6. Supervised quality measures
Practice in class
Theoretical task 5, upload link Due Date: 25.03.2019 23:59

Seminar 8. Feature Selection. Dimension Reduction. PCA
Practice in class
Practical task 5, voice.csv upload link Due Date: 21.04.2019 23:59
Theoretical task 6, upload link Due Date: 15.04.2019 23:59

Seminar 9. Ensembles
Practice in class
Theoretical task 7 upload link Due Date: 25.04.2019 23:59

Seminar 10. Boosting
Practice in class
Theoretical task 8 upload link Due Date: 30.04.2019 23:59
Practical task 6, upload link Extended Due Date: 17.05.2019 23:59

Seminar 11. Neural Networks 1
Practice in class
Theoretical task 9 upload link Due Date: 21.05.2019 23:59

Seminar 12. Neural Networks 2
Practice in class
Practical task 7 upload link Due Date: 07.06.2019 23:59

Seminar 13. Intro to Kaggle and NLP
Practice in class

Seminar 14. Clustering
Practice in class
Theoretical task 10 upload link Due Date: 11.06.2019 23:59

To make checking easier for our course assistants, please put your subgroup number at the beginning of solution filenames
Example: 165-1-shestakov-andrey.ipynb

Evaluation criteria

The course runs through the 3rd and 4th modules. Students' knowledge is assessed through home assignments and exams. Home assignments are divided into theoretical tasks and practical tasks. There are two exams during the course, one after the 3rd module and one after the 4th module. Each exam evaluates theoretical knowledge and understanding of the material studied during the respective module.

Grades take the values 4, 5, …, 10. Grades 1, 2 and 3 are considered unsatisfactory. Exact grades are calculated using the following rule:

  • score ≥ 35% => 4,
  • score ≥ 45% => 5,
  • ...
  • score ≥ 95% => 10,

where score is calculated using the following rule:

score = 0.7 * S_cumulative + 0.3 * S_exam2
S_cumulative = 0.8 * S_homework + 0.2 * S_exam1 + 0.2 * S_competition

  • S_homework – proportion of correctly solved homework,
  • S_exam1 – proportion of successfully answered theoretical questions during exam after module 3,
  • S_exam2 – proportion of successfully answered theoretical questions during exam after module 4,
  • S_competition – score for the competition in machine learning (it's also from 0 to 1).

Participation in the machine learning competition is optional and can give students extra points.
An "automatic" pass of the course based on the cumulative score may be granted.
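
The grading rule above can be sketched in Python. This is only an illustration, not an official calculator: the function names are hypothetical, the intermediate thresholds (55%, 65%, 75%, 85%) are assumed to continue the 10-point steps implied by the "..." in the list, scores below 35% are returned as None because the page does not say how unsatisfactory grades 1 to 3 are assigned, and the grade is capped at 10 because the competition bonus can push the score above 100%.

```python
def cumulative_score(homework, exam1, competition=0.0):
    """Cumulative score from the page's formula; all inputs are proportions in [0, 1]."""
    return 0.8 * homework + 0.2 * exam1 + 0.2 * competition


def final_score(homework, exam1, exam2, competition=0.0):
    """Final score: 0.7 * cumulative score + 0.3 * second exam."""
    return 0.7 * cumulative_score(homework, exam1, competition) + 0.3 * exam2


def grade(score):
    """Map a score (as a proportion) to a grade 4..10, or None if below 35%.

    Assumption: thresholds rise in 10-point steps (35%, 45%, ..., 95%).
    """
    percent = round(score * 100, 6)  # guard against floating-point noise
    if percent < 35:
        return None  # unsatisfactory; the exact grade (1-3) is not specified
    return min(10, int((percent + 5) // 10))


if __name__ == "__main__":
    s = final_score(homework=0.9, exam1=0.8, exam2=0.7)
    print(f"score = {s:.3f}, grade = {grade(s)}")
```

Note that the weights of the cumulative score add up to 1.2, which matches the statement that the competition is optional and gives extra points: a student who skips it can still reach a cumulative score of 1.0.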

Plagiarism

If plagiarism is discovered, zero points will be given for the home assignments involved, that is, for both works found to be identical. In case of repeated plagiarism by the same person, a report will be made to the dean.

Deadlines

Assignments sent after the late deadline will not be scored (they receive a zero score) unless there is a legitimate reason for the late submission; a high workload in other classes is not a legitimate reason.

Structure of emails and homework submissions

Practical assignments must be submitted in Jupyter notebook format, theoretical ones in PDF. Practical assignments must use Python 3 (or be Python 3 compatible). Use your surname as the filename for assignments (e.g. Ivanov.ipynb). Do not archive your assignments.

Assignments can be performed in either Russian or English.

Assignments can be submitted only once!

Useful links

Machine learning, Stats, Maths

Python

Python installation and configuration

anaconda