Machine learning 1 DSBA 2024/2025

From Wiki - Faculty of Computer Science

More detailed information can be found in the Syllabus for the HSE Machine Learning course.

Course Schedule

DSBA Machine Learning 1, ICEF Machine Learning 2024-2025. This syllabus is shared by 2 programs (with differences pointed out):

  • DSBA (ПАД ФКН): Fall: 9/1-12/30, Calendar
  • ICEF (МИЭФ): Fall: 9/1-4/30, Spring: 1/9-4/30, Calendar

Teachers and Assistants

Role: DSBA, ICEF, DSBA, ICEF
Lecturer: Alexey Boldyrev, Maksim Karpov
Seminar Assistants: Sara Ali, Kirill Bykov, Tigran Ramazyan, Majid Sohrabi
Teaching Assistants: Maria Rozaeva, Nikita Aksenov, Egor Bugaev, Anna Vasileva

Useful links

  1. Moodle LMS (a.k.a. Smart LMS): for posting weekly material, additional videos, and the calendar, and for posting, collecting, and grading quizzes, tests, HW, etc.
  2. Google Drive: for releasing seminar Colab notebooks, lecture presentation slides, and starter Colab files for Kaggle competitions.
  3. Google Colaboratory: for individual manual-graded assignments, group Kaggle assignments, and reproducible seminar notebooks.
    • We require the use of LaTeX and Markdown syntax for all write-ups.
  4. Kaggle.com: for data science competitions in teams of 1-3 students.
    • Ensure that your name and email match exactly those in Moodle LMS, or we may lose you in grade matching. Update your profile with a presentable photo to help us authenticate you and credit your effort.

Optional Systems (still maintained by the Teaching Team)

  1. Telegram Channel: for course announcements (HW, timetable changes, etc.).
  2. Discourse.hsemlp.ru: for all discussions outside office hours that do not belong in the comments on Telegram channel announcements.
    • Invitations were sent out on Sep 2.
    • Ensure that your name and email match exactly those in Moodle LMS, or we may lose you in grade matching. Update your profile with a presentable photo to help us authenticate you.
  3. DataCamp.com: for additional preparatory courses in the first few weeks.
    • Invitations were sent out on Sep 2.

Course Description

This course introduces students to the elements of machine learning, including supervised and unsupervised methods such as linear and logistic regression, splines, decision trees, support vector machines, bootstrapping, random forests, boosting, and regularized methods.

  • In the first two modules (Sep-Dec '24), DSBA and ICEF students use the Python programming language and popular packages to investigate and visualize datasets and to develop machine learning models that solve theoretical and data-driven problems.
  1. The course aims to help students develop an understanding of learning from data, to familiarize them with a wide variety of algorithmic and model-based methods for extracting information from data, and to teach them to apply and evaluate suitable methods on various datasets through model selection and predictive performance evaluation.
  2. DSBA and ICEF students: the course is designed to prepare DSBA/ICEF students for the upcoming University of London (UoL) examination.

Grading System

Grades: 💼 Homework + ❓ Quizzes + 💯 Exams

The grading formula is stated on the official course home page (DSBA Machine Learning 1, ICEF Machine Learning). Moodle’s gradebook shows your up-to-date performance, including your current constituent and aggregate grades. If you suspect an error in grade calculation, please let us know ASAP. We use natural grade aggregation in LMS. Grades on the 0-10 scale are reported to HSE after rounding to the nearest integer. There are no blocking grading components in the DSBA program; in the ICEF program, the Exam in April is blocking.
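
As a rough illustration of how a 0-10 grade could be derived, the sketch below sums hypothetical component scores in the spirit of Moodle's natural aggregation (points earned divided by points available) and rounds the result to the nearest integer. The component names, the point values, and the half-up treatment of ties are assumptions made for illustration only; the official formula is the one on the course home page.

```python
from decimal import Decimal, ROUND_HALF_UP


def natural_total(components):
    """Return points earned divided by points available, summed over components."""
    earned = sum(points for points, _ in components.values())
    available = sum(max_points for _, max_points in components.values())
    return earned / available


def to_hse_scale(fraction):
    """Map a 0..1 fraction to an integer on the 0-10 scale.

    Ties (e.g. 7.5) are rounded up here; this tie rule is an assumption.
    """
    scaled = Decimal(str(fraction)) * 10
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Hypothetical components: (points earned, maximum points)
example = {"homework": (32, 40), "quizzes": (15, 20), "exams": (28, 40)}
fraction = natural_total(example)
grade = to_hse_scale(fraction)
print(fraction, grade)
```

Python's built-in `round` resolves ties to the nearest even integer, which is why the sketch uses `decimal` with an explicit rounding mode instead.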

Homework (HW) Assignment

  • HW is released biweekly with a deadline of about 10-13 days, due on Wednesdays unless otherwise stated.
  • HW is released via Moodle/Smart LMS and is also announced in the Telegram Channel.
  • Individual: These are individualized (not in groups!) assignments of two kinds:
    • Auto-graded assignments (thanks to the STACK plugin with a Maxima CAS backend). Some written responses will be selectively hand-graded by TAs.
    • A quick Maxima syntax reference is provided below and in each assignment.
    • Manual-graded assignments include analysis of datasets, analytical and conceptual problems, and programming assignments.
    • Submit HW via Moodle/Smart LMS as both a shared link to the Google Colab notebook and a PDF derived from it.
      • All text explanations must be written in Markdown cells directly in the Google Colab notebook.
      • Graders leave feedback in the PDF and execute the Google Colab notebook to reproduce your results.
  • Group: Students self-assign into groups to compete on Kaggle.com. Teamwork is evaluated by instructors. The team’s grade is awarded to everyone on the team. If you feel you did the entire assignment yourself, choose different teammates for the next group HW.

Midterm Test and Exam

  1. We will have a cumulative in-class midterm test in the middle of the semester and a cumulative in-class exam at the end (during the HSE examination sessions). Do not book travel that conflicts with test dates.
  2. Both are held in Moodle/Smart LMS.
  3. Tests and exams are individual, i.e. no collaboration. Generative models, web searching, lecture and/or seminar materials, and textbooks are not allowed.
  4. Test questions are drawn from quiz banks, not HW. HW deepens your understanding, but a test measures it.
  5. Navigation between questions is free, i.e. you can move back and forth through the test questions.
  6. Results will be announced within 5 working days after the midterm test/exam.

University of London (UoL), Course ST3189 (ML)

1. The Coursework Project in the Python (or R) programming language is for DSBA/ICEF students only and is administered by LSE/UoL. It is released around January and is due around April 1. Although students are given a 3-4 month window, this exercise is meant to be completed in a few days. Typically, students work on it in Feb/Mar. More details are on the UoL site.

5-10 minute Quizzes during seminars in LMS - biweekly

  1. Only students present in the classroom during the seminar are allowed to take the quiz. At the beginning of the seminar, the seminar assistant will remove absent students from the group list in the Moodle settings for the current quiz.
  2. Quizzes are based on lectures, seminars, textbooks, posted videos, and other material delivered via our course.
  3. Quizzes are individualized (shuffled and sampled from question banks) for each student. Most questions are auto-generated.
  4. All choice questions are [multiple-choice](https://docs.moodle.org/311/en/Multiple_Choice_question_type#Multiple-answer_questions) (regardless of singular/plural formulation). Incorrect answers lower your score to prevent guessing.
  5. Numeric answers are typically accepted within a 0.01 (1%) margin of error, so round to at least 4 decimal places if needed. Please do not round any intermediate calculations. It’s best to use as many decimals as fit in the answer box.
  6. The quiz answers are released after all groups have written the quiz.
  7. We always use [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm) (inverse of exp()) in this course.
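
To illustrate why intermediate rounding is discouraged, the sketch below evaluates a log-odds value with the natural logarithm (`math.log` in Python) and compares an answer rounded only at the end with one computed from a prematurely rounded probability. The tolerance check is only a guess at how a 1% margin might be applied (here, as relative error); the actual grader settings are not stated in this syllabus, and the numbers are made up.

```python
import math


def within_tolerance(answer, reference, rel_tol=0.01):
    """True if answer is within rel_tol relative error of reference (assumed rule)."""
    return abs(answer - reference) <= rel_tol * abs(reference)


# Hypothetical task: log-odds of p = 0.5124, using the natural logarithm.
p = 0.5124
reference = math.log(p / (1 - p))  # math.log is ln, the inverse of math.exp

# Rounding p to two decimals before taking the log shifts the result.
p_rounded = round(p, 2)
early_rounded = math.log(p_rounded / (1 - p_rounded))

# Rounding only the final value to 4 decimal places keeps the error small.
good_answer = round(reference, 4)

print(within_tolerance(good_answer, reference))
print(within_tolerance(early_rounded, reference))
```

Note that `math.log10` and `math.log2` are different functions; only `math.log` with a single argument matches the convention in item 7.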

Deadline Extensions and Makeup

  1. Only valid verifiable excuses are accepted for 1-2 day extensions.
  2. If you miss a deadline (with a valid/verifiable excuse), contact instructors ASAP in a private post to arrange a new deadline.
  3. Submissions are not allowed after the solutions have been released.
  4. Any test/exam retakes will be rescheduled as per university policy (see also the dedicated section below).
  5. Note: accommodating exceptions is difficult and time-consuming. Typically, a verifiable medical emergency is a valid reason, but travel and conferences are not. It is the student's responsibility to start their work early, so as to hedge against any unforeseeable life event.