Introduction to Machine Learning and Data Mining
Revision as of 15:56, 9 March 2019
Lecturers: Dmitry Ignatov
TAs: Ivan Zaputliaev (Module 3 and 4), Alexander Korabelnikov (Module 4).
Homeworks
Homework 1: Spam classification.
Soft deadline (up to 10 points): <s>March 9</s>
Hard deadline (-2 points): <s>March 15</s>
Lecture on 23.01.2019
Intro slides.
Practice: demonstration with Orange.
Lecture on 06.02.2019
Slides: Introduction to classification techniques (1-rule, kNN, Naive Bayes, Logistic Regression).
Practice: demonstration with Orange and scikit-learn.
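As a rough illustration of one of the listed techniques, kNN can be sketched in plain Python. This is not the code shown in class: the function name `knn_predict`, the toy points, and the "ham"/"spam" labels are assumptions made up for this example, and only the standard library is used rather than Orange or scikit-learn.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated groups in 2D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

An odd `k` avoids tied votes in the two-class case; how ties are broken otherwise is a design choice this sketch leaves to `Counter.most_common`.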
Lecture on 22.02.2019
Practice with scikit-learn (kNN, Naive Bayes, Logistic Regression, basic quality metrics, cross-validation, error plots)
Slides: Decision trees. Entropy and information gain. ID3 algorithm. Gini impurity. Tree pruning.
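The impurity measures named in the slides can be written down in a few lines. The sketch below assumes the usual textbook definitions (Shannon entropy in bits, Gini impurity as one minus the sum of squared class proportions, information gain as parent entropy minus the size-weighted entropy of the child groups); the function names are chosen for this example and are not taken from the course materials.

```python
from collections import Counter
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent node minus the weighted entropy of its child groups."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted


parent = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]   # each child is pure
useless_split = [["a", "b"], ["a", "b"]]   # children look like the parent

print(entropy(parent), gini(parent))
print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

ID3 greedily picks, at each node, the attribute whose split has the highest information gain; the two splits above show the extreme cases of that criterion.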
Lecture on 06.03.2019
Slides:
1. Clustering. K-means, k-medoids, fuzzy c-means. The problem of choosing the number of clusters and related heuristics. Hierarchical clustering. Density-based clustering: DBSCAN and mean shift.
2. Spectral clustering for graph partitioning. Min-cut, Laplacian matrix, Fiedler vector. Bipartite spectral clustering.
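For the first clustering topic, a minimal sketch of k-means in the form of Lloyd's algorithm might look as follows. It assumes the initial centroids are supplied by the caller (no random initialization or k-means++), uses Euclidean distance, and runs on made-up toy points; the function name `kmeans` and its parameters are hypothetical, not taken from the slides.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                mean = tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Toy data (hypothetical): two obvious groups and one starting centroid in each
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centroids, final_clusters = kmeans(data, [(0, 0), (10, 10)])
print(final_centroids)
print(final_clusters)
```

The result depends on the starting centroids, which is one reason the choice of the number of clusters and of the initialization is treated as a separate problem.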