Data Analysis in Python 2020-2021

From the Wiki of the Faculty of Computer Science

Revision as of 23:20, 31 January 2021

About the course

The course is taught to students of the Bachelor’s Programme 'HSE and University of London Parallel Degree Programme in International Relations'.

Abstract: In this course students are introduced to the rapidly growing field of data analytics, with a specific focus on the Python programming language. Students will learn the concepts, techniques and tools they need to make meaningful inferences from data. They will work with real-world data sets to gain practical skills in data manipulation. Each week will involve seminars and coding simulations. In the final project students will build working code that can be readily applied to exploratory data analysis in their own (future) research domain.

Syllabus: open

Required Software

  • Anaconda (Python version >= 3.8)
  • Jupyter Notebook

How to install Anaconda on Mac OS
How to install Anaconda on Windows

Materials

Presentations and all materials will be available immediately after each practical class. Additional materials will be used in quizzes at the next seminar.

Github with the materials from our practical classes: https://github.com/anamarina/Data_Analysis_in_Python

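Week | Topic | Slides | Tutorial | Additional Materials | Assignment | Deadline
1 | Introduction | Intro Slides | How to install Anaconda on Mac OS, How to install Anaconda on Windows | 1. Official Python tutorial & documentation (https://docs.python.org/3.8/tutorial/); 2. Coursera. Python for Everybody Specialization (https://www.coursera.org/specializations/python); 3. Coursera. Crash Course on Python (https://www.coursera.org/learn/python-crash-course?specialization=google-it-automation); 4. Snakify. A lot of online exercises in Python (https://snakify.org/en/lessons/print_input_numbers/) | No assignment this time. Yay! |
2 | Input, output. Numbers, strings. Arithmetical operations | - | Tutorial (https://github.com/anamarina/Data_Analysis_in_Python/blob/main/week2/week2.ipynb) | 1. PEP8 Style Guide (https://www.python.org/dev/peps/pep-0008/); 2. Python Numbers Exercises (https://www.w3schools.com/python/python_numbers.asp); 3. Input and Output in Python (https://realpython.com/python-input-output/) | HA1 (https://github.com/anamarina/Data_Analysis_in_Python/blob/main/week2/HA1.ipynb) | 23:59, February 7, 2021
3 | | | | | |
4 | | | | | |
5 | | | | | |
6 | | | | | |
7 | | | | | |
8 | | | | | |
9 | | | | | |
10 | | | | | |
11 | | | | | |
12 | | | | | |
13 | | | | | |
14 | | | | | |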

Assignments

The course includes 8 home assignments (10 pts each), all completed individually. Short home assignments will be published almost every week starting from Week 2 (weeks 2, 3, 4, 8, 9, 10, 13, 14), based on the materials of the previous practical classes.

There will be 2 in-class assignments (10 pts each) in the format of problem-solving tasks and coding in Python on an online platform (e.g. Yandex Contest or GitHub Classroom). Problem set 1 covers the basics of working in Python with data types and data structures; problem set 2 involves exploratory data analysis and visualization tasks.

Each task is checked for plagiarism. A match of more than 25% of the code is considered plagiarism and results in 1 point out of 10, with the right to appeal. If more than 40% of the code matches, the work is canceled (0 points) without the right to appeal. During the week after each assignment deadline, each student will be offered a convenient time for a Zoom call with the lecturer and a TA to answer questions about the code and explain the solutions.
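The two plagiarism thresholds can be read as a simple decision rule. The sketch below is only an illustration: the function name and the returned labels are hypothetical, and the page does not say how the match percentage is measured, so it is taken here as a given number.

```python
def apply_plagiarism_rule(raw_score, match_percent):
    """Return (score, note) for one assignment.

    match_percent is assumed to be supplied by whatever checker the
    teaching team uses; this function only applies the stated thresholds.
    """
    if match_percent > 40:
        # More than 40% matching code: work canceled, no appeal.
        return 0, "canceled, no appeal"
    if match_percent > 25:
        # More than 25% matching code: 1 point out of 10, appeal allowed.
        return 1, "plagiarism, appeal allowed"
    # At or below 25%: the original score stands.
    return raw_score, "no penalty"


print(apply_plagiarism_rule(9, 30))
```

Both thresholds are strict ("more than"), so a match of exactly 25% carries no penalty in this reading.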

Assignment title standard: please name your solution files in this format: Assignment_<Number>_<Group number>_<Name>_<Surname>. Example: Assignment_1_BMOL181_Morty_Smith
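As a small illustration, the naming pattern can be built in Python. This is a minimal sketch: the helper name is hypothetical, and no file extension is added because the page does not specify one.

```python
def assignment_filename(number, group, name, surname):
    """Join the parts of the assignment title with underscores."""
    parts = ["Assignment", str(number), group, name, surname]
    return "_".join(parts)


# Reproduces the example given on this page.
print(assignment_filename(1, "BMOL181", "Morty", "Smith"))
```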

Github with tutorials and assignments: https://github.com/anamarina/Data_Analysis_in_Python

Links for submitting your assignments: coming soon!

Communication

All course materials, assignments, and deadlines will be published on this page.

Important announcements from the teaching team will be posted in the Telegram channel: https://t.me/joinchat/UctGNtxs7zd4StM0

Telegram group with 24/7 online support for Q&A, discussions, technical issues, and moral support: https://t.me/joinchat/F_uIPvGE_zA8fftG

Tutor: Marina Ananyeva (Email, Telegram)

Group Schedule
195 Tuesday 9.30-10.50
191 Tuesday 11.10-12.30
196 Tuesday 13.00-14.20
194 Thursday 9.30-10.50
192 Thursday 11.10-12.30
193 Thursday 13.00-14.20

Feedback

We would greatly appreciate your help in making this course better. Feel free to share your ideas and feedback!

Anonymous feedback form: click here

Grading

Final Grade = 0.4*home assignments + 0.3*group project + 0.2*in-class assignments + 0.1*in-class participation

Table with grades
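The final grade formula can be sketched in Python as follows. The weights come from the formula on this page; everything else is an assumption made for illustration: the function and variable names are hypothetical, home and in-class assignments are averaged onto a 10-point scale, and participation is capped at 10 as described in the participation rule.

```python
# Weights taken from the Final Grade formula.
WEIGHTS = {
    "home": 0.4,
    "project": 0.3,
    "in_class": 0.2,
    "participation": 0.1,
}


def mean(scores):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(scores) / len(scores)


def final_grade(home_scores, project_score, in_class_scores, participation_points):
    # Assumption: multi-part components are averaged on the 10-point scale.
    home = mean(home_scores)
    in_class = mean(in_class_scores)
    # Participation above 10 points is capped at 10.
    participation = min(participation_points, 10)
    return (
        WEIGHTS["home"] * home
        + WEIGHTS["project"] * project_score
        + WEIGHTS["in_class"] * in_class
        + WEIGHTS["participation"] * participation
    )


# Hypothetical scores for one student.
example = final_grade([8, 9, 10, 7, 8, 9, 10, 9], 9, [7, 9], 12)
print(round(example, 2))
```

Whether the components are averaged or combined in some other way is not stated on the page, so treat the averaging step as a guess.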

In-class participation (10 pts)

Activity during class is graded at one point per seminar. It means answering questions and solving tasks during the seminar. If a student earns more than 10 points in total, the total is capped at 10.

Group project (10 pts)

Maximum group size: 4 students. Group project evaluation criteria:

  • the purpose of the study is clearly stated (1 point);
  • all steps of the research process are described in a clear and concise way (2 points);
  • research outcomes are clearly defined (2 points);
  • includes intuitive visualizations of research outcomes (2 points);
  • all members of the project team are able to explain the code used for computations (1 point);
  • code is properly structured (1 point);
  • meets submission timeline (1 point).

Home assignments (10 pts each), weeks 2, 3, 4, 8, 9, 10, 13, 14

A home assignment will be given 8 times during the course. These assignments are problem sets to be solved in Python. Sample problem:

  • Open the file data.csv using pandas and find out whether it contains missing values. If it does, remove them. Create a new column with Boolean values (True or False) based on the column Age: if age < 18, return False; otherwise return True.
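One possible approach to the sample home problem is sketched below. It rests on several assumptions: data.csv is not available here, so a small made-up table is read from memory instead; the Name column and the new column name Adult are hypothetical; and "remove them" is read as dropping the rows that contain missing values.

```python
import io

import pandas as pd

# Made-up stand-in for data.csv; only the Age column is named in the task.
CSV_TEXT = """Name,Age
Ann,17
Bob,
Cid,18
Dee,42
"""


def clean_and_flag(source):
    """Read a CSV, drop rows with missing values, add a Boolean age flag."""
    df = pd.read_csv(source)
    # Check whether any cell is missing before dropping rows.
    if df.isna().any().any():
        df = df.dropna()
    # False when Age < 18, True otherwise.
    df["Adult"] = df["Age"] >= 18
    return df


result = clean_and_flag(io.StringIO(CSV_TEXT))
print(result)
```

With a real file, the call would be `clean_and_flag("data.csv")`, since `pd.read_csv` accepts a path as well as a file-like object.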

In-class assignments (10 pts each), weeks 6 and 11

An in-class assignment will be given twice during the course. In-class assignments are problem sets to be solved in Python, each on a particular topic. Problem set 1 covers the basics of working in Python with data types and data structures; problem set 2 involves exploratory data analysis and visualization tasks. Sample problems:

  • Generate a list of even numbers in the range from 0 to 100. Iterate over these numbers in a for-loop and print each of them.
  • Consider the daily oil prices and the USDRUB daily exchange rate. Compute the sample average and standard deviation of daily returns over the entire sample period. Test whether the mean values are significantly different from zero. Test whether the mean values differ significantly from each other. State your null and alternative hypotheses explicitly in each case. Plot histograms of the null distributions.
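The even-numbers sample problem could be approached as in the short sketch below. It assumes that "from 0 to 100" includes both endpoints; the variable names are arbitrary.

```python
# Assumption: the range 0 to 100 is inclusive on both ends.
evens = list(range(0, 101, 2))

# Iterate over the list and print each number on its own line.
for number in evens:
    print(number)
```

Using a step of 2 in `range` avoids testing each number with `% 2`; a list comprehension with a condition would work just as well.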

Cheating and honor

You must abide by the Honor Code.

Please don’t cheat: rumor has it that HSE has quite severe penalties.

To avoid being accused of plagiarism in “grey cases”, please disclose with whom and how you have collaborated on each assignment, except for the final group project. If you warn us, the worst that can happen after a good-faith mistake is that you will be asked to complete another version of the task, without disciplinary action and without notifying the HSE administration.