Administrative Information
Title | Exploratory Data Analysis |
Duration | 60 min |
Module | A |
Lesson Type | Lecture |
Focus | Technical - Foundations of AI |
Topic | Exploratory Data Analysis |
Keywords
Data exploration, T-SNE, PCA
Learning Goals
- Learner knows the basic chart types and knows when to use them.
- Learner can use visualisations to investigate a variable's distribution.
- Learner can check for dependencies between variables by using visualisation.
- Learner is able to visualise a high-dimensional dataset using PCA and T-SNE.
Expected Preparation
Learning Events to be Completed Before
None.
Obligatory for Students
- Read the blog Using T-SNE in Python to Visualize High-Dimensional Data Sets (alternative)
- Read chapter 4 of Python Data Science Handbook
Optional for Students
None.
References and background for students
None.
Recommended for Teachers
- On visualising high-dimensional datasets: Using T-SNE in Python
- Chapter 4 of Python Data Science Handbook
Lesson materials
Instructions for Teachers
This lecture focuses on data visualisation as part of the Exploratory Data Analysis (EDA) process. It therefore does not cover data visualisation for other purposes, such as storytelling and presentations.
Topics to cover
- Introduction to data visualisation (5 min)
- Goals of data visualisations (EDA, storytelling)
- Which chart to use for which problem (15 min)
- Chart types and their usage
- How to pick the right chart
- Dos and don'ts
- Check for (in)dependent variables (10 min)
- Visualising high-dimensional data (20 min)
- PCA
- T-SNE
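For the final topic, a short demonstration may help show how PCA and T-SNE differ in practice. The following is a minimal sketch, not part of the original lesson materials: it assumes scikit-learn is installed, and the choice of the bundled digits dataset, the 500-sample subset, and the parameter values (`perplexity=30`, `random_state=0`) are illustrative assumptions only.

```python
# Sketch: project a high-dimensional dataset to 2D with PCA and t-SNE.
# Assumptions: scikit-learn is available; dataset and parameters are illustrative.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# 64-dimensional images of handwritten digits; keep a subset so t-SNE runs quickly.
digits = load_digits()
X, y = digits.data[:500], digits.target[:500]

# Standardise the features so that no single pixel dominates the projection.
X_scaled = StandardScaler().fit_transform(X)

# PCA: linear projection onto the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# t-SNE: non-linear embedding that tries to preserve local neighbourhoods.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
X_tsne = tsne.fit_transform(X_scaled)

print(f"PCA keeps {explained:.1%} of the variance in 2 components")
print("PCA shape:", X_pca.shape, "t-SNE shape:", X_tsne.shape)
```

Both `X_pca` and `X_tsne` are arrays with one row per sample and two columns, so each can be drawn as a scatter plot coloured by the label `y` (for example with matplotlib's `plt.scatter`). Comparing the two plots side by side is one way to discuss why a linear method and a neighbourhood-preserving method can give very different pictures of the same data.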
Acknowledgements
The Human-Centered AI Masters programme was co-financed by the Connecting Europe Facility of the European Union under Grant №CEF-TC-2020-1 Digital Skills 2020-EU-IA-0068.