Statistical Learning in Practice, Lent 2018

Statistical learning is the process of using data to guide the construction and selection of models, which are then used to predict future outcomes. In this course, which consists of 12 lectures and 12 practical classes, we will examine some of the most successful and widely used statistical methodologies in modern applications. The practical classes will deal with an introduction to R, exploratory data analysis and the implementation of the statistical methods discussed in the lectures. We aim to cover a selection of the following topics:

Pre-requisites: elementary probability theory, maximum likelihood estimation, hypothesis tests and confidence intervals, linear models. Previous experience with R is helpful but not essential.

Resources

The current version of the lecture notes can be found here. It will be updated after each lecture. Practical texts and solutions can be found in the timetable below. Datasets used in the practicals can be found here.

I will be holding regular office hours on Mondays 11:00-13:00 in F2.07. There are four example sheets for this course.

We will use the statistical programming language R in this course. It is recommended that you also use RStudio, an integrated development environment (IDE) for R. Both R and RStudio have been pre-installed on all machines in the CATAM Room (GL.04). You can download them for your own computer from http://cran.r-project.org/ (R programming language) and http://www.rstudio.com/ (RStudio).

We will be moving to MR5 for both lectures and practicals due to high demand for the course. Please bring your own laptop with R and RStudio preinstalled from Practical 2 onwards. If MR5 becomes too crowded, we will use the CATAM room for overflow, where desktop machines are available.

Timetable


Session  Date          Time         Room            Topics and Resources
L01      19 Jan (Fri)  10:00–11:00  MR14            Lecture 1: Generalised linear models
P01      22 Jan (Mon)  10:00–11:00  CATAM Room      Practical 1 [text] [soln]
L02      24 Jan (Wed)  10:00–11:00  MR5             Lecture 2: Model selection
P02      26 Jan (Fri)  10:00–11:00  MR5/CATAM Room  Practical 2 [text] [soln]
L03      31 Jan (Wed)  10:00–11:00  MR5             Lecture 3: Overdispersion
P03      31 Jan (Wed)  15:00–16:00  MR4/CATAM Room  Practical 3 [text] [soln]
L04      2 Feb (Fri)   10:00–11:00  MR5             Lecture 4: Mixed effect models
P04      5 Feb (Mon)   10:00–11:00  MR5/CATAM Room  Practical 4 [text] [soln]
L05      9 Feb (Fri)   10:00–11:00  MR5             Lecture 5: Regularised regression
P05      9 Feb (Fri)   14:00–15:00  MR4/CATAM Room  Practical 5 [text] [soln]
L06      12 Feb (Mon)  10:00–11:00  MR5             Lecture 6: Linear methods for classification
P06      14 Feb (Wed)  10:00–11:00  MR5/CATAM Room  Practical 6 [text] [soln]
L07      16 Feb (Fri)  10:00–11:00  MR5             Lecture 7: Support vector machines
P07      19 Feb (Mon)  10:00–11:00  MR5/CATAM Room  Practical 7 [text] [soln]
L08      21 Feb (Wed)  10:00–11:00  MR5             Lecture 8: Neural networks
L09      23 Feb (Fri)  10:00–11:00  MR5             Lecture 9: Nearest neighbour classifiers
P08      26 Feb (Mon)  10:00–11:00  MR5/CATAM Room  Practical 8 [text] [soln]
L10      28 Feb (Wed)  10:00–11:00  MR5             Lecture 10: Time series: ARIMA models
P09      2 Mar (Fri)   10:00–11:00  MR5/CATAM Room  Practical 9 [text] [soln]
L11      5 Mar (Mon)   10:00–11:00  MR5             Lecture 11: Time series: estimation and forecast
P10      7 Mar (Wed)   10:00–11:00  MR5/CATAM Room  Practical 10 [text] [soln]
L12      9 Mar (Fri)   10:00–11:00  MR5             Lecture 12: Spatial statistics
P11      12 Mar (Mon)  10:00–11:00  MR5/CATAM Room  Practical 11 [text] [soln]
P12      14 Mar (Wed)  10:00–11:00  MR5/CATAM Room  Practical 12 [text] [soln]