Statistical learning is the process of using data to guide the construction and selection of models, which are then used to predict future outcomes. In this course, which consists of 12 lectures and 12 practical classes, we will examine some of the most successful and widely used statistical methodologies in modern applications. The practical classes will deal with an introduction to R, exploratory data analysis and the implementation of the statistical methods discussed in the lectures. We aim to cover a selection of the following topics:

- Generalised linear models
- Model selection and regularisation
- Mixed effect models and quasi-likelihood methods
- Linear discriminant analysis and support vector machines
- Introduction to neural networks
- Time series and spatial statistics

**Pre-requisites:**
elementary probability theory, maximum likelihood estimation,
hypothesis tests and confidence intervals, linear models. Previous
experience with R is helpful but not essential.

The current version of the lecture notes can be found here. It will be updated after each lecture. Practical texts and solutions can be found in the timetable below. Datasets used in the practicals can be found here.

I will be holding regular office hours on Mondays 11:00-13:00 in F2.07. There are four example sheets for this course.

- Example Sheet 1 and solutions. Example class on 2 Feb 14:00-15:00 in MR14.
- Example Sheet 2 and solutions. Example class on 22 Feb 14:00-15:00 in MR4.

We will use the statistical programming language R in this course. It
is recommended that you also use RStudio, an integrated development
environment (IDE) for R. Both R and RStudio have been pre-installed on
all machines in the CATAM Room (GL.04). You can download them for your
own computer from `http://cran.r-project.org/` (R programming language)
and `http://www.rstudio.com/` (RStudio).

We will be moving to MR5 for both lectures and practicals due to high demand for the course. Please bring your own laptop with R and RStudio pre-installed from Practical 2 onwards. If MR5 becomes too crowded, we will use the CATAM Room for overflow, where desktop machines are available.

| | Date | Time | Room | Topics and Resources |
|---|---|---|---|---|
| L01 | 19 Jan (Fri) | 10:00–11:00 | MR14 | Lecture 1: Generalised linear models |
| P01 | 22 Jan (Mon) | 10:00–11:00 | CATAM Room | Practical 1 [text] [soln] |
| L02 | 24 Jan (Wed) | 10:00–11:00 | MR5 | Lecture 2: Model selection |
| P02 | 26 Jan (Fri) | 10:00–11:00 | MR5/CATAM Room | Practical 2 [text] [soln] |
| L03 | 31 Jan (Wed) | 10:00–11:00 | MR5 | Lecture 3: Overdispersion |
| P03 | 31 Jan (Wed) | 15:00–16:00 | MR4/CATAM Room | Practical 3 [text] [soln] |
| L04 | 2 Feb (Fri) | 10:00–11:00 | MR5 | Lecture 4: Mixed effect models |
| P04 | 5 Feb (Mon) | 10:00–11:00 | MR5/CATAM Room | Practical 4 [text] [soln] |
| L05 | 9 Feb (Fri) | 10:00–11:00 | MR5 | Lecture 5: Regularised regression |
| P05 | 9 Feb (Fri) | 14:00–15:00 | MR4/CATAM Room | Practical 5 [text] [soln] |
| L06 | 12 Feb (Mon) | 10:00–11:00 | MR5 | Lecture 6: Linear methods for classification |
| P06 | 14 Feb (Wed) | 10:00–11:00 | MR5/CATAM Room | Practical 6 [text] [soln] |
| L07 | 16 Feb (Fri) | 10:00–11:00 | MR5 | Lecture 7: Support vector machines |
| P07 | 19 Feb (Mon) | 10:00–11:00 | MR5/CATAM Room | Practical 7 [text] [soln] |
| L08 | 21 Feb (Wed) | 10:00–11:00 | MR5 | Lecture 8: Neural networks |
| L09 | 23 Feb (Fri) | 10:00–11:00 | MR5 | Lecture 9: Nearest neighbour classifiers |
| P08 | 26 Feb (Mon) | 10:00–11:00 | MR5/CATAM Room | Practical 8 |
| P09 | 28 Feb (Wed) | 10:00–11:00 | MR5/CATAM Room | Practical 9 |
| L10 | 2 Mar (Fri) | 10:00–11:00 | MR5 | Lecture 10: Time series: ARIMA models |
| P10 | 5 Mar (Mon) | 10:00–11:00 | MR5/CATAM Room | Practical 10 |
| L11 | 7 Mar (Wed) | 10:00–11:00 | MR5 | Lecture 11: Time series: estimation and forecast |
| P11 | 9 Mar (Fri) | 10:00–11:00 | MR5/CATAM Room | Practical 11 |
| L12 | 12 Mar (Mon) | 10:00–11:00 | MR5 | Lecture 12: Spatial statistics |
| P12 | 14 Mar (Wed) | 10:00–11:00 | MR5/CATAM Room | Practical 12 |