Teaching - Modern Statistical Methods


The remarkable development of computing power and other technology now allows scientists and businesses to routinely collect datasets of immense size and complexity. Most classical statistical methods were designed for situations with many observations and a few, carefully chosen variables. However, we now often gather data with huge numbers of variables, in an attempt to capture as much information as we can about anything that might conceivably influence the phenomenon of interest. This dramatic increase in the number of variables makes modern datasets strikingly different: well-established traditional methods often either perform very poorly or do not work at all.

Developing methods that can extract meaningful information from these large and challenging datasets has recently been an area of intense research in statistics, machine learning and computer science. In this course, we will study some of the methods that have been developed to analyse such datasets.


Announcements

Arrangements for examples classes are given here.

We will have an extra lecture on Wednesday 28 November at 2pm in MR9.

Initial meetings for Part III essays will be held on Wednesday 28 November in MR21.

Timings are 3.30pm for 'Statistical inference using machine learning methods' and 4pm for 'Recent developments in false discovery rate control'.


Resources


Code for Demonstrations

The code for the demonstrations is written in R. RStudio is a useful editor for R. Here are some introductory worksheets on R: Sheet 1 (solutions); Sheet 2 (solutions). The code for the demonstrations is given below.


Example Sheets


Comments and Questions