Statistical Laboratory Seminars


Michaelmas Term 1998

Room S27, Statistical Laboratory
University of Cambridge
16 Mill Lane, Cambridge, CB2 1SB


Tel: (01223) 337958 Fax: (01223) 337956
Email: secretary@statslab.cam.ac.uk

All interested are welcome


Wednesday, 14 October

2.00 p.m.

Paul Embrechts (ETH, Zurich)

RANDOM RECURRENCE EQUATIONS AND EXTREMES


Friday, 16 October [TWO seminars]

2.00 p.m.

Matthew Stephens (Oxford)

BAYESIAN ANALYSIS OF MIXTURES WITH AN UNKNOWN NUMBER OF COMPONENTS - AN ALTERNATIVE TO REVERSIBLE JUMP METHODS

Mixture distributions are typically used to model data in which each observation is assumed to have arisen from one of a number of different groups. They also provide a convenient and flexible class of models for density estimation. The analysis of such models has a long and distinguished history, and continues to attract interest and present difficulties. In this talk we will briefly review some of these difficulties before concentrating on the particularly difficult problem of deciding how many components to use in a mixture model. Richardson and Green (1997) take a Bayesian approach to this problem, using Markov chain Monte Carlo (MCMC) with the ``reversible jump'' methodology described by Green (1995). We describe an alternative MCMC method which views the parameters of the model as a (marked) point process, extending methods suggested by Ripley (1977) to create a continuous time Markov birth-death process with an appropriate stationary distribution. We illustrate our method on both univariate and bivariate data, make some comparisons with the reversible jump methodology, and describe some general difficulties with choosing suitable priors for the model parameters.

KEYWORDS: Bayesian methods, Birth-death process, Markov chain Monte Carlo, Mixture model.

3.30 p.m.

Mark Kelbert (Swansea)

RANGE OF FLUCTUATIONS AND INTERSECTION PROPERTIES OF BROWNIAN MOTION ON RIEMANNIAN MANIFOLDS

In 1914, Hardy and Littlewood studied the upper function for sums of i.i.d. Bernoulli random variables. In 1923, Khinchin improved their result and obtained what is now called the law of the iterated logarithm. In view of Khinchin's result, Hardy and Littlewood's estimate has long been considered to be of only historical value. However, we have recently proved (in joint work with Grigor'yan, Imperial College) that a Hardy-type inequality is valid for Brownian motion on manifolds from a rather general class of bounded geometry and polynomial volume growth. More interestingly, this estimate is sharp on this class. Moreover, we generalize some well-known results of Dvoretzky, Erdős and Kakutani (1950-1952) on the non-intersection of Brownian traces. We find conditions for their separation in terms of the long-time decay of the heat kernel on manifolds.


Friday, 23 October

2.05 p.m.

Tobias Ryden (Lund University)

HIDDEN MARKOV MODELS: INFERENCE AND APPLICATIONS

A hidden Markov model (HMM) consists of a non-observable finite state Markov chain $(X_n)$ and an observable process $(Y_n)$ such that the distribution of $Y_n$ is governed by $X_n$. HMMs have been applied in a wide range of areas including neurophysiology, hydrology, economics, finance, signal processing and communication systems. In this talk I will outline some fundamental properties of HMMs, and discuss statistical problems they generate and the solutions to some of these problems.
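
As a purely illustrative aside (not taken from the talk), the definition above can be sketched in a few lines of Python. The sketch assumes Gaussian observations with a state-dependent mean and a common standard deviation; the function names and parameter values are hypothetical. It simulates $(X_n, Y_n)$ and evaluates the log-likelihood of the observations by the standard scaled forward recursion.

```python
import math
import random


def simulate_hmm(n, init, trans, means, sd, rng):
    """Draw hidden states X_1..X_n and observations Y_1..Y_n.

    Assumption for illustration: Y_n given X_n = i is Normal(means[i], sd^2).
    """
    k = len(init)
    states, obs = [], []
    x = rng.choices(range(k), weights=init)[0]
    for _ in range(n):
        states.append(x)
        obs.append(rng.gauss(means[x], sd))
        # Move the hidden chain one step using row x of the transition matrix.
        x = rng.choices(range(k), weights=trans[x])[0]
    return states, obs


def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(obs, init, trans, means, sd):
    """Scaled forward recursion: returns log p(y_1, ..., y_n)."""
    k = len(init)
    alpha = [init[i] * normal_pdf(obs[0], means[i], sd) for i in range(k)]
    total = sum(alpha)
    loglik = math.log(total)
    alpha = [a / total for a in alpha]
    for y in obs[1:]:
        # Predict the next hidden state, then weight by the emission density.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(k))
            * normal_pdf(y, means[j], sd)
            for j in range(k)
        ]
        total = sum(alpha)
        loglik += math.log(total)
        alpha = [a / total for a in alpha]
    return loglik


if __name__ == "__main__":
    rng = random.Random(1998)
    init = [0.5, 0.5]
    trans = [[0.9, 0.1], [0.2, 0.8]]
    states, obs = simulate_hmm(200, init, trans, [0.0, 3.0], 1.0, rng)
    print(log_likelihood(obs, init, trans, [0.0, 3.0], 1.0))
```

Rescaling the forward variables at each step avoids numerical underflow for long observation sequences; the log-likelihood is recovered as the sum of the logged scaling constants.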


Friday, 30 October [TWO seminars]

2.05 p.m.

David Siegmund (Stanford)

SEARCHING FOR SIGNALS AGAINST A NOISY BACKGROUND

Problems in medical imaging, astrophysics, gene mapping, and sequence alignment involve observations of a random field $Z_u$ for $u\in U$, where $U$ is a subset of $r$-dimensional Euclidean space (often $r = 1$, 2, or 3). Over most of the space $U$, $Z_u$ is random noise; but in a relatively small part of the space it may contain a signal that one wants to detect and locate as precisely as possible. A natural formulation leads to a class of irregular statistical problems, where standard large sample maximum likelihood theory and the asymptotic equivalence of maximum likelihood and Bayesian procedures may not hold.

In this talk I will describe the scientific background of some of these problems and show that their common features make it possible to develop a unified theoretical approach for signal detection and localization. In particular, I will describe approximations for the p-value of the likelihood ratio test for signal detection, which involves the maximum of a random field. An important distinction is between processes where $Z_u$ has smooth sample paths and processes where it does not. While probabilistic methods play an important role in both cases, geometric methods can be particularly useful in the first case.

3.30 p.m.

Marie South (Zeneca Manufacturing Partnership)

WHERE THEORY MEETS PRACTICE - MAKING STATISTICS MAKE SENSE IN INDUSTRY

Developing a career in statistics is a little like learning to drive a car - much of the important learning takes place after you've passed the qualifying test. This talk explores the opportunities and challenges facing statisticians working in industry, and the skills needed which you're unlikely to learn as part of a university course. The talk will be illustrated with industrial case studies, supplemented with experiences taken from work in medical research and consultancy for charity. There will be no new formulae or theory, but plenty to make you think about life after an M.Phil. in Statistics!


Friday, 13 November

2.05 p.m.

Paul Damien (U. of Michigan Business School)

BAYESIAN NONPARAMETRIC INFERENCE FOR RANDOM DISTRIBUTIONS AND RELATED FUNCTIONS

In recent years Bayesian nonparametric inference, both theoretical and computational, has witnessed considerable advances. In this talk, we will discuss and illustrate the rich modelling and analytic possibilities available to the statistician within the Bayesian nonparametric and/or semiparametric framework.


Friday, 20 November [TWO seminars]

2.05 p.m.

Sergei Zuyev (Strathclyde)

STOCHASTIC GEOMETRY AND MODELLING OF COMMUNICATIONS NETWORKS

We discuss an approach to modelling the architecture of large telecommunication networks based on stochastic geometry, developed over the last few years under a research contract between France Telecom and INRIA. We will show how recent tools of stochastic geometry and point processes allow for an analytical treatment of rather complex models of networks, stationary or mobile. A variation analysis technique developed for Poisson processes gives rise to new steepest descent algorithms that may be useful for numerical estimation of the optimal configuration in cases where analytical results are hard to obtain.

3.30 p.m.

Damon Wischik (Cambridge)

SAMPLE PATH LARGE DEVIATIONS FOR QUEUEING NETWORKS WITH MANY TRAFFIC FLOWS

Large deviations theory can be used to estimate the probability of rare events, such as buffer overflow, in queueing networks. It is simple enough to be applied to very general traffic models, and sophisticated enough to give insight into complex behaviours. In this talk I will give an abstract large deviations principle for the average of many independent processes, and apply it to queues with many input flows. Queues with finite and infinite buffers, priority queues, and most likely paths to overflow will be described. I will also show how the analysis of networks of queues is made surprisingly simple.


Friday, 27 November [TWO seminars]

2.05 p.m.

R.A. Goldstein (Michigan and Isaac Newton Institute)

EVOLUTIONARY PERSPECTIVES ON PROTEIN FOLDING AND STRUCTURE

Proteins are the result of a long evolutionary process; in order to understand their properties, we have to address this evolutionary heritage. One constraint placed on all proteins is that they must be able to fold into a compact, regular shape, a non-trivial process. We have developed simple theoretical and computational models to understand how protein properties are determined from this constraint. For instance, by considering protein structure designability, we can understand why certain protein structures are so over-represented among biological proteins. We can also understand why natural proteins are marginally stable and marginally foldable, and why they generally fold into the conformation of lowest free energy. These results can be substantially affected by the population-dynamical aspects of evolution.

3.30 p.m.

Sergei Novak (Sussex)

GENERALISED KERNEL DENSITY ESTIMATOR

We introduce a new class of nonparametric density estimators. It includes the classical kernel density estimators as well as the popular Abramson estimator. The asymptotics of the mean squared error, the optimal kernel and the smoothing parameter are found.
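
For orientation only, the two familiar members of this class can be sketched as follows. This is not the generalised estimator of the talk, whose form is not given here. The sketch assumes a Gaussian kernel, and the Abramson estimator is written in its usual square-root form, with local bandwidth $h / \sqrt{f(X_i)}$ and the unknown density $f$ replaced by a fixed-bandwidth pilot estimate. The function names, the pilot bandwidth and the data are hypothetical.

```python
import math


def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, h):
    """Classical kernel density estimate at x with fixed bandwidth h."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)


def abramson_kde(x, data, h, pilot_h):
    """Adaptive estimate at x with local bandwidth h / sqrt(pilot density).

    Assumption for illustration: the pilot is the fixed-bandwidth estimate
    above, used without the clipping or normalisation often applied in practice.
    """
    total = 0.0
    for xi in data:
        # The pilot estimate at a data point is strictly positive because
        # the kernel centred at that point contributes to it.
        scale = math.sqrt(kde(xi, data, pilot_h))
        total += scale * gaussian_kernel((x - xi) * scale / h)
    return total / (len(data) * h)


if __name__ == "__main__":
    sample = [-1.2, -0.4, 0.0, 0.3, 0.9, 2.1]
    for x in (-1.0, 0.0, 1.0, 2.0):
        print(x, kde(x, sample, 0.5), abramson_kde(x, sample, 0.5, 0.5))
```

The adaptive version narrows the kernel where the pilot density is high and widens it in the tails; each summand still integrates to one, so the estimate remains a probability density.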


Friday, 4 December

2.05 p.m.

David Balding (Reading and Isaac Newton Institute)

INFERENCES ABOUT POPULATION HISTORY FROM DNA SEQUENCE DATA

This is an exciting time for those interested in what can be `read from the genes' about histories of humans and other species. Two key developments are now coming together: (1) appropriate stochastic models for the genealogical trees underlying samples of DNA sequences, and (2) MCMC technology in general, but also specifically for parameter spaces involving trees. I will review and illustrate these developments. The bad news is that the underlying reality is so complex that strong modelling assumptions are required, and even then there is typically large uncertainty about inferences of interest. However, some interesting inferences can be made which seem to be reasonably robust to plausible modelling assumptions.


Susan Pitts, Organizer