Rajen Shah


R.Shah@statslab.cam.ac.uk
Room: D1.15
Office phone: +44 1223 765923

I am a Lecturer in Statistics at the Statistical Laboratory, which is part of the Department of Pure Mathematics and Mathematical Statistics at the University of Cambridge. I am also a fellow of the Alan Turing Institute. My research interests include high-dimensional statistics and large-scale data analysis.

Editorial Service

  • Associate Editor for JRSSB.

Research

Core Statistical Publications and Preprints

  • Shah, R. D. and Meinshausen, N. (2018) RSVP-graphs: Fast High-dimensional Covariance Matrix Estimation under Latent Confounding. Preprint. (.pdf).
  • Shah, R. D. and Peters, J. (2018) The hardness of conditional independence and the generalised covariance measure. Preprint. (.pdf).
  • Thanei, G., Meinshausen, N. and Shah, R. D. (2018) The xyz algorithm for fast interaction search in high-dimensional data. JMLR, 19, 1-42. (.pdf).
  • Shah, R. D. and Meinshausen, N. (2018) On b-bit min-wise hashing for large-scale regression and classification with sparse data. JMLR, 18, 1-42. (.pdf)
  • Shah, R. D. and Bühlmann, P. (2017) Goodness-of-fit tests for high-dimensional linear models. J. Roy. Statist. Soc., Ser. B. (.pdf) Software: RPtests.
  • Shah, R. D. (2016) Modelling interactions in high-dimensional data with Backtracking. JMLR, 17, 1-31. (.pdf) Software: LassoBacktracking.
  • Shah, R. D. and Samworth, R. J. (2015) Invited discussion of An adaptive resampling test for detecting the presence of significant predictors by McKeague, I. W. and Qian, M. J. Amer. Statist. Assoc., 110, 1439-1442. (.pdf)
  • Chen, Y., Shah, R. D. and Samworth, R. J. (2014) Discussion of Multiscale change point inference by Frick, K., Munk, A. and Sieling, H. J. Roy. Statist. Soc., Ser. B, 76, 544-546. (.pdf)
  • Shah, R. D. and Meinshausen, N. (2014) Random Intersection Trees. JMLR, 15, 629-654. (.pdf) Software: FSInteract.
  • Shah, R. D. and Samworth, R. J. (2013) Invited discussion of Correlated variables in regression: clustering and sparse estimation by Bühlmann, Rütimann, van de Geer and Zhang. Journal of Statistical Planning and Inference, 143, 1866-1868. (.pdf)
  • Shah, R. D. and Samworth, R. J. (2013) Variable selection with error control: Another look at Stability Selection. J. Roy. Statist. Soc., Ser. B, 75, 55-80. (.pdf) Associated R code.
  • Shah, R. D. and Samworth, R. J. (2010) Discussion of Stability Selection by Meinshausen and Bühlmann. J. Roy. Statist. Soc., Ser. B, 72, 455-456. (.pdf)

Interdisciplinary Publications

  • Mitchell, P. D., Brown, R., Wang, T., Shah, R. D., Samworth, R. J., Deakin, S., Edge, P., Hudson, I., Hutchinson, R., Kaur, K., Lacey, E.-K., Latimer, M., Natarajan, R., Qasim, S., Rehm, A., Sanghrajka, A., Tissingh, E. and Wright, G. (2019) Multi-centre study of non-accidental injury and limb fractures in young children in the East Anglia region, UK. Archives of Disease in Childhood, to appear.
  • Bødker, J. S., Brøndum, R. F., Schmitz, A., Schönherz, A. A., Jespersen, D. S., Sønderkær, M., Vesteghem, C., Due, H., Nøgaard, C. H., Perez-Andres, M., Samur, M. K., Davies, F., Walker, B., Pawlyn, C., Kaiser, M., Johnson, D., Bertsch, U., Broyl, A., van Duin, M., Shah, R., Johansen, P., Nøgaard, M. A., Samworth, R. J., Sonneveld, P., Goldschmidt, H., Morgan, G. J., Orfao, A., Munshi, N., El-Galaly, T., Dybkær, K. and Bøgsted, M. (2018) A multiple myeloma classification system that associates normal B-cell subset phenotypes with prognosis. Blood Advances, 2, 2400-2411.
  • Dybkær, K., Bøgsted, M., Falgreen, S., Bødker, J. S., Kjeldsen, M. K., Schmitz, A., Bilgrau, A. E., Xu-Monette, Z. Y., Li, L., Bergkvist, K. S., Laursen, M. B., Rodrigo-Domingo, M., Marques, S. C., Rasmussen, S. B., Nyegaard, M., Gaihede, M., Møller, M. B., Samworth, R. J., Shah, R. D., Johansen, P., El-Galaly, T. C., Young, K. H. and Johnsen, H. E. (2015) A diffuse large B-cell lymphoma classification system that associates normal B-cell subset phenotypes with prognosis. J. Clinical Oncology, 33, 1379-1388.

Other notes

  • Sparsity. Workshop on Multivariate Analysis Today 2015. (.pdf) Some slides can be found here.
  • High-dimensional data and the Lasso. Eureka 62, 2013 (.pdf)

Teaching