Rajen Shah

Room: D1.15
Office phone: +44 1223 765923

I am a Lecturer in Statistics at the Statistical Laboratory, which is part of the Department of Pure Mathematics and Mathematical Statistics at the University of Cambridge. I am also a fellow of the Alan Turing Institute. My research interests include high-dimensional statistics and large-scale data analysis.

Editorial Service

  • Associate Editor for JRSSB.

Publications and Preprints

  • Shah, R. D. and Meinshausen, N. (2018) RSVP-graphs: Fast High-dimensional Covariance Matrix Estimation under Latent Confounding. Preprint. (.pdf).
  • Shah, R. D. and Peters, J. (2018) The hardness of conditional independence and the generalised covariance measure. Preprint. (.pdf).
  • Thanei, G., Meinshausen, N. and Shah, R. D. (2018) The xyz algorithm for fast interaction search in high-dimensional data. JMLR, 19, 1-42. (.pdf).
  • Shah, R. D. and Meinshausen, N. (2018) On b-bit min-wise hashing for large-scale regression and classification with sparse data. JMLR, 18, 1-42. (.pdf)
  • Shah, R. D. and Bühlmann, P. (2017) Goodness of fit tests for high-dimensional linear models. J. Roy. Statist. Soc., Ser. B. (.pdf) Software: RPtests.
  • Shah, R. D. (2016) Modelling interactions in high-dimensional data with Backtracking. JMLR, 17, 1-31. (.pdf) Software: LassoBacktracking.
  • Shah, R. D. and Samworth, R. J. (2015) Invited discussion of An adaptive resampling test for detecting the presence of significant predictors by McKeague, I. W. and Qian, M. J. Amer. Statist. Assoc., 110, 1439-1442. (.pdf)
  • Dybkær, K., Bøgsted, M., Falgreen, S., Bødker, J. S., Kjeldsen, M. K., Schmitz, A., Bilgrau, A. E., Xu-Monette, Z. Y., Li, L., Bergkvist, K. S., Laursen, M. B., Rodrigo-Domingo, M., Marques, S. C., Rasmussen, S. B., Nyegaard, M., Gaihede, M., Møller, M. B., Samworth, R. J., Shah, R. D., Johansen, P., El-Galaly, T. C., Young, K. H. and Johnsen, H. E. (2015) A diffuse large B-cell lymphoma classification system that associates normal B-cell subset phenotypes with prognosis. J. Clinical Oncology, 33, 1379-1388.
  • Chen, Y., Shah, R. D. and Samworth, R. J. (2014) Discussion of Multiscale change point inference by Frick, K., Munk, A. and Sieling, H. J. Roy. Statist. Soc., Ser. B, 76, 544-546. (.pdf)
  • Shah, R. D. and Meinshausen, N. (2014) Random Intersection Trees. JMLR, 15, 629-654. (.pdf) Software: FSInteract.
  • Shah, R. D. and Samworth, R. J. (2013) Invited discussion of Correlated variables in regression: clustering and sparse estimation by Bühlmann, Rütimann, van de Geer and Zhang. Journal of Statistical Planning and Inference, 143, 1866-1868. (.pdf)
  • Shah, R. D. and Samworth, R. J. (2013) Variable selection with error control: Another look at Stability Selection. J. Roy. Statist. Soc., Ser. B, 75, 55-80. (.pdf) Associated R code.
  • Shah, R. D. and Samworth, R. J. (2010) Discussion of Stability Selection by Meinshausen and Bühlmann. J. Roy. Statist. Soc., Ser. B, 72, 455-456. (.pdf)
  • Shah, R. D. (2014) Topics in High-dimensional and Large-scale Data Analysis. PhD thesis. (.pdf)
  • Shah, R. D. (2010) High-dimensional variable selection. Part III Essay.

Other notes

  • Sparsity. Workshop on Multivariate Analysis Today 2015. (.pdf) Some slides can be found here.
  • High-dimensional data and the Lasso. Eureka 62, 2013 (.pdf)