Causal Inference with Observational Data: Common Designs and Statistical Methods (2025)
Course description
Observational studies are non-interventional empirical investigations of causal effects, and they play an increasingly vital role in healthcare decision making in the era of data science. Because treatment is not randomized, study design is particularly important when planning an observational study. Aspects of design include defining the objectives and context under investigation, collecting the right data, and choosing suitable strategies to remove bias from measured and unmeasured confounders. The statistical analysis should also align with the design.
This module covers key concepts and useful methods for designing and analyzing observational studies in 6 sessions:
- Randomized experiments and randomization inference.
- Matching and balancing weights for observational studies.
- Sensitivity analysis; Intro to DAG models.
- Estimation methods under no unmeasured confounding.
- Instrumental variables & Mendelian randomization.
- Difference-in-differences; time-varying exposures.
The target audience for this module includes:
- clinical researchers who need to use observational data to generate evidence of causality;
- biostatisticians who are interested in understanding how causal inference can be reliably made in practice.
A background in statistical inference and some knowledge of R are recommended.
General information
- Instructors: Ting Ye, Qingyuan Zhao.
- Teaching assistant: Yuhan Qian.
- Time: July 23-25, 2025.
- SISCER page.
- You should have access to the Slack channel for this module. If not, please contact us.
- Lectures will be delivered via Zoom and be recorded. The recordings will be posted to the Slack channel. Practical sessions will not be recorded.
Teaching materials
- Day 1: Randomization inference; Matching & balancing weights. (Logistics; Lecture 1; Practical 1; pmed.1002412.s002.RData; Lecture 2; Practical 2; optmatch.R; nhanesi_class_dataset.csv)
- Day 2: Sensitivity analysis; Introduction to DAG models; Estimation methods under no unmeasured confounding. (Lecture 3; Practical 3; Lecture 4; Practical 4)
- Day 3: Confounder selection; Instrumental variables; Difference-in-differences; Time-varying treatments. (Lecture 5; Practical 5; Lecture 6; Practical 6)
Computing environment
Before the module starts, please ensure that you have installed the latest version of R. We also recommend you use an integrated development environment like RStudio.