Posts

An introduction to AI agents

What are AI agents?

  • Wikipedia definition: AI agents are a class of intelligent agents that can pursue goals, use tools, and take actions with varying degrees of autonomy.
  • My definition in 2026: AI agents are the interface between humans and foundation models.

How are AI agents different from using web chats?

  • Ability to use local tools (e.g., read, edit, grep, bash) to edit your local files or run bash commands such as Rscript agent_written_r_script.R.
  • Delegate specific projects to specialist subagents.
  • More consistent prompts via skills.

Should I use AI agents?

Compared to just using chat-boxes, using AI agents raises additional risks:

Bounding the approximate interventionist direct effect using causal mediation analysis

We show that the usual estimand for the natural direct effect lower bounds an interventionist direct effect even if the treatment is not separable.

Does causal mediation analysis need to have an interventionist interpretation?

At the end of Vanessa Didelez’s talk at the Foundations of causal inference workshop at the Isaac Newton Institute today, I asked her whether she would be okay with a mediation publication that only discusses a potential interventionist interpretation in the Discussion section of the paper. The immediate reaction from Vanessa was that she does not like this but will not police it. In the discussion afterward, Thomas Richardson showed more disagreement with me. When I raised the possibility that requiring scientists to specify separable components of the treatment would make us unpopular, Thomas responded that “when I got my academic training we were not asked to be popular” (or something like this). Provoked by this comment, I decided to write a post to articulate my point.

PhD studentship opportunity: Causal Inference in Developmental Psychology

This PhD will explore how recent advancements in statistics and data science can be integrated to modernise developmental research. Specific topics include:

  1. Theory of causal identification and inference based on potential outcomes and graphical models highlights the importance of study design (how data are collected and preprocessed).
  2. Research in epidemiology and biostatistics (e.g. Robins’ g-methods and dynamic treatment regimes) and economics (e.g. instrumental variables, difference-in-differences) provides new methods for observational studies.
  3. Machine learning provides new tools to train flexible statistical models with complex data (possibly unstructured, such as images and texts).

These advancements are underutilized in developmental psychology but hold great potential. Applying advanced statistical methods can transform our understanding of the mechanisms and complex interplay of different factors in psychological and cognitive development.

Macros for graph separations

In a recent paper by Richard Guo and me, we proposed a systematic way to select confounders by eliciting expert opinion. We invented some new notation to distinguish some different types of graph connections/separations. Macros for the new notation can be found in the file below:

Discussion on (quasi-)randomization inference

The following is an email exchange with Hyunseung Kang on the nature of randomization inference. This was sparked by a couple of papers by David Freedman and David Lane: A Nonstochastic Interpretation of Reported Significance Levels; Significance Testing in a Nonstochastic Setting.

Amusing counterfactual inference (by words)

My good friend Joshua Loftus and I spent some 30 minutes cracking (at least we think we did!) a counterfactual inference made in a speech in the House of Commons in London in 1850 by Lord Palmerston, who was the Secretary of State for Foreign Affairs at the time.

The origin of randomization

This post is derived from my talk “Fisher, Statistics, and Randomization” at the Fisher in the 21st Century Conference organized by Fisher’s College, Gonville & Caius. In the first half of that talk, I tried to trace the origin of randomization. Fisher is widely credited as the person who first advocated randomization in a systematic manner. In doing so, he profoundly changed how modern science is done.

The philosophy behind hypothesis testing

I read a few interesting articles this week on the Fisher-Neyman debate on the foundation of hypothesis testing:

  1. The Fisher, Neyman-Pearson Theories of Testing Hypotheses: One Theory or Two?
  2. Rigorous uncertainty: why RA Fisher is important.
  3. Models and statistical inference: The controversy between Fisher and Neyman–Pearson.

The first paper is written by Erich Lehmann and argues that the practical aspects of the Fisher and Neyman-Pearson approaches to testing statistical hypotheses are “complementary rather than contradictory”. I agree with this verdict, but I think what is more interesting and useful for modern statisticians is the basic philosophical differences. Lehmann summarized this as “inductive inference versus inductive behaviour”:

Statistical Modeling: Returning to its roots

Over this Easter weekend, I wrote the following commentary for the reprinting of Leo Breiman’s paper “Statistical Modeling: The Two Cultures” by Observational Studies. This is partly based on a talk I gave last year.