Abstract: Most studies in medicine, economics, and social science are motivated by causal questions rather than associational ones. Examples include evaluating the effect of a treatment on time to recovery and the effect of school resources on student achievement. Randomized experiments are considered the “gold standard” for estimating causal effects. However, economic and ethical constraints often make randomized experiments infeasible, so that only observational data are available. Despite their potential, observational data have a major limitation: the presence of confounding factors, i.e., factors that are related to both the treatment and the outcome under study. Inverse probability weighting methods, which control for confounding by weighting each subject by the inverse of their probability of being treated given covariates, have been widely used to estimate causal effects from observational data. However, these methods are highly sensitive to misspecification of the treatment assignment model and can yield imprecise estimates because of extreme weights.
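
As a rough illustration of the weighting idea described above, the following minimal sketch simulates confounded data and compares a naive difference in means with a normalized inverse-probability-weighted estimate. Everything in it is an assumption made for illustration and is not taken from the talk: the simulated data, the true effect of 2.0, the function name ipw_ate, and above all the use of the true treatment probability e, which in practice is unknown and must be estimated. It follows the usual convention of weighting treated subjects by 1/e and untreated subjects by 1/(1 - e).

```python
import numpy as np

def ipw_ate(y, t, e):
    """Normalized inverse-probability-weighted estimate of the average treatment effect."""
    w1 = t / e                # weights for treated subjects, zero for untreated
    w0 = (1 - t) / (1 - e)    # weights for untreated subjects, zero for treated
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)                        # a single confounder
e = 1 / (1 + np.exp(-x))                      # true probability of treatment given x (assumed known)
t = rng.binomial(1, e)                        # treatment assignment depends on x
y = 2.0 * t + 3.0 * x + rng.normal(size=n)    # outcome; the true treatment effect is 2.0

naive = y[t == 1].mean() - y[t == 0].mean()   # ignores confounding by x
weighted = ipw_ate(y, t, e)
max_weight = np.max(np.where(t == 1, 1 / e, 1 / (1 - e)))

print(f"naive difference in means: {naive:.2f}")
print(f"IPW estimate:              {weighted:.2f}")
print(f"largest single weight:     {max_weight:.1f}")
```

In this simulated setting the naive comparison overstates the effect because x raises both the chance of treatment and the outcome, while the weighted estimate should land close to 2.0. The largest weight hints at the precision problem: a handful of subjects with very small treatment probabilities can dominate the estimate.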

In this talk, I will present recent and ongoing optimization-based approaches that address
these limitations and provide optimal weights for cross-sectional (at a specific point in
time) and longitudinal (over time) observational data. I will apply these methods to
evaluate the effect of treatment initiation on treatment efficacy in patients infected with
human immunodeficiency virus. I will conclude by discussing possible connections between
the biostatistics literature on causal inference and the literature on reinforcement
learning.

Bio: Michele Santacatterina is a postdoctoral researcher at the Cornell TRIPODS Center for
Data Science for Improved Decision Making. He received a PhD in biostatistics from
Karolinska Institutet, Sweden. His research centers on the development and application of
statistical methods for optimal decision making using experimental and observational data.