TAILORED: Theoretical Foundations of Learning with Observational and Interventional Data

NNF Project Grant in the Natural and Technical Sciences, 2024-2027.
Data collection can be interventional or observational. The aim of interventional data collection is to establish causal mechanisms. However, such data are typically expensive and scarce. Conversely, observational data are typically inexpensive and abundant, but offer little insight into causal mechanisms, because without interventions it is extremely challenging to distinguish statistical correlations from causal relations.
Existing research often examines the two types of data in isolation. We aim to develop new statistical methods that integrate the two data sources in a theoretically grounded manner, mitigating the limitations of each and leveraging the strengths of both. Interventional data can help correct biases in observational data, while observational data can extend the applicability of interventional data by providing diversity and scale. Additionally, we aim to derive frameworks that use cheap observational data to guide the adaptive collection of expensive interventional data, maximizing the information gained.
The research will combine recent developments in causality and reinforcement learning. The former offers a formal approach to modeling the two forms of data collection, while the latter provides a framework for adaptive decision-making. The outcomes of the project have potential applications in healthcare, education, agroecology, and many other domains that can benefit from integrating observational and interventional data. Combining the richness of observational data with the quality of interventional data will make it possible to tailor these applications reliably.