In epidemiology, the methodologies for causal inference and prediction modeling have historically been distinct. Directed Acyclic Graphs (DAGs) are used to model a priori causal assumptions and inform variable selection strategies for causal questions. Although tools originally designed for prediction are finding applications in causal inference, the converse, applying causal tools to prediction, has remained largely unexplored. The aim of this theoretical and simulation-based study is to assess the potential benefit of using DAGs in clinical risk prediction modeling.
The results show that a single-predictor model in the causal direction is likely to have better transportability than one in the anticausal direction in some scenarios. We empirically show that the Markov Blanket, the set of variables including the parents, children, and parents of the children of the outcome node in a DAG, is the optimal set of predictors for that outcome.
These findings provide a theoretical basis for the intuition that a diagnostic clinical risk prediction model including causes as predictors is likely to be more transportable. Furthermore, using DAGs to identify Markov Blanket variables may be a useful, efficient strategy to select predictors in clinical risk prediction models if strong knowledge of the underlying causal structure exists or can be learned.
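As an illustration of the predictor selection strategy described above, the following minimal Python sketch reads the Markov Blanket of an outcome node off a DAG, using the definition given above (parents, children, and parents of the children). The function name `markov_blanket`, the parent-set representation of the graph, and the toy variable names are assumptions made for this example; they are not taken from the cited studies.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov Blanket of `target` in a DAG.

    `parents` maps every node to the set of its direct causes (parents).
    The blanket is the union of the target's parents, its children,
    and the other parents of those children.
    """
    direct_parents = set(parents.get(target, set()))
    # Children are all nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Collect every parent of every child (includes the target itself).
    co_parents: Set[str] = set()
    for child in children:
        co_parents |= parents[child]
    return (direct_parents | children | co_parents) - {target}


# Hypothetical toy DAG, for illustration only (not a clinically validated graph).
toy_dag = {
    "age": set(),
    "genotype": set(),
    "medication": set(),
    "income": set(),                          # unrelated to the outcome
    "disease": {"age", "genotype"},           # outcome node
    "biomarker": {"disease", "medication"},   # child of the outcome
    "symptom": {"biomarker"},                 # grandchild of the outcome
}

blanket = markov_blanket(toy_dag, "disease")
print(sorted(blanket))  # ['age', 'biomarker', 'genotype', 'medication']
```

In this toy graph, `symptom` is excluded because it is only a grandchild of the outcome, whereas `medication` is included because it is a parent of the outcome's child `biomarker`.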
In a current application, we have proposed a causal framework to investigate the transportability of prediction models for Alzheimer's disease in simulated external settings that differ in the distribution of demographic and clinical characteristics. In an ongoing follow-up project, we are investigating the transportability of prediction models for Alzheimer's disease empirically, using populations from studies in the US and South Korea.
In addition, we are working on prognostic clinical risk prediction models for endometriosis. We are performing a systematic review of existing prediction models, externally validating them on data from UK Biobank, Mount Sinai, health insurance datasets, and NAKO, and then updating and further developing them.
References:
- Piccininni M, Konigorski S, Rohmann JL, Kurth T (2020). Directed Acyclic Graphs and causal thinking in clinical risk prediction modeling. BMC Medical Research Methodology 20:179. https://doi.org/10.1186/s12874-020-01058-z
- Fehr J, Piccininni M, Kurth T, Konigorski S (2023). Assessing the transportability of clinical prediction models for cognitive impairment using causal models. BMC Medical Research Methodology 23:187. https://doi.org/10.1186/s12874-023-02003-6
Team:
Collaboration partners: