Who dies from COVID-19? Post-hoc explanations of mortality prediction models using coalitional game theory, surrogate trees, and partial dependence plots
Abstract
As of early June 2020, approximately 7 million COVID-19 cases and 400,000 deaths have been reported. This paper examines four demographic and clinical factors (age, time to hospital, presence of chronic disease, and sex) and uses Shapley values from coalitional game theory, together with machine learning, to evaluate their relative importance in predicting COVID-19 mortality. The analyses suggest that, of the four factors studied, age is the most important in predicting COVID-19 mortality, followed by time to hospital. Sex and presence of chronic disease were both found to be relatively unimportant, and the two global interpretation techniques ranked them differently. Additionally, this paper uses partial dependence plots to visualize the marginal effect of each factor on predicted COVID-19 mortality and demonstrates how local interpretation of mortality predictions can be applied in a clinical setting. Lastly, this paper derives clinically applicable decision rules about mortality probabilities from a parsimonious 3-split surrogate tree, demonstrating that high-accuracy COVID-19 mortality prediction can be achieved with simple, interpretable models.
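To make the coalitional-game idea concrete, the following minimal sketch computes exact Shapley values for a four-player game whose players are the four factors named in the abstract. The payoff function `toy_value` and the numbers in `TOY_WEIGHTS` are hypothetical and chosen only for illustration; they are not taken from the paper's data, model, or results, and the paper's actual computation may differ.

```python
from itertools import combinations
from math import factorial

# The four factors named in the abstract, treated as players in a game.
FEATURES = ["age", "time_to_hospital", "chronic_disease", "sex"]

# Hypothetical stand-alone contributions (illustrative only, NOT from the paper).
TOY_WEIGHTS = {
    "age": 0.30,
    "time_to_hospital": 0.15,
    "chronic_disease": 0.03,
    "sex": 0.02,
}


def toy_value(coalition):
    """Hypothetical payoff of a coalition of features.

    In a real analysis this would be derived from a trained model's
    predictions when only these features are known; here it is a
    made-up function.
    """
    total = sum(TOY_WEIGHTS[f] for f in coalition)
    # Made-up interaction: age and time to hospital together add extra payoff.
    if "age" in coalition and "time_to_hospital" in coalition:
        total += 0.10
    return total


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition of the other players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        contribution = 0.0
        for size in range(n):
            # Weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                contribution += weight * (value(s | {p}) - value(s))
        phi[p] = contribution
    return phi


phi = shapley_values(FEATURES, toy_value)
ranking = sorted(phi, key=phi.get, reverse=True)
print(ranking)
```

In this toy game the interaction payoff is split equally between the two interacting players, and the Shapley values sum to the payoff of the full coalition (the efficiency property). Exact enumeration is feasible here because four players yield only 16 coalitions; with many features, approximation methods are generally needed.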