Development and evaluation of a machine learning-based in-hospital COvid-19 Disease Outcome Predictor (CODOP): a multicontinental retrospective study
Abstract
Background
More contagious SARS-CoV-2 variants, breakthrough infections, waning immunity, and sub-optimal rates of COVID-19 vaccination account for a new surge of infections leading to record numbers of hospitalizations and deaths in several European countries. This scenario is particularly concerning for resource-limited countries, which have lower vaccination rates and fewer clinical tools to fight the next pandemic waves. There is an urgent need for clinically valuable, generalizable, and parsimonious triage tools to assist the appropriate allocation of hospital resources. We aimed to develop and extensively validate CODOP, a machine learning-based tool for accurately predicting the clinical outcome of hospitalized COVID-19 patients.
Methods
CODOP was built using modified stable iterative variable selection and linear regression with lasso regularisation. To avoid generalization problems, CODOP was trained and tested with three time-sliced and geographically distinct cohorts encompassing 40 511 blood-based analyses of COVID-19 patients from more than 110 hospitals in Spain and the USA during 2020-21. We assessed the discriminative ability of the model using the Area Under the Receiver Operating Characteristic curve (AUROC) as well as horizon and Kaplan-Meier risk stratification analyses. To account for the fluctuating pressure levels in hospitals throughout the pandemic, we offer two online CODOP calculators suited for undertriage or overtriage scenarios. We challenged their generalizability and clinical utility through an evaluation on a cohort of patients hospitalized in five hospitals from three Latin American countries.
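As an illustration of the discrimination metric named above, the following minimal Python sketch computes an AUROC from its rank-based definition: the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the CODOP data or code.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = died in hospital, 0 = discharged alive.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; a sort-based implementation would be preferable for cohorts of the size described here.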
Findings
CODOP uses 12 clinical parameters commonly measured at hospital admission and associated with the pathophysiology of COVID-19. CODOP reaches high discriminative ability up to nine days before clinical resolution (AUROC: 0·90-0·96, 95% CI 0·879-0·970), is well calibrated, and enables effective dynamic risk stratification during hospitalization. The two CODOP online calculators demonstrate their potential for triage decisions when challenged with the distinct Latin American evaluation cohorts (73-100% sensitivity and 84-100% specificity).
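The sensitivity and specificity figures above depend on where the risk threshold is placed. The sketch below shows that trade-off on hypothetical toy data, assuming that an overtriage-oriented setting uses a lower threshold (favouring sensitivity) and an undertriage-oriented setting uses a higher one (favouring specificity). The thresholds, data, and function name are illustrative assumptions and do not reproduce the CODOP calculators.

```python
def sensitivity_specificity(labels, risks, threshold):
    """Classify a patient as high risk when risk >= threshold and
    return (sensitivity, specificity) against the true outcomes."""
    tp = sum(1 for y, r in zip(labels, risks) if y == 1 and r >= threshold)
    fn = sum(1 for y, r in zip(labels, risks) if y == 1 and r < threshold)
    tn = sum(1 for y, r in zip(labels, risks) if y == 0 and r < threshold)
    fp = sum(1 for y, r in zip(labels, risks) if y == 0 and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outcomes (1 = died, 0 = survived) and predicted risks.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_risks = [0.90, 0.60, 0.30, 0.50, 0.20, 0.10, 0.05, 0.40]

# Assumed thresholds, chosen only to illustrate the trade-off.
for name, threshold in [("low threshold", 0.25), ("high threshold", 0.55)]:
    sens, spec = sensitivity_specificity(toy_labels, toy_risks, threshold)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this toy data, lowering the threshold flags every patient who died at the cost of more false alarms, while raising it removes the false alarms but misses one death.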
Interpretation
The high predictive performance of CODOP in geographically dispersed patient cohorts and its ease of use strongly suggest its clinical utility as a global triage tool, particularly in resource-limited countries.
Funding
The Max Planck Society.
Research in context
Evidence before this study
We searched PubMed for articles describing in-hospital COVID-19 mortality prediction models, using the search terms “coronavirus”, “COVID-19”, “risk”, “death”, “mortality”, and “prediction”, focusing on studies published between March 1, 2020, and August 31, 2021. The studies we identified generally used small to medium-sized patient cohorts that were geographically restricted to small regions of the developed world (often to a single city). We did not find studies that challenged their models in extended cohorts of patients from very distinct health systems, particularly from resource-limited countries. Further, most previous models are rigid in that they do not account for the fluctuating availability of hospital resources during the pandemic (e.g., beds, oxygen supply). These and other limitations have been pointed out by expert reviews indicating that published in-hospital COVID-19 mortality prediction models are subject to a high risk of bias, report over-optimistic performance, and have limited clinical value in assisting daily triage decisions. A parsimonious, accurate, and extensively validated model is yet to be developed.
Added value of this study
We analysed clinical data from different cohorts totalling 21 607 COVID-19 patients treated in more than 110 hospitals in Spain and the USA during three different pandemic waves extending from February 2020 to April 2021. The new CODOP in-hospital mortality prediction model is based on 11 blood biochemistry parameters (representing the main biological pathways involved in the pathogenesis of SARS-CoV-2) plus age, all of them commonly measured upon hospitalization. CODOP accurately predicted mortality risk up to nine days before clinical resolution (AUROC: 0·90-0·96, 95% CI 0·879-0·970), is well calibrated, and enables effective dynamic risk stratification during hospitalization. We offer two online CODOP calculator subtypes (https://gomezvarelalab.em.mpg.de/codop/) tailored to overtriage and undertriage scenarios. The online calculators reached the desired prediction performance in five independent evaluation cohorts gathered in hospitals of three Latin American countries from March 7, 2020, to June 7, 2021.
Implications of all the available evidence
We present a highly accurate, parsimonious, and extensively validated COVID-19 in-hospital mortality prediction model, derived from the largest and most geographically extensive representation of patients and health systems to date.
The rigorous analytical methods, the generalizability of the model across distinct world regions, and its flexibility to account for the changing availability of hospital resources point to CODOP as a clinically useful tool that could improve outcome prediction and the management of hospitalized COVID-19 patients.