Early Stage Prediction of US County Vulnerability to the COVID-19 Pandemic
Abstract
Importance
The rapid spread of COVID-19 means that government and health services providers have little time to plan and design effective response policies. It is therefore important to rapidly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread.
Objective
To develop county-level predictions of near-future COVID-19 occurrences using publicly available data.
Design
Original Investigation; Decision Analytical Model Study of county-level COVID-19 occurrences using data from March 14-31, 2020.
Setting
Disease spread prediction for US counties.
Participants
All US counties, described by data fused from multiple publicly available sources, including health statistics, demographics, and geographical features.
Exposure(s) (for observational studies)
Daily reported county-level COVID-19 occurrences from March 14-31, 2020.
Main Outcome(s) and Measure(s)
We developed a 3-stage model. The first stage estimates the probability of COVID-19 occurrence in unaffected counties using an XGBoost classifier. The second stage estimates the number of potential occurrences in a county using XGBoost regression. The third stage combines these results to compute county-level risk, which is then used as an estimate of the county's vulnerability five days ahead.
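The abstract does not state how the third stage combines the classifier and regressor outputs. The sketch below is a minimal illustration under one assumed rule: risk is taken as the occurrence probability multiplied by the predicted case count. The names `CountyPrediction`, `county_risk`, and `rank_by_risk`, and the placeholder county labels, are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CountyPrediction:
    county: str            # placeholder county label (hypothetical)
    p_occurrence: float    # stage 1: classifier probability of at least one case
    expected_cases: float  # stage 2: regressor estimate of the case count


def county_risk(pred: CountyPrediction) -> float:
    """Stage 3 under an ASSUMED rule: probability-weighted case count."""
    if not 0.0 <= pred.p_occurrence <= 1.0:
        raise ValueError("p_occurrence must lie in [0, 1]")
    # Clamp negative regression outputs to zero before weighting.
    return pred.p_occurrence * max(pred.expected_cases, 0.0)


def rank_by_risk(preds: List[CountyPrediction]) -> List[CountyPrediction]:
    """Order counties from highest to lowest assumed risk."""
    return sorted(preds, key=county_risk, reverse=True)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not results from the study.
    sample = [
        CountyPrediction("county_a", 0.25, 8.0),
        CountyPrediction("county_b", 0.5, 10.0),
    ]
    for p in rank_by_risk(sample):
        print(p.county, county_risk(p))
```

Other combination rules (for example, thresholding the probability before using the count) would fit the same interface; the product is chosen here only because it is the simplest.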
Results
Using data from March 14-31, 2020, the model shows a sensitivity over 71.5% and specificity over 94%.
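For reference, sensitivity and specificity are the standard true-positive and true-negative rates of a binary classifier. The sketch below shows how they could be computed from county-level labels; the function name and the toy labels are illustrative assumptions, not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = affected)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sensitivity: share of affected counties flagged as affected.
    # Specificity: share of unaffected counties flagged as unaffected.
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0]
    guess = [1, 1, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(truth, guess))
```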
Conclusions and Relevance
We found that population, population density, percentage of people aged 70 or older, and prevalence of comorbidities play an important role in predicting COVID-19 occurrences. We found a positive association between affected and urban counties, as well as between less vulnerable and rural counties. The developed model can be used to identify vulnerable counties and potential data discrepancies. Limited testing facilities and delayed results introduce significant variation in reported cases and bias the model.
Trial Registration
Not Applicable
Key Points
Question
What are key factors that define the vulnerability of counties in the US to cases of the COVID-19 virus?
Findings
In this epidemiological study based on publicly available data, we developed a model that predicts each US county's vulnerability to COVID-19 in two ways: the likelihood of going from no documented cases to at least one case within five days, and the number of occurrences of the virus.
Meaning
Predicting county vulnerability to COVID-19 can assist health organizations to better plan for resource and workforce needs.