Enhanced COVID-19 data for improved prediction of survival

This article has 1 evaluations Published on
Read the full article Related papers
This article on Sciety

Abstract

The current COVID-19 pandemic, caused by the rapid world-wide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus effects individuals quite differently, with many infected patients showing only mild symptoms, and others showing critical illness. To lessen the impact of the pandemic, one important question is which factors predict the death of a patient? Here, we construct an enhanced COVID-19 dataset by processing two existing databases (from Kaggle and WHO) and using natural language processing methods to enhance the data by adding local weather conditions and research sentiment.

Author summary

In this study, we contribute an enhanced COVID-19 dataset, which contains 183 samples and 43 features. Application of Extreme Gradient Boosting (XGBoost) on the enhanced dataset achieves 95% accuracy in predicting patients survival, with country-wise research sentiment, and then age and local weather, showing the most importance. All data and source code are available at <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="http://ab.inf.uni-tuebingen.de/publications/papers/COVID-19">http://ab.inf.uni-tuebingen.de/publications/papers/COVID-19</ext-link>.

Related articles

Related articles are currently not available for this article.