Identifying Sequential Complication and Mortality Patterns in Diabetes Mellitus: Comparisons of Machine Learning Methodologies
Abstract
Background
Diabetes mellitus-related complications adversely affect the quality of life. Better risk-stratified care through mining of sequential complication patterns is needed to enable early detection and prevention.
Methods
Univariable and multivariate logistic regression was used to identify significant variables that can predict mortality. A sequence analysis method termed Prefixspan was applied to identify the most common couple, triple, quadruple, quintuple and sextuple sequential complication patterns in the directed comorbidity pathology network. A knowledge enhanced CPT+ (KCPT+) sequence prediction model is developed to predict the next possible outcome along the progression trajectories of diabetes-related complications.
Findings
A total of 14,144 diabetic patients (51% males) were included. Acute myocardial infarction (AMI) without known ischaemic heart disease (IHD) (odds ratio [OR]: 2.8, 95% CI: [2.3, 3.4]), peripheral vascular disease (OR: 2.3, 95% CI: [1.9, 2.8]), dementia (OR: 2.1, 95% CI: [1.8, 2.4]), and IHD with AMI (OR: 2.4, 95% CI: [2.1, 2.6]) are the most important multivariate predictors of mortality. KCPT+ shows high accuracy in predicting mortality (F1 score 0.90, ACU 0.88), osteoporosis (F1 score 0.86, AUC 0.82), ophthalmological complications (F1 score 0.82, AUC 0.82), IHD with AMI (F1 score 0.81, AUC 0.85) and neurological complications (F1 score 0.81, AUC 0.83) with a particular prior complication sequence.
Interpretation
Sequence analysis identifies the most common pattern characteristics of disease-related complications efficiently. The proposed sequence prediction model is accurate and enables clinicians to diagnose the next complication earlier, provide better risk-stratified care, and devise efficient treatment strategies for diabetes mellitus patients.
Related articles
Related articles are currently not available for this article.