Automating Epidemiology Report Generation from the MIMIC-IV Clinical Database using SNOMED CT and SQL

Abstract

Objective: To present a unified and modular framework for automating the epidemiological research process from cohort definition to analysis and visualization using the MIMIC-IV dataset.

Materials and Methods: We combined SNOMED CT ontologies, prompt-engineered SQL generation, and integration of structured and unstructured electronic health record data. Statistical summaries, logistic regression, and network-based co-word analyses were generated.

Results: The system successfully automated tasks such as cohort selection, ontology mapping, entity recognition, statistical analysis, and visualization. Applied to MIMIC-IV, the framework produced reproducible and interpretable epidemiological insights within hours, highlighting efficiency gains compared with manual workflows.

Discussion: Our approach demonstrates methodological advances by integrating knowledge engineering, NLP, and network analysis into a reproducible pipeline. The framework enables scalable, transparent, and efficient epidemiological research but remains limited by computational demands and variability in large language model–based SQL generation.

Conclusion: This modular pipeline illustrates a pathway toward automated, semantically grounded epidemiology reporting from EHRs, with potential applications in clinical and public health informatics.
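The abstract does not show how the pipeline is implemented. As a rough illustration of two steps it names, ontology-driven cohort selection via SQL and a co-occurrence count of the kind that could feed a network analysis, the following minimal sketch may help. It is not the authors' code. It assumes an in-memory SQLite table loosely modeled on the MIMIC-IV `diagnoses_icd` table, a small hand-written SNOMED CT to ICD-10 prefix mapping (`SNOMED_TO_ICD10`, hypothetical and illustrative only), and made-up toy rows; the function names are likewise invented for this example.

```python
import sqlite3
from collections import Counter
from itertools import combinations

# Hypothetical, hand-written mapping from SNOMED CT concept IDs to ICD-10
# code prefixes. A real pipeline would derive this from an ontology map.
SNOMED_TO_ICD10 = {
    "44054006": ["E11"],  # Type 2 diabetes mellitus
    "38341003": ["I10"],  # Hypertensive disorder
}


def build_cohort_sql(snomed_id):
    """Return a parameterized query and its parameters for one concept."""
    prefixes = SNOMED_TO_ICD10[snomed_id]
    clauses = " OR ".join("icd_code LIKE ?" for _ in prefixes)
    sql = (
        "SELECT DISTINCT subject_id FROM diagnoses_icd "
        f"WHERE icd_version = 10 AND ({clauses}) ORDER BY subject_id"
    )
    return sql, [prefix + "%" for prefix in prefixes]


def select_cohort(conn, snomed_id):
    """Return the sorted patient IDs whose diagnoses match the concept."""
    sql, params = build_cohort_sql(snomed_id)
    return [row[0] for row in conn.execute(sql, params)]


def cooccurrence(conn, subject_ids):
    """Count pairs of distinct ICD codes recorded for the same patient."""
    counts = Counter()
    for sid in subject_ids:
        rows = conn.execute(
            "SELECT icd_code FROM diagnoses_icd WHERE subject_id = ?", (sid,)
        )
        codes = sorted({row[0] for row in rows})
        counts.update(combinations(codes, 2))
    return counts


def demo_connection():
    """Build a toy database; the rows are invented, not MIMIC-IV data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE diagnoses_icd ("
        "subject_id INTEGER, hadm_id INTEGER, "
        "icd_code TEXT, icd_version INTEGER)"
    )
    conn.executemany(
        "INSERT INTO diagnoses_icd VALUES (?, ?, ?, ?)",
        [
            (1, 100, "E119", 10), (1, 100, "I10", 10),
            (2, 200, "I10", 10),
            (3, 300, "E1165", 10), (3, 300, "I10", 10), (3, 300, "N179", 10),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    cohort = select_cohort(conn, "44054006")
    print("cohort:", cohort)
    for pair, n in sorted(cooccurrence(conn, cohort).items()):
        print(pair, n)
```

Building the query from a fixed template with bound parameters, rather than free-form generated text, is one way to limit the SQL variability the Discussion mentions; whether the framework does this is not stated in the abstract.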
