Agentic Generative Artificial Intelligence System for Classification of Pathology-Confirmed Primary Progressive Aphasia Variants
Abstract
Importance
Accurate clinical and pathological diagnoses are essential in neurodegenerative diseases, especially given the emergence of pathology-specific disease-modifying therapies. However, accurate diagnosis remains challenging because of heterogeneous clinical presentations, the complexity of integrating multimodal data, and limited access to multidisciplinary expertise. Primary Progressive Aphasia (PPA) exemplifies these challenges, as its diagnosis requires specialized clinical, neuropsychological, and imaging evaluations. Generative artificial intelligence (AI), powered by large language models, may offer scalable diagnostic support in this context.
Objective
To evaluate the diagnostic performance of an agentic generative AI system in classifying prototypical PPA cases by clinical syndrome and underlying pathology.
Design
Retrospective diagnostic validation study using a multi-agent generative AI architecture simulating expert-level reasoning.
Setting
Single tertiary academic referral center (University of California, San Francisco, Memory and Aging Center).
Participants
Fifty-four individuals with a definite diagnosis of PPA and post-mortem confirmation (18 semantic [svPPA], 17 logopenic [lvPPA], 19 nonfluent [nfvPPA]), selected as prototypical cases with congruent clinical, imaging, and pathological profiles.
Exposure
Multimodal input data, including clinical notes, neuropsychological and language assessments, and MRI brain images, were processed through a multi-agent architecture. The system generated diagnostic predictions under two conditions: (1) open-ended diagnosis from a set of 15 neurodegenerative clinical syndromes; (2) constrained classification of PPA variant and underlying neuropathology.
Main Outcomes and Measures
Diagnostic accuracy of the generative AI system for clinical syndrome and pathology, measured against expert clinical diagnoses and post-mortem confirmation as the gold standards.
Results
In the open-ended setting, the system correctly identified PPA in 49 of 54 cases (90.7%; chance level, 6.7%). When constrained to PPA, it achieved 100% accuracy for svPPA and nfvPPA, and 94.1% for lvPPA as the primary prediction. Neuropathological predictions were most accurate for FTLD-TDP type C (100%) and FTLD-4R tau (100%), and high for Alzheimer’s disease (94.4%). The full diagnostic pipeline for all 54 cases was completed in under 10 minutes.
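The reported percentages follow from simple counts, and the arithmetic can be checked with a minimal sketch such as the one below. The function names are hypothetical and are not part of the study's pipeline. The count of 16 correct lvPPA cases is an assumption inferred from the reported 94.1% and the 17 lvPPA participants; the abstract does not state it directly. The chance level assumes a uniform guess over the 15 candidate syndromes.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Return accuracy as a percentage, rounded to one decimal place."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    return round(100 * correct / total, 1)


def chance_level_pct(n_classes: int) -> float:
    """Accuracy expected from uniform random guessing over n_classes labels."""
    if n_classes <= 0:
        raise ValueError("n_classes must be positive")
    return round(100 / n_classes, 1)


# Open-ended condition: 49 of 54 cases identified as PPA (counts from the abstract).
open_ended = accuracy_pct(49, 54)

# Chance level for a choice among 15 clinical syndromes (uniform guessing assumed).
chance = chance_level_pct(15)

# Constrained condition, lvPPA: 16 of 17 is an inferred count, not a reported one.
lvppa = accuracy_pct(16, 17)

print(open_ended, chance, lvppa)
```

With these inputs the sketch reproduces the 90.7%, 6.7%, and 94.1% figures given above.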
Conclusions and Relevance
The AI system demonstrated expert-level performance in classifying prototypical PPA cases, integrating multimodal data and mirroring specialist reasoning. Its speed and accuracy support its potential role in extending access to specialized diagnostic expertise, particularly in non-tertiary settings. Further validation in larger and more heterogeneous populations is warranted.