Natural language processing in Alzheimer's disease research: Systematic review of methods, data, and efficacy
Peer reviewed, Journal article
Published version

Date
2025
Original version
Shakeri, A., & Farmanbar, M. (2025). Natural language processing in Alzheimer's disease research: Systematic review of methods, data, and efficacy. Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring, 17(1), e70082. 10.1002/dad2.70082
Abstract
INTRODUCTION
Alzheimer's disease (AD) prevalence is increasing, with no current cure. Natural language processing (NLP) offers the potential for non-invasive diagnostics, social burden assessment, and research advancements in AD.
METHOD
A systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines explored NLP applications in AD, focusing on dataset types, sources, research foci, methods, and effectiveness. Searches were conducted across six databases (ACM, Embase, IEEE, PubMed, Scopus, and Web of Science) from January 2020 to July 2024.
RESULTS
Of 1740 records, 79 studies were selected. The most frequently used datasets were speech and electronic health records (EHR), followed by social media and scientific publications. Machine learning and neural networks were primarily applied to speech, EHR, and social media data, while rule-based methods were used to analyze literature datasets.
DISCUSSION
NLP has proven effective in various aspects of AD research, including diagnosis, monitoring, social burden assessment, biomarker analysis, and research. However, there are opportunities for improvement in dataset diversity, model interpretability, multilingual capabilities, and addressing ethical concerns.
Highlights
This review systematically analyzed 79 studies from six major databases, focusing on the advancements and applications of natural language processing (NLP) in Alzheimer's disease (AD) research.
The study highlights the need for models focusing on remote monitoring of AD patients using speech analysis, offering a cost-effective alternative to traditional methods such as brain imaging and aiding clinicians in both prediagnosis and post-diagnosis periods.
The use of pretrained multilingual models is recommended to improve AD detection across different languages by leveraging diverse speech features and utilizing publicly available datasets.