LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 01.02.2025 |
Subjects | |
DOI | 10.48550/arxiv.2502.01678 |
Summary: | Electroencephalogram (EEG) provides a non-invasive, highly accessible, and
cost-effective solution for Alzheimer's Disease (AD) detection. However,
existing methods, whether based on manual feature extraction or deep learning,
face two major challenges: the lack of large-scale datasets for robust feature
learning and evaluation, and poor detection performance due to inter-subject
variations. To address these challenges, we curate an EEG-AD corpus containing
813 subjects, which forms the world's largest EEG-AD dataset to the best of our
knowledge. Using this unique dataset, we propose LEAD, the first large
foundation model for EEG-based AD detection. Our method encompasses an entire
pipeline, from data selection and preprocessing to self-supervised contrastive
pretraining, fine-tuning, and key setups such as subject-independent evaluation
and majority voting for subject-level detection. We pre-train the model on 11
EEG datasets and fine-tune it jointly on 5 AD datasets. Our self-supervised
pre-training design includes sample-level and subject-level contrasting to
extract useful general EEG features. Fine-tuning is performed on 5
channel-aligned datasets together. The backbone encoder incorporates temporal
and channel embeddings to capture features across both temporal and spatial
dimensions. Our method demonstrates outstanding AD detection performance,
achieving up to a 9.86% increase in F1 score at the sample level and up to a
9.31% increase at the subject level compared to state-of-the-art methods. The results of
our model strongly confirm the effectiveness of contrastive pre-training and
channel-aligned unified fine-tuning for addressing inter-subject variation. The
source code is at https://github.com/DL4mHealth/LEAD. |
---|---|
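
The summary names two evaluation setups, subject-independent evaluation and majority voting for subject-level detection, without describing them further. The following is a minimal sketch of what these terms commonly mean, not the authors' implementation (which is in the linked repository). The function names `subject_independent_split` and `majority_vote`, the `test_fraction` and `seed` parameters, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def subject_independent_split(subject_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no subject appears in both partitions.

    Hypothetical helper: subjects (not samples) are shuffled and assigned
    to the test set, so every sample of a subject lands on the same side.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def majority_vote(subject_ids, sample_predictions):
    """Aggregate per-sample predicted labels into one label per subject.

    Tie handling is left to Counter.most_common and is an arbitrary
    choice of this sketch, not something stated in the summary.
    """
    votes = defaultdict(Counter)
    for subject, label in zip(subject_ids, sample_predictions):
        votes[subject][label] += 1
    return {s: counts.most_common(1)[0][0] for s, counts in votes.items()}


# Toy data: 8 EEG samples from 3 subjects, with assumed labels 1 = AD, 0 = control.
subject_ids = ["s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3"]
sample_predictions = [1, 1, 0, 0, 0, 1, 0, 1]

train_idx, test_idx = subject_independent_split(subject_ids)
subject_labels = majority_vote(subject_ids, sample_predictions)
```

Splitting by subject rather than by sample keeps segments from the same recording out of both partitions, which is the usual motivation for subject-independent evaluation when inter-subject variation is the concern.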