Add pipeline tag and sample usage

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +171 -190
README.md CHANGED
@@ -1,29 +1,19 @@
1
  ---
2
- license: cc-by-nd-4.0
3
- language:
4
- - en
5
- library_name: pytorch
6
- tags:
7
- - eeg
8
- - biosignal
9
- - mamba
10
- - state-space-model
11
- - cross-attention
12
- - foundation-model
13
- - self-supervised
14
- - masked-modeling
15
- - lejepa
16
- - topology-invariant
17
- - neuroscience
18
  datasets:
19
  - TUEG
20
  - TUAB
21
  - APAVA
22
  - TDBrain
23
  - MoBI
24
  - SEED-V
25
  - Mumtaz2016
26
  - MODMA
27
  metrics:
28
  - balanced_accuracy
29
  - roc_auc
@@ -31,140 +21,152 @@ metrics:
31
  - r2
32
  - pearson_r
33
  - cohen_kappa
34
  thumbnail: https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png
35
  model-index:
36
- - name: LuMamba-Tiny (LeJEPA-reconstruction pre-training)
37
- results:
38
- - task:
39
- type: time-series-classification
40
- name: EEG Abnormality Detection
41
- dataset:
42
- type: TUAB
43
- name: TUH EEG Abnormal Corpus (TUAB)
44
- metrics:
45
- - type: balanced_accuracy
46
- value: 80.99
47
- name: Balanced Accuracy (%)
48
- - type: roc_auc
49
- value: 0.883
50
- name: AUROC
51
- - type: pr_auc
52
- value: 0.892
53
- name: AUC-PR
54
- - task:
55
- type: time-series-classification
56
- name: Alzheimer's Disease Detection
57
- dataset:
58
- type: APAVA
59
- name: APAVA
60
- metrics:
61
- - type: roc_auc
62
- value: 0.955
63
- name: AUROC
64
- - type: pr_auc
65
- value: 0.970
66
- name: AUC-PR
67
- - task:
68
- type: time-series-classification
69
- name: Parkinson's Disease Detection
70
- dataset:
71
- type: TDBrain
72
- name: TDBrain
73
- metrics:
74
- - type: roc_auc
75
- value: 0.961
76
- name: AUROC
77
- - type: pr_auc
78
- value: 0.960
79
- name: AUC-PR
80
- - task:
81
- type: time-series-classification
82
- name: Major Depressive Disorder Detection
83
- dataset:
84
- type: Mumtaz2016
85
- name: Mumtaz2016
86
- metrics:
87
- - type: roc_auc
88
- value: 0.931
89
- name: AUROC
90
- - type: pr_auc
91
- value: 0.952
92
- name: AUC-PR
93
- - name: LuMamba-Tiny (Reconstruction-only pre-training)
94
- results:
95
- - task:
96
- type: time-series-classification
97
- name: EEG Slowing Event and Seizure Detection
98
- dataset:
99
- type: TUSL
100
- name: TUH EEG Slowing Corpus (TUSL)
101
- metrics:
102
- - type: roc_auc
103
- value: 0.708
104
- name: AUROC
105
- - type: pr_auc
106
- value: 0.289
107
- name: AUC-PR
108
- - task:
109
- type: time-series-classification
110
- name: EEG Artifact Detection
111
- dataset:
112
- type: TUAR
113
- name: TUH EEG Artifact Corpus (TUAR)
114
- metrics:
115
- - type: roc_auc
116
- value: 0.914
117
- name: AUROC
118
- - type: pr_auc
119
- value: 0.510
120
- name: AUC-PR
121
- - task:
122
- type: time-series-classification
123
- name: Gait Prediction Regression
124
- dataset:
125
- type: MoBI
126
- name: MoBI
127
- metrics:
128
- - type: r2
129
- value: 0.116
130
- name: R-squared
131
- - type: rmse
132
- value: 0.1482
133
- name: Root Mean Squared Error
134
- - task:
135
- type: time-series-classification
136
- name: 5-class Emotion Detection
137
- dataset:
138
- type: SEED-V
139
- name: SEED-V
140
- metrics:
141
- - type: balanced_accuracy
142
- value: 35.0
143
- name: Balanced Accuracy (%)
144
- - type: cohen_kappa
145
- value: 0.191
146
- name: Cohen's Kappa
147
- - task:
148
- type: time-series-classification
149
- name: Major Depressive Disorder Detection
150
- dataset:
151
- type: MODMA
152
- name: MODMA
153
- metrics:
154
- - type: balanced_accuracy
155
- value: 59.5
156
- name: Balanced Accuracy (%)
157
- - type: roc_auc
158
- value: 0.448
159
- name: AUROC
160
- - type: pr_auc
161
- value: 0.420
162
- name: AUC-PR
163
  ---
 
164
  <div align="center">
165
  <img src="https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png" alt="LuMamba Logo" width="800"/>
166
- <h1>LuMamba: Latent Unified Mamba for Electrode
167
- Topology-Invariant and Efficient EEG Modeling</h1>
168
  </div>
169
  <p align="center">
170
  <a href="https://github.com/pulp-bio/BioFoundation">
@@ -184,6 +186,27 @@ LuMamba addresses varying channel layouts with **LUNA channel unification**, pro
184
 
185
  ---
186
 
187
  ## 🔒 License & Usage Policy (Weights)
188
 
189
  **Weights license:** The released model weights are licensed under **Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0)**. This section summarizes the practical implications for users. *This is not legal advice; please read the full license text.*
@@ -214,7 +237,7 @@ We welcome community improvements via a **pull-request (PR)** workflow. If you b
214
  - **Goal:** Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
215
  - **Core idea:** **Channel-Unification Module** uses **learned queries** (Q) with **cross-attention** to map any set of channels to a fixed latent space. **bidirectional Mamba blocks** then operate on that latent sequence.
216
  - **Pre-training data:** TUEG, **>21,000 hours** of raw EEG; downstream subjects removed to avoid leakage.
217
- - **Downstream tasks:** **TUAB** (abnormal), **TUAR** (artifacts), **TUSL** (slowing), **SEED-V** (emotion; unseen 62-ch montage), **APAVA** (Alzheimer's disease; unseen 16-ch layout, **TDBrain** (Parkinson's disease; unseen 26-ch layout)
218
 
219
  ---
220
 
@@ -271,22 +294,6 @@ Larger model sizes can be attained by increasing the number of bi-Mamba blocks `
271
 
272
  ---
273
 
274
- ## 🔧 How to Use
275
-
276
- LuMamba weights are organized by pre-training configuration:
277
-
278
- - **`Reconstruction-only`** → variants pre-trained with masked reconstruction exclusively
279
- - **`LeJEPA-reconstruction`** → variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses. Variants exist for two different LeJEPA hyperparameters: 128 and 300 projection slices.
280
- - **`LeJEPA-only`** → variant pre-trained with LeJEPA exclusively.
281
-
282
- All variants are pre-trained on TUEG.
283
-
284
- LuMamba experiments are categorized by two Hydra configurations, in `BioFoundation/config/experiments`:
285
- - **`LuMamba_finetune.yaml`** → configuration for fine-tuning experiments.
286
- - **`LuMamba_pretrain.yaml`** → configuration for pre-training experiments.
287
-
288
- ---
289
-
290
  ## 🔧 Fine-tuning — General Checklist
291
 
292
  0. **Install & read data prep**: clone the [BioFoundation repo](https://github.com/pulp-bio/BioFoundation), set up the environment as described there, then open `make_datasets/README.md` for dataset-specific notes (naming, expected folder layout, and common pitfalls).
@@ -305,13 +312,6 @@ LuMamba experiments are categorized by two Hydra configurations, in `BioFoundati
305
  6. **Trainer/optimizer**: adjust `gpus/devices`, `batch_size`, `max_epochs`, LR/scheduler if needed.
306
  7. **I/O**: set `io.base_output_path` and confirm `io.checkpoint_dirpath` exists.
307
 
308
-
309
- To launch fine-tuning (Hydra):
310
-
311
- ```bash
312
- python -u run_train.py +experiment=LuMamba_finetune
313
- ```
314
-
315
  ---
316
 
317
  ## ⚖️ Responsible AI, Risks & Biases
@@ -325,7 +325,7 @@ python -u run_train.py +experiment=LuMamba_finetune
325
  ## 🔗 Sources
326
 
327
  - **Code:** https://github.com/pulp-bio/BioFoundation
328
- - **Paper:** LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling (arxiv:2603.19100)
329
 
330
  ---
331
 
@@ -343,23 +343,4 @@ If you use LuMamba, please cite:
343
  primaryClass={cs.AI},
344
  url={https://arxiv.org/abs/2603.19100},
345
  }
346
- ```
347
-
348
- ---
349
-
350
- ## 🛠️ Maintenance & Contact
351
-
352
- - **Issues & support:** please open a GitHub issue in the BioFoundation repository.
353
-
354
- ---
355
- ---
356
-
357
- ## 🔗 Related Models
358
-
359
- - **[LUNA](https://huggingface.co/PulpBio/LUNA)** — Transformer-based topology-agnostic EEG foundation model (NeurIPS 2025). Source of the channel-unification cross-attention module that LuMamba reuses.
360
- - **[FEMBA](https://huggingface.co/PulpBio/FEMBA)** — Bidirectional Mamba foundation model for EEG. Source of the linear-complexity temporal backbone that LuMamba reuses.
361
- - **[TinyMyo](https://huggingface.co/PulpBio/TinyMyo)** — Tiny foundation model for flexible EMG signal processing at the edge.
362
-
363
- ## 🗒️ Changelog
364
-
365
- - **v1.0:** Initial release of LuMamba model card with task-specific checkpoints and instructions.
 
1
  ---
2
  datasets:
3
  - TUEG
4
  - TUAB
5
+ - TUSL
6
+ - TUAR
7
  - APAVA
8
  - TDBrain
9
  - MoBI
10
  - SEED-V
11
  - Mumtaz2016
12
  - MODMA
13
+ language:
14
+ - en
15
+ library_name: pytorch
16
+ license: cc-by-nd-4.0
17
  metrics:
18
  - balanced_accuracy
19
  - roc_auc
 
21
  - r2
22
  - pearson_r
23
  - cohen_kappa
24
+ - rmse
25
+ pipeline_tag: other
26
+ tags:
27
+ - eeg
28
+ - biosignal
29
+ - mamba
30
+ - state-space-model
31
+ - cross-attention
32
+ - foundation-model
33
+ - self-supervised
34
+ - masked-modeling
35
+ - lejepa
36
+ - topology-invariant
37
+ - neuroscience
38
  thumbnail: https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png
39
  model-index:
40
+ - name: LuMamba-Tiny (Reconstruction-only pre-training)
41
+ results:
42
+ - task:
43
+ type: time-series-classification
44
+ name: EEG Abnormality Detection
45
+ dataset:
46
+ name: TUH EEG Abnormal Corpus (TUAB)
47
+ type: TUAB
48
+ metrics:
49
+ - type: balanced_accuracy
50
+ value: 80.99
51
+ name: Balanced Accuracy (%)
52
+ - type: roc_auc
53
+ value: 0.883
54
+ name: AUROC
55
+ - type: pr_auc
56
+ value: 0.892
57
+ name: AUC-PR
58
+ - task:
59
+ type: time-series-classification
60
+ name: Alzheimer's Disease Detection
61
+ dataset:
62
+ name: APAVA
63
+ type: APAVA
64
+ metrics:
65
+ - type: roc_auc
66
+ value: 0.955
67
+ name: AUROC
68
+ - type: pr_auc
69
+ value: 0.97
70
+ name: AUC-PR
71
+ - task:
72
+ type: time-series-classification
73
+ name: Parkinson's Disease Detection
74
+ dataset:
75
+ name: TDBrain
76
+ type: TDBrain
77
+ metrics:
78
+ - type: roc_auc
79
+ value: 0.961
80
+ name: AUROC
81
+ - type: pr_auc
82
+ value: 0.96
83
+ name: AUC-PR
84
+ - task:
85
+ type: time-series-classification
86
+ name: Major Depressive Disorder Detection
87
+ dataset:
88
+ name: Mumtaz2016
89
+ type: Mumtaz2016
90
+ metrics:
91
+ - type: roc_auc
92
+ value: 0.931
93
+ name: AUROC
94
+ - type: pr_auc
95
+ value: 0.952
96
+ name: AUC-PR
97
+ - task:
98
+ type: time-series-classification
99
+ name: EEG Slowing Event and Seizure Detection
100
+ dataset:
101
+ name: TUH EEG Slowing Corpus (TUSL)
102
+ type: TUSL
103
+ metrics:
104
+ - type: roc_auc
105
+ value: 0.708
106
+ name: AUROC
107
+ - type: pr_auc
108
+ value: 0.289
109
+ name: AUC-PR
110
+ - task:
111
+ type: time-series-classification
112
+ name: EEG Artifact Detection
113
+ dataset:
114
+ name: TUH EEG Artifact Corpus (TUAR)
115
+ type: TUAR
116
+ metrics:
117
+ - type: roc_auc
118
+ value: 0.914
119
+ name: AUROC
120
+ - type: pr_auc
121
+ value: 0.51
122
+ name: AUC-PR
123
+ - task:
124
+ type: time-series-classification
125
+ name: Gait Prediction Regression
126
+ dataset:
127
+ name: MoBI
128
+ type: MoBI
129
+ metrics:
130
+ - type: r2
131
+ value: 0.116
132
+ name: R-squared
133
+ - type: rmse
134
+ value: 0.1482
135
+ name: Root Mean Squared Error
136
+ - task:
137
+ type: time-series-classification
138
+ name: 5-class Emotion Detection
139
+ dataset:
140
+ name: SEED-V
141
+ type: SEED-V
142
+ metrics:
143
+ - type: balanced_accuracy
144
+ value: 35.0
145
+ name: Balanced Accuracy (%)
146
+ - type: cohen_kappa
147
+ value: 0.191
148
+ name: Cohen's Kappa
149
+ - task:
150
+ type: time-series-classification
151
+ name: Major Depressive Disorder Detection
152
+ dataset:
153
+ name: MODMA
154
+ type: MODMA
155
+ metrics:
156
+ - type: balanced_accuracy
157
+ value: 59.5
158
+ name: Balanced Accuracy (%)
159
+ - type: roc_auc
160
+ value: 0.448
161
+ name: AUROC
162
+ - type: pr_auc
163
+ value: 0.42
164
+ name: AUC-PR
 
 
165
  ---
166
+
167
  <div align="center">
168
  <img src="https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png" alt="LuMamba Logo" width="800"/>
169
+ <h1>LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling</h1>
 
170
  </div>
171
  <p align="center">
172
  <a href="https://github.com/pulp-bio/BioFoundation">
 
186
 
187
  ---
188
 
189
+ ## 🔧 Sample Usage
190
+
191
+ ### Download Weights
192
+ You can download all pre-trained variants and safetensors programmatically using `huggingface_hub`:
193
+
194
+ ```python
195
+ from huggingface_hub import snapshot_download
196
+
197
+ # downloads all pre-trained variants and safetensors into ./checkpoints/LuMamba
198
+ snapshot_download(repo_id="PulpBio/LuMamba", repo_type="model", local_dir="checkpoints/LuMamba")
199
+ ```
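Once the snapshot is on disk, the checkpoint files can be located with the standard library. The following is a minimal sketch, assuming the weights end in `.safetensors` and sit somewhere under the download directory; the helper name `find_safetensors` is hypothetical and not part of the BioFoundation code.

```python
from pathlib import Path


def find_safetensors(root: str) -> list[Path]:
    """Return every .safetensors file under `root`, sorted, as absolute paths.

    Hypothetical helper for illustration only; it is not part of BioFoundation.
    """
    base = Path(root)
    if not base.is_dir():
        raise FileNotFoundError(f"download directory not found: {base}")
    # rglob walks subfolders too, since variants may be grouped by pre-training setup.
    return sorted(p.resolve() for p in base.rglob("*.safetensors"))
```

Calling `find_safetensors("checkpoints/LuMamba")` after the download returns absolute paths, which is the form the fine-tuning command expects for its checkpoint argument.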
200
+
201
+ ### Fine-tuning
202
+ Pass the safetensors checkpoint path as a Hydra override and launch fine-tuning from the command line:
203
+ ```bash
204
+ python -u run_train.py +experiment=LuMamba_finetune \
205
+ pretrained_safetensors_path=/absolute/path/to/checkpoints/LuMamba/LuMamba.safetensors
206
+ ```
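Because the command takes an absolute checkpoint path, it can be convenient to assemble it from a relative one. The sketch below does only that: it resolves the path and builds the same argument list as the shell command above. The function name `build_finetune_command` is hypothetical, and the checkpoint file name is the placeholder used in the command, not a confirmed file in the repository.

```python
import shlex
from pathlib import Path


def build_finetune_command(checkpoint: str) -> list[str]:
    """Build the fine-tuning command with an absolute checkpoint path.

    Hypothetical helper; it mirrors the shell command and adds no new options.
    """
    absolute = Path(checkpoint).expanduser().resolve()
    return [
        "python", "-u", "run_train.py",
        "+experiment=LuMamba_finetune",
        f"pretrained_safetensors_path={absolute}",
    ]


# Placeholder file name taken from the command above.
command = build_finetune_command("checkpoints/LuMamba/LuMamba.safetensors")
print(shlex.join(command))  # shell-quoted form, ready to paste into a terminal
```

The list form can also be passed directly to `subprocess.run` from the root of the BioFoundation checkout, which avoids shell quoting issues when the path contains spaces.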
207
+
208
+ ---
209
+
210
  ## 🔒 License & Usage Policy (Weights)
211
 
212
  **Weights license:** The released model weights are licensed under **Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0)**. This section summarizes the practical implications for users. *This is not legal advice; please read the full license text.*
 
237
  - **Goal:** Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
238
  - **Core idea:** **Channel-Unification Module** uses **learned queries** (Q) with **cross-attention** to map any set of channels to a fixed latent space. **bidirectional Mamba blocks** then operate on that latent sequence.
239
  - **Pre-training data:** TUEG, **>21,000 hours** of raw EEG; downstream subjects removed to avoid leakage.
240
+ - **Downstream tasks:** **TUAB** (abnormal), **TUAR** (artifacts), **TUSL** (slowing), **SEED-V** (emotion; unseen 62-ch montage), **APAVA** (Alzheimer's disease; unseen 16-ch layout), **TDBrain** (Parkinson's disease; unseen 26-ch layout)
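To make the core idea concrete, here is a toy single-head cross-attention in plain Python. It is a sketch under simplifying assumptions, not the LuMamba implementation: the queries are random rather than learned, the key/value projections are omitted, and the sizes (`dim = 8`, `n_queries = 4`) are illustrative. It shows only the property the bullet describes: the output size depends on the number of queries, not on the number of input channels.

```python
import math
import random


def cross_attend(queries, channels):
    """Single-head cross-attention: each query attends over all channels.

    queries:  n_queries vectors of length d (stand-ins for learned queries)
    channels: n_channels vectors of length d, used as both keys and values
    returns:  n_queries vectors of length d
    """
    d = len(queries[0])
    output = []
    for q in queries:
        # Scaled dot-product score of this query against every channel.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in channels]
        # Numerically stable softmax over the channel axis.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # The weighted sum of channel vectors gives one latent vector per query.
        output.append([sum(w * v[j] for w, v in zip(weights, channels)) for j in range(d)])
    return output


rng = random.Random(0)
dim, n_queries = 8, 4  # illustrative sizes, not LuMamba's actual hyperparameters
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_queries)]

for n_channels in (16, 26, 62):  # channel counts of the unseen layouts listed above
    channels = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_channels)]
    latent = cross_attend(queries, channels)
    print(n_channels, "channels ->", len(latent), "x", len(latent[0]))
```

Every iteration yields a latent of the same shape (here 4 vectors of length 8), which is what lets the downstream sequence blocks stay unchanged across montages.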
241
 
242
  ---
243
 
 
294
 
295
  ---
296
 
297
  ## 🔧 Fine-tuning — General Checklist
298
 
299
  0. **Install & read data prep**: clone the [BioFoundation repo](https://github.com/pulp-bio/BioFoundation), set up the environment as described there, then open `make_datasets/README.md` for dataset-specific notes (naming, expected folder layout, and common pitfalls).
 
312
  6. **Trainer/optimizer**: adjust `gpus/devices`, `batch_size`, `max_epochs`, LR/scheduler if needed.
313
  7. **I/O**: set `io.base_output_path` and confirm `io.checkpoint_dirpath` exists.
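The I/O step can be checked before launching a run. A minimal sketch, assuming you copy the values of `io.base_output_path` and `io.checkpoint_dirpath` from your config by hand; the helper `missing_dirs` is hypothetical and does not read the Hydra configuration itself.

```python
from pathlib import Path


def missing_dirs(*paths: str) -> list[str]:
    """Return the subset of `paths` that are not existing directories."""
    return [p for p in paths if not Path(p).is_dir()]
```

An empty result means both directories exist; otherwise the returned entries are the ones to create or correct before training.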
314
 
315
  ---
316
 
317
  ## ⚖️ Responsible AI, Risks & Biases
 
325
  ## 🔗 Sources
326
 
327
  - **Code:** https://github.com/pulp-bio/BioFoundation
328
+ - **Paper:** [LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling](https://arxiv.org/abs/2603.19100)
329
 
330
  ---
331
 
 
343
  primaryClass={cs.AI},
344
  url={https://arxiv.org/abs/2603.19100},
345
  }
346
+ ```