bruAristimunha committed
Commit 519a9ea · verified · 1 Parent(s): a22ba3d

Replace with clean markdown card

Files changed (1):
  1. README.md +34 -174
README.md CHANGED
@@ -16,11 +16,10 @@ tags:
 
 CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.
 
- > **Architecture-only repository.** This repo documents the
 > `braindecode.models.CodeBrain` class. **No pretrained weights are
- > distributed here** instantiate the model and train it on your own
- > data, or fine-tune from a published foundation-model checkpoint
- > separately.
 
 ## Quick start
 
@@ -39,187 +38,48 @@ model = CodeBrain(
 )
 ```
 
- The signal-shape arguments above are example defaults — adjust them
- to match your recording.
 
 ## Documentation
-
- - Full API reference (parameters, references, architecture figure):
-   <https://braindecode.org/stable/generated/braindecode.models.CodeBrain.html>
- - Interactive browser with live instantiation:
   <https://huggingface.co/spaces/braindecode/model-explorer>
 - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/codebrain.py#L21>
 
- ## Architecture description
-
- The block below is the rendered class docstring (parameters,
- references, architecture figure where available).
-
- <div class='bd-doc'><main>
- <p>CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.</p>
- <span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#d9534f;color:white;font-size:11px;font-weight:600;margin-right:4px;">Foundation Model</span><span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#56B4E9;color:white;font-size:11px;font-weight:600;margin-right:4px;">Attention/Transformer</span>
-
- .. figure:: https://raw.githubusercontent.com/jingyingma01/CodeBrain/refs/heads/main/assets/intro.png
-    :align: center
-    :alt: CodeBrain pre-training overview
-    :width: 1000px
-
- CodeBrain is a foundation model for EEG that pre-trains on large unlabelled
- corpora using a two-stage vector-quantised masking strategy, then fine-tunes
- on downstream BCI tasks. It segments EEG signals into fixed-size patches,
- embeds them with convolutional and spectral projections, and processes them
- through stacked residual blocks that combine a multi-scale convolutional
- structured state-space model (``_GConv``) with sliding-window self-attention.
-
- .. rubric:: Stage 2: EEGSSM Backbone (this implementation)
-
- This class implements Stage 2 of CodeBrain — the EEGSSM backbone described
- in Section 3.3 of [codebrain]_. Following :class:`Labram`, CodeBrain
- discretises EEG patches into codebook tokens via VQ-VAE (Stage 1, not
- implemented here), then trains the backbone to predict masked token indices
- via cross-entropy. CodeBrain extends this with a *dual* tokenizer that
- decouples temporal and frequency representations, as stated in the paper:
- *"the TFDual-Tokenizer, which decouples heterogeneous temporal and frequency
- EEG signals into discrete tokens to enhance discriminative power."*
-
- .. rubric:: Macro Components
-
- - **PatchEmbedding**: Splits ``(batch, n_chans, n_times)`` into
-   ``(batch, n_chans, seq_len, patch_size)`` patches, projects each patch
-   with a 2-D convolutional stack, adds FFT-based spectral embeddings, and
-   applies depth-wise convolutional positional encoding.
- - **Residual blocks** (``ResidualGroup``): Each block applies RMSNorm,
-   a ``_GConv`` SSM layer, and sliding-window multi-head attention, with
-   gated activation and separate residual/skip paths.
- - **Classification head** (``final_layer``): Flattens the output and maps
-   to ``n_outputs`` classes.
-
- .. important::
-    **Pre-trained Weights Available**
-
-    This model has pre-trained weights available on the Hugging Face Hub.
-    You can load them using:
-
-    .. code:: python
-
-       from braindecode.models import CodeBrain
-
-       # Load pre-trained model from Hugging Face Hub
-       model = CodeBrain.from_pretrained("braindecode/codebrain-pretrained")
-
-    To push your own trained model to the Hub:
-
-    .. code:: python
-
-       model.push_to_hub("my-username/my-codebrain")
-
- Parameters
- ----------
- patch_size : int, default=200
-     Number of time samples per patch. Input length is trimmed to the
-     nearest multiple of ``patch_size``.
- res_channels : int, default=200
-     Width of the residual stream inside each ``ResidualBlock``.
- skip_channels : int, default=200
-     Width of the skip-connection stream aggregated across blocks.
- out_channels : int, default=200
-     Output channels of ``final_conv`` before the classification head.
- num_res_layers : int, default=8
-     Number of stacked ``ResidualBlock`` modules.
- drop_prob : float, default=0.1
-     Dropout rate used inside the ``_GConv`` SSM and attention layers.
- s4_bidirectional : bool, default=True
-     Whether the ``_GConv`` SSM processes the sequence bidirectionally.
- s4_layernorm : bool, default=False
-     Whether to apply layer normalisation inside the ``_GConv`` SSM.
-     Set to ``False`` to match the released pretrained checkpoint.
- s4_lmax : int, default=570
-     Maximum sequence length for the ``_GConv`` SSM kernel. Also determines
-     the patch embedding dimension as ``s4_lmax // n_chans``.
- s4_d_state : int, default=64
-     State dimension of the ``_GConv`` SSM.
- conv_out_chans : int, default=25
-     Number of output channels in the patch projection convolutions.
- conv_groups : int, default=5
-     Number of groups for ``GroupNorm`` in the patch projection.
- activation : type[nn.Module], default=nn.ReLU
-     Non-linear activation class used in ``init_conv`` and ``final_conv``.
 
- References
- ----------
- .. [codebrain] Yi Ding, Xuyang Chen, Yong Li, Rui Yan, Tao Wang, Le Wu (2025).
-    CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.
-    https://arxiv.org/abs/2506.09110
-
- .. rubric:: Hugging Face Hub integration
-
- When the optional ``huggingface_hub`` package is installed, all models
- automatically gain the ability to be pushed to and loaded from the
- Hugging Face Hub. Install with::
-
-    pip install braindecode[hub]
-
- **Pushing a model to the Hub:**
-
- .. code::
-
-    from braindecode.models import CodeBrain
-
-    # Train your model
-    model = CodeBrain(n_chans=22, n_outputs=4, n_times=1000)
-    # ... training code ...
-
-    # Push to the Hub
-    model.push_to_hub(
-        repo_id="username/my-codebrain-model",
-        commit_message="Initial model upload",
-    )
-
- **Loading a model from the Hub:**
-
- .. code::
-
-    from braindecode.models import CodeBrain
-
-    # Load pretrained model
-    model = CodeBrain.from_pretrained("username/my-codebrain-model")
-
-    # Load with a different number of outputs (head is rebuilt automatically)
-    model = CodeBrain.from_pretrained("username/my-codebrain-model", n_outputs=4)
-
- **Extracting features and replacing the head:**
-
- .. code::
-
-    import torch
-
-    x = torch.randn(1, model.n_chans, model.n_times)
-    # Extract encoder features (consistent dict across all models)
-    out = model(x, return_features=True)
-    features = out["features"]
 
-    # Replace the classification head
-    model.reset_head(n_outputs=10)
 
- **Saving and restoring full configuration:**
-
- .. code::
-
-    import json
-
-    config = model.get_config() # all __init__ params
-    with open("config.json", "w") as f:
-        json.dump(config, f)
-
-    model2 = CodeBrain.from_config(config) # reconstruct (no weights)
-
- All model parameters (both EEG-specific and model-specific such as
- dropout rates, activation functions, number of filters) are automatically
- saved to the Hub and restored when loading.
-
- See :ref:`load-pretrained-models` for a complete tutorial.</main>
- </div>
 
 ## Citation
 
- Please cite both the original paper for this architecture (see the
- *References* section above) and braindecode:
 
 ```bibtex
 @article{aristimunha2025braindecode,
 
 
 CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.
 
+ > **Architecture-only repository.** Documents the
 > `braindecode.models.CodeBrain` class. **No pretrained weights are
+ > distributed here.** Instantiate the model and train it on your own
+ > data.
 
 ## Quick start
 
 )
 ```
 
+ The signal-shape arguments above are illustrative defaults — adjust to
+ match your recording.
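
As a rule of thumb, `n_times` is just the sampling rate in Hz multiplied by the window length in seconds. The helper below is a hypothetical sketch for this card, not a braindecode function:

```python
# Hypothetical helper (not part of braindecode): n_times is simply
# sampling rate (Hz) x window length (s), rounded to an integer.
def n_times_for(sfreq_hz: float, window_s: float) -> int:
    return int(round(sfreq_hz * window_s))

# A 4 s window at 250 Hz gives n_times=1000.
print(n_times_for(250, 4))
```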
 
 ## Documentation
+ - Full API reference: <https://braindecode.org/stable/generated/braindecode.models.CodeBrain.html>
+ - Interactive browser (live instantiation, parameter counts):
   <https://huggingface.co/spaces/braindecode/model-explorer>
 - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/codebrain.py#L21>
 
+ ## Architecture
+
+ ![CodeBrain architecture](https://raw.githubusercontent.com/jingyingma01/CodeBrain/refs/heads/main/assets/intro.png)
+
+ ## Parameters
+
+ | Parameter | Type | Description |
+ |---|---|---|
+ | `patch_size` | int, default=200 | Number of time samples per patch. Input length is trimmed to the nearest multiple of `patch_size`. |
+ | `res_channels` | int, default=200 | Width of the residual stream inside each `ResidualBlock`. |
+ | `skip_channels` | int, default=200 | Width of the skip-connection stream aggregated across blocks. |
+ | `out_channels` | int, default=200 | Output channels of `final_conv` before the classification head. |
+ | `num_res_layers` | int, default=8 | Number of stacked `ResidualBlock` modules. |
+ | `drop_prob` | float, default=0.1 | Dropout rate used inside the `_GConv` SSM and attention layers. |
+ | `s4_bidirectional` | bool, default=True | Whether the `_GConv` SSM processes the sequence bidirectionally. |
+ | `s4_layernorm` | bool, default=False | Whether to apply layer normalisation inside the `_GConv` SSM. Set to `False` to match the released pretrained checkpoint. |
+ | `s4_lmax` | int, default=570 | Maximum sequence length for the `_GConv` SSM kernel. Also determines the patch embedding dimension as `s4_lmax // n_chans`. |
+ | `s4_d_state` | int, default=64 | State dimension of the `_GConv` SSM. |
+ | `conv_out_chans` | int, default=25 | Number of output channels in the patch projection convolutions. |
+ | `conv_groups` | int, default=5 | Number of groups for `GroupNorm` in the patch projection. |
+ | `activation` | type[nn.Module], default=nn.ReLU | Non-linear activation class used in `init_conv` and `final_conv`. |
+
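
Two defaults in the table interact with the input shape: the input length is trimmed by `patch_size`, and the patch embedding dimension is `s4_lmax // n_chans`. A minimal sketch of those two rules, assuming "trimmed to the nearest multiple" means the nearest lower multiple; the function itself is illustrative, not a braindecode API:

```python
# Illustrative sketch (not a braindecode API) of two shape rules from
# the parameter table: the patch_size length trim and the patch
# embedding dimension.
def codebrain_shapes(n_times: int, n_chans: int,
                     patch_size: int = 200, s4_lmax: int = 570) -> dict:
    # Assumption: the trim drops the ragged tail (floor to a multiple).
    trimmed = (n_times // patch_size) * patch_size
    return {
        "trimmed_n_times": trimmed,
        "n_patches": trimmed // patch_size,
        "embed_dim": s4_lmax // n_chans,  # per the s4_lmax row above
    }
```

For example, with 22 channels and 1050 samples at the defaults, 1000 samples are kept, giving 5 patches and an embedding dimension of 25.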
+ ## References
+
+ 1. Yi Ding, Xuyang Chen, Yong Li, Rui Yan, Tao Wang, Le Wu (2025). CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks. https://arxiv.org/abs/2506.09110
 
 ## Citation
 
+ Cite the original architecture paper (see *References* above) and braindecode:
 
 ```bibtex
 @article{aristimunha2025braindecode,