---
license: bsd-3-clause
library_name: braindecode
pipeline_tag: feature-extraction
tags:
- eeg
- biosignal
- pytorch
- neuroscience
- braindecode
- convolutional
- sleep-staging
---

# USleep

Sleep staging architecture from Perslev et al. (2021).

> **Architecture-only repository.** This repo documents the
> `braindecode.models.USleep` class. **No pretrained weights are
> distributed here** — instantiate the model and train it on your own
> data, or fine-tune from a published foundation-model checkpoint
> separately.

## Quick start

```bash
pip install braindecode
```

```python
from braindecode.models import USleep

model = USleep(
    n_chans=2,
    sfreq=100,
    input_window_seconds=30.0,
    n_outputs=5,
)
```

The signal-shape arguments above are example defaults — adjust them
to match your recording.
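
As a quick sanity check, you can run a dummy batch through the model. This
is an illustrative sketch (random data, shapes matching the arguments
above); the exact output shape may differ between braindecode versions:

```python
import torch

# (batch, n_chans, n_times) with n_times = sfreq * input_window_seconds
x = torch.randn(4, 2, int(100 * 30.0))
with torch.no_grad():
    y = model(x)
print(y.shape)  # per-epoch class scores; verify against your braindecode version
```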

## Documentation

- Full API reference (parameters, references, architecture figure):
  <https://braindecode.org/stable/generated/braindecode.models.USleep.html>
- Interactive browser with live instantiation:
  <https://huggingface.co/spaces/braindecode/model-explorer>
- Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/usleep.py#L14>

## Architecture description

The block below is the rendered class docstring (parameters,
references, architecture figure where available).

<div class='bd-doc'><main>
<p>Sleep staging architecture from Perslev et al. (2021) <a class="brackets" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>.</p>
<span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#5cb85c;color:white;font-size:11px;font-weight:600;margin-right:4px;">Convolution</span>
<figure class="align-center">
<img alt="USleep Architecture" src="https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41746-021-00440-5/MediaObjects/41746_2021_440_Fig2_HTML.png" />
<figcaption>
<p>Figure: U-Sleep consists of an encoder (left) which encodes the input signals into dense feature representations, a decoder (middle) which projects the learned features back into the input space to generate a dense sleep-stage representation, and a specially designed segment classifier (right) which generates sleep stages at a chosen temporal resolution.</p>
</figcaption>
</figure>
<p><strong>Architectural Overview</strong></p>
<p>U-Sleep is a <strong>fully convolutional</strong>, feed-forward encoder-decoder with a <em>segment classifier</em> head for time-series <strong>segmentation</strong> (sleep staging). It maps multi-channel PSG (EEG+EOG) to a <em>dense, high-frequency</em> per-sample representation, then aggregates it into fixed-length stage labels (e.g., 30 s). The network processes arbitrarily long inputs in <strong>one forward pass</strong> (after resampling to 128 Hz), allowing whole-night hypnograms to be produced in seconds.</p>
<ul class="simple">
<li><p>(i) <code>_EncoderBlock</code> extracts progressively deeper temporal features at lower resolution;</p></li>
<li><p>(ii) <code>_Decoder</code> upsamples and fuses encoder features via U-Net-style skips to recover a per-sample stage map;</p></li>
<li><p>(iii) the segment classifier mean-pools over the target epoch length and applies two pointwise convolutions to yield per-epoch probabilities; it is integrated directly into the <code>USleep</code> class. A sketch of each component follows below.</p></li>
</ul>
<p><strong>Macro Components</strong></p>
<ul>
<li><p>Encoder <code>_EncoderBlock</code> <strong>(multi-scale temporal feature extractor; downsampling x2 per block)</strong></p>
<blockquote>
<ul class="simple">
<li><p><em>Operations.</em></p></li>
<li><p><strong>Conv1d</strong> (<code>torch.nn.Conv1d</code>) with kernel <code>9</code> (stride <code>1</code>, no dilation)</p></li>
<li><p><strong>ELU</strong> (<code>torch.nn.ELU</code>)</p></li>
<li><p><strong>Batch norm</strong> (<code>torch.nn.BatchNorm1d</code>)</p></li>
<li><p><strong>Max pool</strong> (<code>torch.nn.MaxPool1d</code>, <code>kernel=2, stride=2</code>).</p></li>
</ul>
<p>Filters grow with depth by a factor of <code>sqrt(2)</code> (starting at <code>c_1=5</code>); each block exposes a <strong>skip</strong> (its pre-pooling activation) to the matching decoder block.
<em>Role.</em> Slow, uniform downsampling preserves early information while expanding the effective temporal context over minutes, which is foundational for robust cross-cohort staging. See the sketch just below this list.</p>
</blockquote>
</li>
</ul>
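<p>A minimal PyTorch sketch of one encoder block under the assumptions above (kernel 9, ELU, batch norm, x2 max pool); the class name and padding choice are illustrative, not braindecode's exact implementation:</p>
<pre class="literal-block">import torch.nn as nn

class EncoderBlockSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel=9):
        super().__init__()
        # conv, ELU, batch norm; the pre-pooling activation is kept
        # as the skip for the matching decoder block
        self.conv = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel, padding=kernel // 2),
            nn.ELU(),
            nn.BatchNorm1d(out_ch),
        )
        self.pool = nn.MaxPool1d(kernel_size=2, stride=2)

    def forward(self, x):
        skip = self.conv(x)          # pre-pooling activation (skip connection)
        return self.pool(skip), skip
</pre>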
<ul>
<li><p>Decoder <code>_DecoderBlock</code> <strong>(progressive upsampling + skip fusion into a high-frequency map; 12 blocks; upsampling x2 per block)</strong>; a sketch follows this list.</p>
<blockquote>
<ul class="simple">
<li><p><em>Operations.</em></p></li>
<li><p><strong>Nearest-neighbor upsample</strong> (<code>torch.nn.Upsample</code>, x2)</p></li>
<li><p><strong>Conv2d</strong> (<code>torch.nn.Conv2d</code>, kernel <code>2</code>)</p></li>
<li><p><strong>ELU</strong> (<code>torch.nn.ELU</code>)</p></li>
<li><p><strong>Batch norm</strong> (<code>torch.nn.BatchNorm2d</code>)</p></li>
<li><p><strong>Concatenate</strong> with the encoder skip at the same temporal scale (<code>torch.cat</code>)</p></li>
<li><p><strong>Conv2d</strong> (<code>torch.nn.Conv2d</code>)</p></li>
<li><p><strong>ELU</strong> (<code>torch.nn.ELU</code>)</p></li>
<li><p><strong>Batch norm</strong> (<code>torch.nn.BatchNorm2d</code>).</p></li>
</ul>
</blockquote>
</li>
</ul>
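<p>An illustrative decoder-block sketch following the listed operations (upsample, conv, skip concatenation, second conv); 1-D ops are used here for readability, whereas braindecode applies the equivalent 2-D ops:</p>
<pre class="literal-block">import torch
import torch.nn as nn

class DecoderBlockSketch(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2)  # nearest-neighbor by default
        self.conv1 = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=2, padding=1),
            nn.ELU(), nn.BatchNorm1d(out_ch))
        self.conv2 = nn.Sequential(
            nn.Conv1d(2 * out_ch, out_ch, kernel_size=2, padding=1),
            nn.ELU(), nn.BatchNorm1d(out_ch))

    def forward(self, x, skip):
        x = self.conv1(self.up(x))
        x = x[..., :skip.shape[-1]]      # crop to the skip's temporal length
        x = torch.cat([x, skip], dim=1)  # fuse encoder features (skip fusion)
        return self.conv2(x)
</pre>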
<p><strong>Output</strong>: a multi-class, <strong>high-frequency</strong> per-sample representation aligned to the input rate (128 Hz).</p>
<ul>
<li><p><strong>Segment classifier</strong>, integrated into <code>braindecode.models.USleep</code> <strong>(aggregation to fixed epochs)</strong>; sketched below.</p>
<blockquote>
<ul class="simple">
<li><p><em>Operations.</em></p></li>
<li><p><strong>Mean-pool</strong> (<code>torch.nn.AvgPool2d</code>) per class with kernel = epoch length <em>i</em> and stride <em>i</em></p></li>
<li><p><strong>1x1 conv</strong> (<code>torch.nn.Conv2d</code>)</p></li>
<li><p><strong>ELU</strong> (<code>torch.nn.ELU</code>)</p></li>
<li><p><strong>1x1 conv</strong> (<code>torch.nn.Conv2d</code>) producing <code>(T, K)</code> (epochs x stages).</p></li>
</ul>
</blockquote>
</li>
</ul>
<p><strong>Role</strong>: learns a <strong>non-linear</strong> weighted combination over each 30-s window (unlike U-Time's linear combiner).</p>
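<p>A hedged sketch of this head (mean-pool per epoch, then two pointwise convolutions); 1-D ops, names, and the exact ordering are illustrative:</p>
<pre class="literal-block">import torch.nn as nn

def segment_classifier_sketch(n_feats, n_classes, epoch_len):
    # epoch_len = samples per epoch, e.g. 30 s x 128 Hz = 3840
    return nn.Sequential(
        nn.AvgPool1d(kernel_size=epoch_len, stride=epoch_len),  # one bin per epoch
        nn.Conv1d(n_feats, n_feats, kernel_size=1),             # pointwise conv
        nn.ELU(),
        nn.Conv1d(n_feats, n_classes, kernel_size=1),           # (batch, K, T) scores
    )
</pre>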
<p><strong>Convolutional Details</strong></p>
<ul>
<li><p><strong>Temporal (where time-domain patterns are learned).</strong></p>
<p>All convolutions are <strong>1-D along time</strong>; depth (12 levels) plus pooling yields an extensive receptive field (reported sensitivity to ±6.75 min around each epoch; theoretical field ≈ 9.6 min at the deepest layer). The decoder restores sample-level resolution before epoch aggregation.</p>
</li>
<li><p><strong>Spatial (how channels are processed).</strong></p>
<p>Convolutions mix across the <em>channel</em> dimension jointly with time (there is no separate spatial operator). The system is <strong>montage-agnostic</strong> (any reasonable EEG/EOG pair) and was trained across diverse cohorts and protocols, supporting robustness to channel placement and hardware differences.</p>
</li>
<li><p><strong>Spectral (how frequency content is captured).</strong></p>
<p>No explicit Fourier or wavelet transform is used; the <strong>stack of temporal convolutions</strong> acts as a learned filter bank whose effective bandwidth grows with depth. The high-frequency decoder output (128 Hz) retains fine temporal detail for the segment classifier.</p>
</li>
</ul>
<p><strong>Attention / Sequential Modules</strong></p>
<p>U-Sleep contains <strong>no attention or recurrent units</strong>; it is a <em>pure</em> feed-forward, fully convolutional segmentation network inspired by U-Net/U-Time, favoring training stability and cross-dataset portability.</p>
<p><strong>Additional Mechanisms</strong></p>
<ul class="simple">
<li><p><strong>U-Net lineage with a task-specific head.</strong> U-Sleep extends U-Time by being <strong>deeper</strong> (12 vs. 4 levels), switching ReLU to <strong>ELU</strong>, using uniform pooling (2) at all depths, and replacing the linear combiner with a <strong>two-layer</strong> pointwise head, improving capacity and resilience across datasets.</p></li>
<li><p><strong>Arbitrary-length inference.</strong> Thanks to full convolutionality and a tiling-free design, entire nights can be staged in a single pass on commodity hardware. Inputs shorter than ≈ 17.5 min may reduce performance by limiting long-range context.</p></li>
<li><p><strong>Complexity scaling (alpha).</strong> Filter counts can be adjusted by a global <strong>complexity factor</strong> to trade accuracy against memory (as described in the paper's topology table).</p></li>
</ul>
<p><strong>Usage and Configuration</strong></p>
<ul class="simple">
<li><p><strong>Practice.</strong> Resample the PSG to <strong>128 Hz</strong> and provide at least two channels (one EEG, one EOG). Choose the epoch length <em>i</em> (typically 30 s) and make the training windows long enough to exploit the model's receptive field (e.g., chunks of ≥ 17.5 min). An example preprocessing sketch follows this list.</p></li>
</ul>
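<p>Input preparation could look roughly like this (an illustrative MNE-based sketch; the file name is a placeholder, and <code>raw</code> is assumed to contain at least one EEG and one EOG channel):</p>
<pre class="literal-block">import mne

raw = mne.io.read_raw_edf("night.edf", preload=True)  # hypothetical recording
raw.pick(["eeg", "eog"])  # keep EEG + EOG channels (the model is montage-agnostic)
raw.resample(128.0)       # U-Sleep operates at 128 Hz
# then cut the recording into long training chunks (>= 17.5 min)
</pre>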
<section id="parameters">
<h2>Parameters</h2>
<dl class="simple">
<dt>n_chans<span class="classifier">int</span></dt>
<dd><p>Number of EEG or EOG channels. Set to 2 in <a class="brackets" href="#footnote-1" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a> (1 EEG, 1 EOG).</p>
</dd>
<dt>sfreq<span class="classifier">float</span></dt>
<dd><p>EEG sampling frequency. Set to 128 in <a class="brackets" href="#footnote-1" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>.</p>
</dd>
<dt>depth<span class="classifier">int</span></dt>
<dd><p>Number of conv blocks in the encoder (equivalently, the number of x2 max pools). Note: each block halves the temporal dimension of the features.</p>
</dd>
<dt>n_time_filters<span class="classifier">int</span></dt>
<dd><p>Initial number of convolutional filters. Set to 5 in <a class="brackets" href="#footnote-1" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>.</p>
</dd>
<dt>complexity_factor<span class="classifier">float</span></dt>
<dd><p>Multiplicative factor for the number of channels at each layer of the U-Net. Set to 2 in <a class="brackets" href="#footnote-1" id="footnote-reference-5" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>.</p>
</dd>
<dt>with_skip_connection<span class="classifier">bool</span></dt>
<dd><p>If True, use skip connections in the decoder blocks.</p>
</dd>
<dt>n_outputs<span class="classifier">int</span></dt>
<dd><p>Number of outputs/classes. Set to 5.</p>
</dd>
<dt>input_window_seconds<span class="classifier">float</span></dt>
<dd><p>Size of the input, in seconds. Set to 30 in <a class="brackets" href="#footnote-1" id="footnote-reference-6" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>.</p>
</dd>
<dt>time_conv_size_s<span class="classifier">float</span></dt>
<dd><p>Size of the temporal convolution kernel, in seconds. Set to 9 / 128 in <a class="brackets" href="#footnote-1" id="footnote-reference-7" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>.</p>
</dd>
<dt>ensure_odd_conv_size<span class="classifier">bool</span></dt>
<dd><p>If True and the convolutional kernel size is an even number, one is added to make it odd so that the decoder blocks can work. This can be useful when using sampling rates other than 128 or 100 Hz.</p>
</dd>
<dt>activation<span class="classifier">nn.Module, default=nn.ELU</span></dt>
<dd><p>Activation function class to apply. Should be a PyTorch activation module class like <code>nn.ReLU</code> or <code>nn.ELU</code>. Default is <code>nn.ELU</code>.</p>
</dd>
</dl>
</section>
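<p>Putting the documented settings together, the paper's configuration can be reproduced roughly as follows (a sketch assembled from the parameter list above; verify defaults against the API reference):</p>
<pre class="literal-block">from braindecode.models import USleep

model = USleep(
    n_chans=2,                  # 1 EEG + 1 EOG, as in [1]
    sfreq=128.0,                # sampling rate used in [1]
    input_window_seconds=30.0,  # one scoring epoch
    n_outputs=5,                # sleep stages
    time_conv_size_s=9 / 128,   # kernel of 9 samples at 128 Hz
)
</pre>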
<section id="references">
<h2>References</h2>
<aside class="footnote-list brackets">
<aside class="footnote brackets" id="footnote-1" role="doc-footnote">
<span class="label"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></span>
<span class="backrefs">(<a role="doc-backlink" href="#footnote-reference-1">1</a>,<a role="doc-backlink" href="#footnote-reference-2">2</a>,<a role="doc-backlink" href="#footnote-reference-3">3</a>,<a role="doc-backlink" href="#footnote-reference-4">4</a>,<a role="doc-backlink" href="#footnote-reference-5">5</a>,<a role="doc-backlink" href="#footnote-reference-6">6</a>,<a role="doc-backlink" href="#footnote-reference-7">7</a>)</span>
<p>Perslev, M., Darkner, S., Kempfner, L., Nikolic, M., Jennum, P. J., &amp; Igel, C. (2021). U-Sleep: resilient high-frequency sleep staging. <em>npj Digital Medicine</em>, 4, 72. <a class="reference external" href="https://doi.org/10.1038/s41746-021-00440-5">https://doi.org/10.1038/s41746-021-00440-5</a>. Reference implementation: <a class="reference external" href="https://github.com/perslev/U-Time/blob/master/utime/models/usleep.py">https://github.com/perslev/U-Time/blob/master/utime/models/usleep.py</a></p>
</aside>
</aside>
<p><strong>Hugging Face Hub integration</strong></p>
<p>When the optional <span class="docutils literal">huggingface_hub</span> package is installed, all models automatically gain the ability to be pushed to and loaded from the Hugging Face Hub. Install with:</p>
<pre class="literal-block">pip install braindecode[hub]</pre>
<p><strong>Pushing a model to the Hub:</strong></p>
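<p>A sketch of the push step; the method name and arguments follow the common <code>huggingface_hub</code> mixin convention and are assumptions here, so consult the braindecode tutorial for the authoritative call:</p>
<pre class="literal-block">model = USleep(n_chans=2, sfreq=128.0, input_window_seconds=30.0, n_outputs=5)
# "your-username/usleep-sleep-staging" is a placeholder repo id
model.push_to_hub("your-username/usleep-sleep-staging")
</pre>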
<p><strong>Loading a model from the Hub:</strong></p>
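<p>Correspondingly, loading (same caveat about the exact API):</p>
<pre class="literal-block">from braindecode.models import USleep

model = USleep.from_pretrained("your-username/usleep-sleep-staging")
model.eval()
</pre>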
<p><strong>Extracting features and replacing the head:</strong></p>
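<p>One common pattern, sketched under the assumption that the loaded model exposes its classification head as <code>final_layer</code> (verify the attribute name for your braindecode version):</p>
<pre class="literal-block">import torch.nn as nn

model.final_layer = nn.Identity()  # drop the head to obtain dense features
# ...then train a new head, e.g. for a different label set:
# model.final_layer = nn.Conv1d(n_feats, n_new_classes, kernel_size=1)
</pre>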
<p><strong>Saving and restoring full configuration:</strong></p>
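<p>Local save/restore follows the same mixin-style pattern (a sketch; the exact helper names are an assumption):</p>
<pre class="literal-block">model.save_pretrained("./usleep-checkpoint")  # weights + full configuration
restored = USleep.from_pretrained("./usleep-checkpoint")
</pre>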
<p>All model parameters (both EEG-specific and model-specific, such as dropout rates, activation functions, and number of filters) are automatically saved to the Hub and restored when loading.</p>
<p>See the braindecode tutorial on loading pretrained models for a complete walkthrough.</p>
</section>
</main>
</div>

## Citation

Please cite both the original paper for this architecture (see the
*References* section above) and braindecode:

```bibtex
@article{aristimunha2025braindecode,
  title   = {Braindecode: a deep learning library for raw electrophysiological data},
  author  = {Aristimunha, Bruno and others},
  journal = {Zenodo},
  year    = {2025},
  doi     = {10.5281/zenodo.17699192},
}
```

## License

BSD-3-Clause for the model code (matching braindecode). Weights that
you fine-tune from a pretrained checkpoint inherit the license of that
checkpoint and its training corpus.