Super-Linear: A Mixture of Experts Time Series Forecasting Model

SuperLinear is a time series forecasting model built on a Mixture of Experts (MoE) architecture. An FFT-based gating network analyzes each input in the frequency domain and routes it to the most relevant experts.

Model Architecture

The SuperLinear model consists of:

Sparse Mixture of Experts (MoE): Routes inputs to the top-k most relevant experts
FFT-based Gating Network: Uses frequency domain analysis to determine expert routing
Frequency-specific Experts: Pre-trained experts specialized for different temporal patterns
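
The routing idea can be illustrated with a small standard-library sketch. It is not the SuperLinear implementation: the expert periods, the naive DFT, and the rule of scoring each expert by the spectral magnitude at its own frequency bin are all assumptions made for illustration. A trained gating network would normally learn its scores from the spectrum, and a real model would use an optimized FFT rather than the O(n^2) transform below.

```python
import cmath
import math

def magnitude_spectrum(series):
    """Return |DFT| for the non-negative frequency bins of a real-valued series (naive O(n^2) DFT)."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series)))
        for k in range(n // 2 + 1)
    ]

def top_k_gate(scores, k):
    """Keep the k largest scores and softmax over them; unselected experts get no weight."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in chosen)  # subtract the max for numerical stability
    exp_scores = {i: math.exp(scores[i] - peak) for i in chosen}
    total = sum(exp_scores.values())
    return {i: e / total for i, e in exp_scores.items()}

# Hypothetical setup: one expert per candidate period, measured in samples.
expert_periods = [4, 8, 16, 32]
seq_len = 64
series = [math.sin(2 * math.pi * t / 16) for t in range(seq_len)]  # period-16 signal

spectrum = magnitude_spectrum(series)
# Assumed scoring rule: spectral magnitude at the bin matching each expert's period.
scores = [spectrum[seq_len // p] for p in expert_periods]
weights = top_k_gate(scores, k=2)  # sparse routing: only 2 of 4 experts are active
best = max(weights, key=weights.get)
print("dominant expert period:", expert_periods[best])
```

Because the input is a pure period-16 sinusoid, nearly all of the gate weight lands on the expert assigned to that period, which is the behavior a frequency-aware router is meant to show.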

Key Features

Adaptive Expert Selection: Dynamic routing based on input characteristics
Frequency-aware Processing: Uses FFT analysis to select experts
Auto-regressive Forecasting: Supports long-horizon forecasting
Multi-scale Processing: Handles various sequence lengths through resampling
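
As a rough illustration of the auto-regressive item, the sketch below rolls a fixed-horizon forecaster forward until the requested horizon is covered. The function `autoregressive_forecast` and the stand-in `drift_predict` forecaster are hypothetical names introduced here; how SuperLinear performs its own rollout internally is not described in this card.

```python
def autoregressive_forecast(predict, history, seq_len, pred_len, horizon):
    """Roll a fixed-horizon forecaster forward until `horizon` steps are produced."""
    context = list(history)
    forecast = []
    while len(forecast) < horizon:
        window = context[-seq_len:]        # most recent lookback window
        step = predict(window)[:pred_len]  # one block of at most pred_len predictions
        forecast.extend(step)
        context.extend(step)               # feed predictions back in as input
    return forecast[:horizon]              # trim any overshoot from the last block

# Hypothetical stand-in for a trained model: continue the last observed slope.
def drift_predict(window, pred_len=4):
    slope = window[-1] - window[-2]
    return [window[-1] + slope * (i + 1) for i in range(pred_len)]

history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
result = autoregressive_forecast(drift_predict, history, seq_len=8, pred_len=4, horizon=10)
print(result)
```

Each pass predicts one block, appends it to the context, and repeats, so errors in early blocks carry into later ones. That compounding is the usual trade-off of auto-regressive rollout against predicting the full horizon in one shot.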

Usage

from transformers import AutoModelForCausalLM
import torch

# Load the model; trust_remote_code=True lets transformers run the custom model code shipped in the repository
model = AutoModelForCausalLM.from_pretrained("SequentialLearning/SuperLinear", trust_remote_code=True)
model.eval()  # switch to inference mode

# Prepare input time series data
# Shape: [batch_size, channel, sequence_length] or [batch_size, sequence_length]
input_data = torch.randn(1, 1, 512)  # random placeholder; replace with your own series

# Generate predictions
with torch.no_grad():
    outputs = model(inputs_embeds=input_data, pred_len=96, get_prob=True)
    preds = outputs.logits  # Predicted values
    probs = outputs.attentions  # Expert probabilities are returned in this field

Configuration

Key parameters:

train_seq_len: Input sequence length used during training (default: 512)
train_pred_len: Prediction length used during training (default: 96)
top_k_experts: Number of experts selected per input (default: 12)
use_fft: Whether to use FFT-based gating (default: True)
freq_experts: Frequency-specific expert configuration
moe_temp: Temperature for expert selection during inference (default: 1)
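
To give a feel for what a temperature such as moe_temp typically controls, the sketch below applies a temperature-scaled softmax to a set of made-up gate logits. Dividing the logits by the temperature is the common convention and is assumed here; the card does not state exactly how SuperLinear applies moe_temp, and `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; assumes temperature > 0."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

gate_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four experts

sharp = softmax_with_temperature(gate_logits, temperature=0.5)
neutral = softmax_with_temperature(gate_logits, temperature=1.0)
flat = softmax_with_temperature(gate_logits, temperature=2.0)

for name, probs in (("0.5", sharp), ("1.0", neutral), ("2.0", flat)):
    print("temperature", name, [round(p, 3) for p in probs])
```

Under this convention, a temperature below 1 concentrates weight on the top-scoring expert, while a temperature above 1 spreads weight more evenly across experts.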

Citation

If you use SuperLinear in your research, please cite:

@article{nochumsohn2025super,
  title={Super-Linear: A Lightweight Pretrained Mixture of Linear Experts for Time Series Forecasting},
  author={Nochumsohn, Liran and Marshanski, Raz and Zisling, Hedi and Azencot, Omri},
  journal={arXiv preprint arXiv:2509.15105},
  year={2025}
}

License

This model is released under the MIT License.
