A trained geometric sidecar checkpoint for the Dual-System Architecture.
The sidecar (162M params) adds structured reasoning on top of the frozen Gemma 4 E2B-IT backbone:
| Component | Description | Params |
|---|---|---|
| GeometricProcessor | 4-layer causal transformer with KV caching producing additive geo_logits | ~148M |
| LatentPlanner | VAE with LaDiR-style diffusion ELBO for planning latent z₀ | ~14M |
| EBM Critic | Energy-based model scoring geometric sequence quality | ~0.5M |
| Alpha Gate | Learned sigmoid gate (α=0.537) blending sidecar corrections | 1 |
Forward pass: final_logits = base_logits + α · geo_logits
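The blending rule above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: `sigmoid`, `blend_logits`, and `gate_param` are hypothetical names, and it assumes the gate is the sigmoid of one learned scalar, as the table's "Learned sigmoid gate" with a single parameter suggests.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def blend_logits(base_logits, geo_logits, alpha):
    """Return base_logits + alpha * geo_logits, element by element."""
    return [b + alpha * g for b, g in zip(base_logits, geo_logits)]


# Hypothetical raw gate parameter, chosen so that sigmoid(gate_param) = 0.537.
gate_param = math.log(0.537 / (1.0 - 0.537))
alpha = sigmoid(gate_param)

# Toy logits over a 3-token vocabulary (illustrative values only).
base_logits = [2.0, 0.5, -1.0]
geo_logits = [-1.0, 1.0, 0.0]

final_logits = blend_logits(base_logits, geo_logits, alpha)
```

With α = 0 the backbone's logits pass through unchanged, so in this formulation the gate controls how strongly the sidecar's correction shifts each token's score.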
google/gemma-4-E2B-it (frozen, ~2.6B params)
# Install
!pip install transformers accelerate huggingface_hub torch
# Clone the repo
!git clone https://github.com/Bender1011001/dual-system-architecture.git
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Assumes dual_system_v2 is importable, e.g. by running from inside the cloned repo
from dual_system_v2 import DualSystemV2, SidecarConfig
from huggingface_hub import hf_hub_download
# Download sidecar
sidecar_path = hf_hub_download(
    repo_id="Bender1011001/gemma4-dualsystem-sidecar",
    filename="sidecar_epoch2.pt",
)
# Load checkpoint config (guarantees weight compatibility).
# weights_only=False unpickles arbitrary Python objects, so only load checkpoints you trust.
ckpt = torch.load(sidecar_path, map_location="cuda", weights_only=False)
config = SidecarConfig(**ckpt["config"])
# Load backbone
backbone = AutoModelForCausalLM.from_pretrained(
    ckpt["backbone"], torch_dtype=torch.bfloat16, device_map="cuda"
)
# Freeze the backbone so only the sidecar carries learned corrections
for p in backbone.parameters():
    p.requires_grad = False
# Build and load sidecar
model = DualSystemV2(backbone=backbone, config=config).cuda().eval()
model.geo_processor.load_state_dict(ckpt["geo_state"])
model.ebm_critic.load_state_dict(ckpt["ebm_state"])
model.latent_planner.load_state_dict(ckpt["planner_state"])
# Run a forward pass on a tokenized prompt
tokenizer = AutoTokenizer.from_pretrained(ckpt["backbone"])
result = model(input_ids=tokenizer("Hello", return_tensors="pt").input_ids.cuda())
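After loading, the parameter counts in the table and the frozen state of the backbone can be sanity-checked with a small helper. This is a sketch under assumptions: `count_params` is a hypothetical name, and it only relies on the standard PyTorch module interface (`parameters()`, `numel()`, `requires_grad`).

```python
def count_params(module, trainable_only=False):
    """Sum the element counts of a module's parameters.

    Works with any object exposing the torch.nn.Module-style
    `parameters()` iterator. With trainable_only=True, parameters
    whose requires_grad is False are skipped.
    """
    return sum(
        p.numel()
        for p in module.parameters()
        if p.requires_grad or not trainable_only
    )
```

For example, `count_params(model.geo_processor)` should land near the ~148M listed in the table if the checkpoint matches, and `count_params(backbone, trainable_only=True)` should be 0 once the backbone has been frozen.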
@misc{dual-system-2026,
title={Dual-System Architecture: Geometric Sidecar Modules for Language Model Enhancement},
author={Bender1011001},
year={2026},
url={https://github.com/Bender1011001/dual-system-architecture}
}