Opus 4.8 Recreation – 1B Light Student (Baseline)
OpenMythos Recurrent-Depth Transformer (1B-scale student) trained as part of a governed reconstruction of Claude 4.8-class reasoning capabilities.
This is the baseline 1B-light student published under the GLASSEYE org (finetunedglasseye branding). A full Mythos Opus 4.8 rebuild (3B/70B/175B with Claude distillation) is a separate phase.
Training Run Summary
- Steps: 2,000
- Mode: Light (nuclear memory mode for A10G)
- Distillation: Disabled (`--no-use-distillation`)
- Data: FineWeb-Edu (`sample-10BT`, streaming)
- Hardware: Modal A10G (24 GB VRAM)
- Unroll during training: 1 loop (light mode)
- Architecture: `mythos_1b` variant with MoE replaced by dense Expert FFN
Key Architectural Features
- Recurrent-Depth Transformer (Prelude → Recurrent Block with Parcae LTI injection + MLA + ACT halting → Coda)
- LoRA depth adaptation
- Adaptive Computation Time (ACT) halting
- Multi-Latent Attention (MLA)
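
The ACT halting listed above can be illustrated with a toy, framework-free sketch. It follows the general Adaptive Computation Time recipe: accumulate a per-loop halting probability, stop once the total reaches `1 - eps`, and give the final loop the remaining weight. It is not taken from the OpenMythos code, and `act_unroll`, `step_fn`, `halt_fn`, `max_loops`, and `eps` are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple


def act_unroll(
    state: List[float],
    step_fn: Callable[[List[float]], List[float]],
    halt_fn: Callable[[List[float]], float],
    max_loops: int = 8,
    eps: float = 0.01,
) -> Tuple[List[float], int]:
    """Apply step_fn repeatedly and mix the states using ACT halting weights.

    Toy illustration only: states are plain lists of floats, and halt_fn
    returns a halting probability in [0, 1] for the current state.
    """
    output = [0.0] * len(state)
    cumulative = 0.0  # halting probability spent so far
    for loop in range(1, max_loops + 1):
        state = step_fn(state)  # one pass through the recurrent block
        h = halt_fn(state)
        # Halt when the budget is (almost) used up or the loop cap is hit;
        # the final state receives whatever weight remains.
        if cumulative + h >= 1.0 - eps or loop == max_loops:
            remainder = 1.0 - cumulative
            output = [o + remainder * s for o, s in zip(output, state)]
            return output, loop
        output = [o + h * s for o, s in zip(output, state)]
        cumulative += h
    return output, 0  # only reached when max_loops < 1


# Each loop adds 1.0 to the state; a constant halting probability of 0.5
# means the budget is exhausted on the second loop.
mixed, loops_used = act_unroll([0.0], lambda s: [x + 1.0 for x in s], lambda s: 0.5)
```

With `max_loops=1` the function always returns after a single pass, which mirrors the one-loop unroll used for this training run.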
Loading the Model
Important: Weights were saved after a light-mode MoE→dense Expert swap. Apply the same swap before load_state_dict, or keys will not match.

    import json

    import torch
    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file

    from open_mythos import OpenMythos, MythosConfig
    from open_mythos.main import Expert

    REPO = "GLASSEYE/opus-4.8-recreation-1b-light"

    # Fetch the config and weights from the Hub (cached after the first call).
    cfg_path = hf_hub_download(REPO, "config.json")
    weights_path = hf_hub_download(REPO, "model.safetensors")

    with open(cfg_path) as f:
        cfg = MythosConfig(**json.load(f))

    model = OpenMythos(cfg)

    # Light-mode MoE bypass (required): replace each MoE FFN with a dense Expert.
    # Iterate over a snapshot so that replacing submodules cannot disturb the loop.
    for module in list(model.modules()):
        if hasattr(module, "block") and hasattr(module.block, "ffn"):
            if "MoEFFN" in str(type(module.block.ffn)):
                module.block.ffn = Expert(cfg.dim, cfg.dim * 4 // 3)

    # Keys match only after the swap above has been applied.
    state_dict = load_file(weights_path)
    model.load_state_dict(state_dict)
    model.eval()

Alternative: load pytorch_model.bin with torch.load(..., weights_only=True) instead of safetensors.
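
If `load_state_dict` still reports a key mismatch, comparing the two sets of key names usually shows whether the MoE→dense swap was missed. The helper below is a minimal sketch with a hypothetical name (`diff_state_dict_keys`), and the key strings in the example are invented for illustration rather than taken from the checkpoint. The terms "missing" and "unexpected" follow the wording PyTorch uses in its own mismatch errors.

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Dict[str, List[str]]:
    """Return the keys the model expects but the checkpoint lacks, and vice versa."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return {
        "missing": sorted(model_set - ckpt_set),      # in the model, not in the checkpoint
        "unexpected": sorted(ckpt_set - model_set),   # in the checkpoint, not in the model
    }


# Hypothetical key names: a model that still has MoE experts versus a
# checkpoint saved after the dense swap.
report = diff_state_dict_keys(
    ["embed.weight", "block.ffn.experts.0.w1.weight"],
    ["embed.weight", "block.ffn.w1.weight"],
)
```

With the loading snippet above, the call would be `diff_state_dict_keys(model.state_dict().keys(), state_dict.keys())`; both lists should be empty once the swap has been applied.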
Governance & Provenance
This model was produced under strict governance controls:
- `I_APPROVE_CLAUDE_4_8_DISTILLATION` approval phrase + execution lock enforcement
- Hard spend caps and pre-flight budget checks
- Full audit trail via `claude_teacher_recursive_trainer`
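
As a rough illustration of how controls like these can be combined, the sketch below gates a run on the approval phrase, an execution lock flag, and a spend cap. Only the approval phrase comes from this card. `RunRequest`, `preflight`, the field names, and the dollar figures are hypothetical and do not describe the actual pipeline.

```python
from dataclasses import dataclass

# Approval phrase as stated on this card; everything else is illustrative.
APPROVAL_PHRASE = "I_APPROVE_CLAUDE_4_8_DISTILLATION"


@dataclass(frozen=True)
class RunRequest:
    approval: str              # phrase supplied by the operator
    lock_held: bool            # whether the execution lock was acquired
    estimated_cost_usd: float  # pre-flight cost estimate for the run


def preflight(req: RunRequest, spend_cap_usd: float) -> None:
    """Raise if any governance check fails; return None when the run may start."""
    if req.approval != APPROVAL_PHRASE:
        raise PermissionError("approval phrase missing or incorrect")
    if not req.lock_held:
        raise PermissionError("execution lock not held")
    if req.estimated_cost_usd > spend_cap_usd:
        raise RuntimeError(
            f"estimated cost {req.estimated_cost_usd:.2f} USD exceeds "
            f"cap {spend_cap_usd:.2f} USD"
        )


# A request that passes all three checks under a hypothetical 50 USD cap.
preflight(RunRequest(APPROVAL_PHRASE, lock_held=True, estimated_cost_usd=12.0), 50.0)
```

Failing closed with an exception means a launcher script cannot proceed by accident when a check is skipped or misconfigured.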
Original training run: cyberviser/opus-4.8-recreation-1b-light. Republished baseline: GLASSEYE/opus-4.8-recreation-1b-light.
Important: This is an independent, open research reconstruction. It is not affiliated with, endorsed by, or sponsored by Anthropic.
Limitations
- 1B-scale student model (significantly smaller than the teacher)
- Trained in aggressive "light" mode with reduced sequence length and loop depth
- No distillation was used in this specific run
- Capabilities are intentionally bounded by the above constraints
Files
| File | Purpose |
|---|---|
| `config.json` | MythosConfig |
| `model.safetensors` | Weights (recommended) |
| `pytorch_model.bin` | Weights (torch.save state_dict) |
| `build_metadata.json` | Run provenance |
Links
- Training script: `training/final_opus_4_8_modal.py`
- Verification: `training/verify_opus_1b_light_load.py`
- Governance: `lab/policies/execution_lock.json`, `lab/policies/active_lab_threat_model.json`
Generated by the ArtificialAutism / 0AI-CyberViser governed distillation pipeline.