Opus 4.8 Recreation – 1B Light Student (Baseline)

OpenMythos Recurrent-Depth Transformer (1B-scale student) trained as part of a governed reconstruction of Claude 4.8-class reasoning capabilities.

This is the baseline 1B-light student published under the GLASSEYE org (finetunedglasseye branding). A full Mythos Opus 4.8 rebuild (3B/70B/175B with Claude distillation) is a separate phase.

Training Run Summary

  • Steps: 2,000
  • Mode: Light ("nuclear" memory-saving mode for the A10G, with reduced sequence length and loop depth)
  • Distillation: Disabled (--no-use-distillation)
  • Data: FineWeb-Edu (sample-10BT, streaming)
  • Hardware: Modal A10G (24 GB VRAM)
  • Unroll during training: 1 loop (light mode)
  • Architecture: mythos_1b variant with MoE replaced by dense Expert FFN

Key Architectural Features

  • Recurrent-Depth Transformer (Prelude → Recurrent Block with Parcae LTI injection + MLA + ACT halting → Coda)
  • LoRA depth adaptation
  • Adaptive Computation Time (ACT) halting
  • Multi-Latent Attention (MLA)
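
The halting behaviour listed above can be illustrated with a toy, scalar-valued sketch. It follows the commonly described ACT scheme (accumulate per-loop halting probabilities, stop once they reach a threshold, and give the final loop the remaining weight). It is not taken from the OpenMythos source: `run_with_act`, `toy_step`, and `toy_halt` are hypothetical names, and the step function is only a stand-in for the recurrent block with its injected prelude output `e`.

```python
import math


def run_with_act(e, step_fn, halt_fn, max_loops=8, threshold=0.99):
    """Loop a recurrent step with ACT-style halting (toy scalar version).

    e        : prelude output, re-injected on every loop
    step_fn  : (h, e) -> next hidden state (stand-in for the recurrent block)
    halt_fn  : h -> halting probability in (0, 1)
    Returns (halting-weighted state, number of loops used).
    """
    if max_loops < 1:
        raise ValueError("max_loops must be at least 1")
    h = e
    cumulative = 0.0  # halting probability spent so far
    output = 0.0      # weighted sum of per-loop states
    for loop in range(1, max_loops + 1):
        h = step_fn(h, e)
        p = halt_fn(h)
        # Halt when the budget is reached or the loop limit is hit;
        # the final loop receives whatever weight is left.
        if cumulative + p >= threshold or loop == max_loops:
            output += (1.0 - cumulative) * h
            break
        cumulative += p
        output += p * h
    return output, loop


def toy_step(h, e, a=0.5, b=0.5):
    # Assumed form: linear carry-over of h, linear injection of e, plus a nonlinearity.
    return a * h + b * e + 0.1 * math.tanh(h)


def toy_halt(h):
    # Sigmoid halting head on the scalar state.
    return 1.0 / (1.0 + math.exp(-h))


state, loops_used = run_with_act(1.0, toy_step, toy_halt, max_loops=4)
```

With `max_loops=1` the function always runs exactly one loop, which mirrors the single-loop unroll used for this light-mode run.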

Loading the Model

Important: The weights were saved after the light-mode swap of the MoE FFN for a dense Expert FFN. Apply the same swap to the model before calling load_state_dict; otherwise the state-dict keys will not match.

import json
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
from open_mythos import OpenMythos, MythosConfig
from open_mythos.main import Expert

REPO = "GLASSEYE/opus-4.8-recreation-1b-light"

cfg_path = hf_hub_download(REPO, "config.json")
weights_path = hf_hub_download(REPO, "model.safetensors")

with open(cfg_path) as f:
    cfg = MythosConfig(**json.load(f))

model = OpenMythos(cfg)

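# Light-mode MoE bypass (required): replace each MoEFFN with the dense Expert
# FFN that the weights were saved with. list() snapshots the module tree so
# that it is not mutated while being iterated.
for module in list(model.modules()):
    if hasattr(module, "block") and hasattr(module.block, "ffn"):
        if "MoEFFN" in str(type(module.block.ffn)):
            module.block.ffn = Expert(cfg.dim, cfg.dim * 4 // 3)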

state_dict = load_file(weights_path)
model.load_state_dict(state_dict)
model.eval()

Alternative: download pytorch_model.bin and load it with torch.load(..., weights_only=True) in place of load_file. The Expert swap above is still required before load_state_dict.
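
If load_state_dict still reports mismatched keys, comparing the two key sets usually shows whether the Expert swap was skipped. The helper below is a minimal sketch in plain Python; `diff_state_dict_keys` is a hypothetical name, and the key strings in the example are made up for illustration rather than taken from the checkpoint.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare parameter names expected by the model with those in a checkpoint.

    "missing"    : keys the model expects but the checkpoint lacks
    "unexpected" : keys the checkpoint has but the model does not expect
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),
        "unexpected": sorted(checkpoint_keys - model_keys),
    }


# Illustrative (made-up) key names: a model still holding an MoE FFN
# versus a checkpoint saved with a dense FFN.
example = diff_state_dict_keys(
    ["embed.weight", "block.ffn.experts.0.up.weight"],
    ["embed.weight", "block.ffn.up.weight"],
)
```

With the real objects, the call would be `diff_state_dict_keys(model.state_dict().keys(), state_dict.keys())`. Both lists being empty means the swap was applied consistently.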

Governance & Provenance

This model was produced under strict governance controls:

  • I_APPROVE_CLAUDE_4_8_DISTILLATION approval phrase + execution lock enforcement
  • Hard spend caps and pre-flight budget checks
  • Full audit trail via claude_teacher_recursive_trainer
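
As a rough illustration of how an approval phrase and a hard spend cap can gate a run, here is a minimal pre-flight sketch. Only the phrase itself comes from this card. `Budget`, `preflight`, and `GovernanceError` are hypothetical names, and nothing here reflects the actual logic of claude_teacher_recursive_trainer or the execution lock policy files.

```python
from dataclasses import dataclass

# Approval phrase as stated in this model card.
APPROVAL_PHRASE = "I_APPROVE_CLAUDE_4_8_DISTILLATION"


class GovernanceError(RuntimeError):
    """Raised when a run fails a pre-flight governance check (hypothetical)."""


@dataclass(frozen=True)
class Budget:
    cap_usd: float          # hard spend cap
    spent_usd: float = 0.0  # amount already spent


def preflight(approval: str, budget: Budget, estimated_cost_usd: float) -> float:
    """Return the budget left after the planned run, or raise before any spend."""
    if approval != APPROVAL_PHRASE:
        raise GovernanceError("approval phrase missing or incorrect")
    if estimated_cost_usd < 0:
        raise ValueError("estimated cost must be non-negative")
    projected = budget.spent_usd + estimated_cost_usd
    if projected > budget.cap_usd:
        raise GovernanceError(
            f"projected spend {projected:.2f} exceeds cap {budget.cap_usd:.2f}"
        )
    return budget.cap_usd - projected
```

The point of the design is that both checks run before any compute is launched, so a missing approval or an over-budget estimate fails fast rather than partway through a run.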

Original training run: cyberviser/opus-4.8-recreation-1b-light. Republished baseline: GLASSEYE/opus-4.8-recreation-1b-light.

Important: This is an independent, open research reconstruction. It is not affiliated with, endorsed by, or sponsored by Anthropic.

Limitations

  • 1B-scale student model (significantly smaller than the teacher)
  • Trained in aggressive "light" mode with reduced sequence length and loop depth
  • No distillation was used in this specific run
  • Capabilities are intentionally bounded by the above constraints

Files

| File | Purpose |
| --- | --- |
| config.json | MythosConfig |
| model.safetensors | Weights (recommended) |
| pytorch_model.bin | Weights (torch.save state_dict) |
| build_metadata.json | Run provenance |
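
After downloading the repository to a local folder, a quick completeness check against the file list above can save a failed load later. This is a small sketch using only the standard library. `missing_files` is a hypothetical helper, and the directory path is whatever location the files were downloaded to.

```python
from pathlib import Path

# File names and purposes as listed in this model card.
EXPECTED_FILES = {
    "config.json": "MythosConfig",
    "model.safetensors": "Weights (recommended)",
    "pytorch_model.bin": "Weights (torch.save state_dict)",
    "build_metadata.json": "Run provenance",
}


def missing_files(snapshot_dir):
    """Return the expected file names that are absent from snapshot_dir."""
    root = Path(snapshot_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every file named on this card is present. Only one of the two weight files is needed for loading.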

Links

  • Training script: training/final_opus_4_8_modal.py
  • Verification: training/verify_opus_1b_light_load.py
  • Governance: lab/policies/execution_lock.json, lab/policies/active_lab_threat_model.json

Generated by the ArtificialAutism / 0AI-CyberViser governed distillation pipeline.

Model size (Safetensors): 0.9B params. Tensor types: C64, F32.