Opus 4.8 Recreation – 1B Light Student (Baseline)

OpenMythos Recurrent-Depth Transformer (1B-scale student) trained as part of a governed reconstruction of Claude 4.8-class reasoning capabilities.

This is the baseline 1B-light student published under the GLASSEYE org (finetunedglasseye branding). A full Mythos Opus 4.8 rebuild (3B/70B/175B with Claude distillation) is a separate phase.

Training Run Summary

Steps: 2,000
Mode: Light (aggressive memory-saving mode for the A10G)
Distillation: Disabled (--no-use-distillation)
Data: FineWeb-Edu (sample-10BT, streaming)
Hardware: Modal A10G (24 GB VRAM)
Unroll during training: 1 loop (light mode)
Architecture: mythos_1b variant with MoE replaced by dense Expert FFN

Key Architectural Features

Recurrent-Depth Transformer: Prelude → Recurrent Block (Parcae LTI injection, MLA, ACT halting) → Coda
LoRA depth adaptation
Adaptive Computation Time (ACT) halting
Multi-Latent Attention (MLA)
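
To make the ACT halting idea concrete, here is a minimal scalar sketch of the generic adaptive-computation-time scheme: a block is applied repeatedly, each pass emits a halting probability, and looping stops once the accumulated probability reaches 1 - eps or a loop cap is hit. This is an illustration under stated assumptions, not the OpenMythos implementation; `act_unroll`, `step_fn`, `halt_fn`, `max_loops`, and `eps` are hypothetical names chosen for this example.

```python
from typing import Callable, List, Tuple


def act_unroll(
    state: float,
    step_fn: Callable[[float], float],
    halt_fn: Callable[[float], float],
    max_loops: int = 8,
    eps: float = 0.01,
) -> Tuple[float, int, List[float]]:
    """Toy scalar ACT loop.

    Returns (halting-weighted state, loops used, per-loop weights).
    """
    weights: List[float] = []
    states: List[float] = []
    cumulative = 0.0  # halting probability accumulated over earlier loops

    for loop in range(1, max_loops + 1):
        state = step_fn(state)   # one pass through the recurrent block
        h = halt_fn(state)       # halting probability for this pass
        states.append(state)

        # Stop when the budget of 1 - eps is used up, or at the loop cap.
        if cumulative + h >= 1.0 - eps or loop == max_loops:
            weights.append(1.0 - cumulative)  # remainder, so weights sum to 1
            break

        weights.append(h)
        cumulative += h

    output = sum(w * s for w, s in zip(weights, states))
    return output, len(weights), weights


# Toy run: each pass adds 1 to the state and always proposes h = 0.4.
out, loops_used, w = act_unroll(0.0, lambda s: s + 1.0, lambda s: 0.4)

# With max_loops=1 (matching the single-loop unroll used in light mode),
# the only pass receives the full weight.
out_single, loops_single, w_single = act_unroll(
    0.0, lambda s: s + 1.0, lambda s: 0.4, max_loops=1
)
```

In a real model the state would be a tensor of hidden activations and the halting probability would come from a learned projection, typically computed per token.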

Loading the Model

Important: Weights were saved after a light-mode MoE→dense Expert swap. Apply the same swap before load_state_dict, or keys will not match.

import json
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
from open_mythos import OpenMythos, MythosConfig
from open_mythos.main import Expert

REPO = "GLASSEYE/opus-4.8-recreation-1b-light"

cfg_path = hf_hub_download(REPO, "config.json")
weights_path = hf_hub_download(REPO, "model.safetensors")

with open(cfg_path) as f:
    cfg = MythosConfig(**json.load(f))

model = OpenMythos(cfg)

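# Light-mode MoE bypass (required): replace each MoE FFN with the dense Expert
# used in training so parameter names match the saved checkpoint.
# list(...) snapshots the module tree before submodules are swapped.
for module in list(model.modules()):
    if hasattr(module, "block") and hasattr(module.block, "ffn"):
        if "MoEFFN" in str(type(module.block.ffn)):
            module.block.ffn = Expert(cfg.dim, cfg.dim * 4 // 3)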

state_dict = load_file(weights_path)
model.load_state_dict(state_dict)
model.eval()

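Alternative: instead of safetensors, load pytorch_model.bin with torch.load(..., weights_only=True) and pass the resulting state dict to load_state_dict. The MoE→dense Expert swap is still required first.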
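
If load_state_dict reports mismatched keys, comparing the two key sets directly usually shows whether the MoE→dense swap was skipped. The helper below is a small sketch that needs only the standard library; `diff_state_dict_keys` and the key names in the demo are hypothetical and are not taken from the actual checkpoint.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) parameter names.

    missing:    names the model expects but the checkpoint lacks.
    unexpected: names the checkpoint has but the model does not.
    """
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)
    unexpected = sorted(ckpt_set - model_set)
    return missing, unexpected


# Hypothetical key names, for illustration only: a model that still has
# its MoE FFN versus a checkpoint saved with a dense FFN.
unswapped_model = [
    "embed.weight",
    "block.ffn.router.weight",
    "block.ffn.experts.0.up.weight",
]
checkpoint = ["embed.weight", "block.ffn.up.weight"]

missing, unexpected = diff_state_dict_keys(unswapped_model, checkpoint)
print("missing from checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
```

With the real objects, the call would be `diff_state_dict_keys(model.state_dict().keys(), state_dict.keys())`. Two empty lists mean the names line up, although tensor shapes can still differ.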

Governance & Provenance

This model was produced under strict governance controls:

I_APPROVE_CLAUDE_4_8_DISTILLATION approval phrase + execution lock enforcement
Hard spend caps and pre-flight budget checks
Full audit trail via claude_teacher_recursive_trainer
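
As a rough illustration of how an approval phrase and a hard spend cap can gate a run before it starts, the sketch below checks both and refuses to proceed otherwise. It is an assumption-laden example, not the code in claude_teacher_recursive_trainer: `RunBudget`, `PreflightError`, and `preflight_check` are hypothetical names, and the execution lock and audit trail are not modeled here.

```python
from dataclasses import dataclass

# Approval phrase as named in this card.
APPROVAL_PHRASE = "I_APPROVE_CLAUDE_4_8_DISTILLATION"


@dataclass
class RunBudget:
    """Hypothetical budget record; field names are illustrative only."""
    hard_cap_usd: float
    spent_usd: float = 0.0


class PreflightError(RuntimeError):
    """Raised when a run must not be started."""


def preflight_check(
    approval: str, budget: RunBudget, estimated_cost_usd: float
) -> float:
    """Return the budget left after the planned run, or raise PreflightError."""
    if approval != APPROVAL_PHRASE:
        raise PreflightError("approval phrase missing or incorrect")
    if estimated_cost_usd < 0:
        raise PreflightError("estimated cost must be non-negative")

    remaining = budget.hard_cap_usd - budget.spent_usd - estimated_cost_usd
    if remaining < 0:
        raise PreflightError("estimated cost exceeds the hard spend cap")
    return remaining


# Example: 100 USD cap, 40 USD already spent, 25 USD planned.
remaining = preflight_check(
    APPROVAL_PHRASE, RunBudget(hard_cap_usd=100.0, spent_usd=40.0), 25.0
)
```

Failing closed (raising rather than warning) is the design choice that makes the cap "hard": a launcher that calls this check first cannot start a run that would exceed the cap.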

Original training run: cyberviser/opus-4.8-recreation-1b-light. Republished baseline: GLASSEYE/opus-4.8-recreation-1b-light.

Important: This is an independent, open research reconstruction. It is not affiliated with, endorsed by, or sponsored by Anthropic.

Limitations

1B-scale student model (significantly smaller than the teacher)
Trained in aggressive "light" mode with reduced sequence length and loop depth
No distillation was used in this specific run
Capabilities are intentionally bounded by the above constraints

Files

File	Purpose
`config.json`	MythosConfig
`model.safetensors`	Weights (recommended)
`pytorch_model.bin`	Weights (torch.save state_dict)
`build_metadata.json`	Run provenance
