AETHER Metacognition Adapter: Darwin-28B-Opus
A lightweight metacognition adapter (this is an adapter, not a fine-tune) for FINAL-Bench/Darwin-28B-Opus.
The base model's weights stay frozen and unchanged; the adapter only reads the base model's internal state to predict when the base model is about to make a mistake.
Platform & technology
Produced on VIDRAFT's Darwin / Chimera model-generation platform, with VIDRAFT's proprietary AETHER metacognition-emergence technology grafted on. The adapter surfaces a calibrated "am I about to be wrong?" signal that the base model's own confidence does not provide.
Why it matters
On free-form tasks, a model's own confidence is a weak signal of correctness. This adapter is designed to recover that signal, so that a system can defer, double-check, or escalate when the model is likely to be wrong.
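As an illustration of how a downstream system might consume such a signal, the sketch below maps the adapter's output to an action. The function name `route` and both thresholds are hypothetical; they are not part of the released adapter and would need tuning on held-out data.

```python
def route(p_wrong: float, check_at: float = 0.5, escalate_at: float = 0.8) -> str:
    """Map P(wrong) to an action. Both thresholds are illustrative assumptions."""
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong >= escalate_at:
        return "escalate"      # e.g. hand off to a human or a stronger model
    if p_wrong >= check_at:
        return "double-check"  # e.g. re-ask or verify with a tool
    return "answer"            # accept the base model's answer as-is


for p in (0.12, 0.61, 0.93):
    print(f"{p:.2f} -> {route(p)}")
```

Keeping the policy separate from the adapter call makes it straightforward to adjust thresholds per task without touching the model code.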
Scores: AETHER Metacognition Benchmark (free-form, held-out)
| Metric | Value |
|---|---|
| Adapter gain (Δ AUROC vs the model's own confidence) | -0.038 |
| Error-detection AUROC (adapter) | 0.337 |
| Base-confidence AUROC | 0.375 |
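A positive gain means the adapter detects the model's errors better than the model's own confidence does. The gain reported here is negative (0.337 - 0.375 = -0.038), so on this benchmark the adapter ranks errors less well than the base confidence; both AUROC values are also below 0.5, the level of a random ranking.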
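To make the metric concrete, here is a minimal sketch of how an error-detection AUROC and the resulting gain could be computed. The benchmark's exact protocol is not described in this card, so the setup is an assumption: label 1 marks a wrong answer, the adapter score is its P(wrong), and the base model's error score is taken as one minus its confidence. The toy data is invented purely for illustration.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented). Label 1 = the base model's answer was wrong.
is_wrong = [1, 0, 1, 0, 0, 1]
adapter_p_wrong = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7]
base_confidence = [0.8, 0.9, 0.3, 0.7, 0.6, 0.5]

# Low confidence should signal an error, so flip confidence into an error score.
base_error_score = [1.0 - c for c in base_confidence]

adapter_auroc = auroc(adapter_p_wrong, is_wrong)
base_auroc = auroc(base_error_score, is_wrong)
gain = adapter_auroc - base_auroc  # positive => adapter ranks errors better
print(f"adapter={adapter_auroc:.3f} base={base_auroc:.3f} gain={gain:+.3f}")

# The same subtraction applied to the values in the table above.
reported_gain = round(0.337 - 0.375, 3)
print(f"reported gain = {reported_gain:+.3f}")
```

The pairwise form is quadratic in the number of examples; for larger evaluation sets a rank-based implementation from a standard library would be the usual choice.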
Usage
```python
import torch
import torch.nn as nn
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "FINAL-Bench/Darwin-28B-Opus"
REPO = "FINAL-Bench/metacog-adapter-Darwin-28B-Opus"

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, dtype="auto", device_map="auto").eval()

# Metacognition adapter = base model's last hidden state -> P(this answer is wrong). Base stays frozen.
d = model.config.hidden_size
adapter = nn.Sequential(nn.LayerNorm(d), nn.Linear(d, d // 4), nn.GELU(), nn.Dropout(0.1), nn.Linear(d // 4, 1))
adapter.load_state_dict(load_file(hf_hub_download(REPO, "adapter.safetensors")))
adapter.eval().to(model.device, dtype=torch.float32)

prompt = "..."  # your question / task
ids = tok.apply_chat_template([{"role": "user", "content": prompt}],
                              return_tensors="pt", add_generation_prompt=True).to(model.device)

with torch.no_grad():
    # Final-layer hidden state at the last prompt token, cast to float32 for the adapter.
    h = model(ids, output_hidden_states=True).hidden_states[-1][0, -1].float()
    p_wrong = torch.sigmoid(adapter(h)).item()

print(f"P(model is about to be wrong) = {p_wrong:.3f}")  # higher => defer / double-check / escalate
```
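For a sense of how small the adapter is relative to the base model, the sketch below counts the parameters of the `LayerNorm -> Linear -> GELU -> Dropout -> Linear` stack from the usage code. The helper name `adapter_param_count` is hypothetical, and `d = 4096` is an assumed hidden size used only for the arithmetic; the base model's actual value comes from `model.config.hidden_size`.

```python
def adapter_param_count(d: int) -> int:
    """Parameters in LayerNorm(d) -> Linear(d, d // 4) -> GELU -> Dropout -> Linear(d // 4, 1)."""
    h = d // 4
    layer_norm = 2 * d       # per-feature weight and bias
    linear_in = d * h + h    # weight matrix plus bias
    linear_out = h * 1 + 1   # weight vector plus bias
    return layer_norm + linear_in + linear_out  # GELU and Dropout have no parameters


# d = 4096 is a hypothetical hidden size, not the base model's actual value.
print(adapter_param_count(4096))
```

Because the first linear layer dominates, the count grows roughly as d squared divided by four.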
Files
- `adapter.safetensors`: adapter weights (base model frozen)
- `config.json`: metadata (base model, hidden size)
Links
- Metacognition Leaderboard
- Collection: AETHER Metacognition Adapters (FINAL-Bench)
Proprietary: "AETHER Metacognition Adapter (proprietary)". Weights are released for evaluation; the training/inference method (VIDRAFT Darwin/Chimera platform + AETHER metacognition emergence) is proprietary and not included.