AETHER Metacognition Adapter: Darwin-28B-Opus

A lightweight metacognition adapter (this is an adapter, not a fine-tune) for FINAL-Bench/Darwin-28B-Opus. The base model's weights stay frozen and unchanged; the adapter only reads the base model's internal state to predict when the base model is about to make a mistake.

Platform & technology

Produced on VIDRAFT's Darwin / Chimera model-generation platform, with VIDRAFT's proprietary AETHER metacognition-emergence technology grafted on. The adapter surfaces a calibrated "am I about to be wrong?" signal that the base model's own confidence does not provide.

Why it matters

On free-form tasks a model's own confidence is a weak signal of correctness. This adapter recovers that signal, so a system can defer, double-check, or escalate exactly when the model is likely wrong.

Scores: AETHER Metacognition Benchmark (free-form, held-out)

Metric                                                  Value
Adapter gain (Δ AUROC vs the model's own confidence)    -0.038
Error-detection AUROC (adapter)                         0.337
Base-confidence AUROC                                   0.375

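A positive gain means this adapter detects the model's errors better than the model's own confidence. The gain reported above is negative (0.337 - 0.375 = -0.038), so on this benchmark the adapter ranks the model's errors worse than the base model's own confidence does.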
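
To make the gain metric concrete, here is a minimal sketch of how a Δ AUROC of this kind could be computed. It is not the benchmark's actual evaluation code, which is not included in this repository. The function names `auroc` and `adapter_gain` are hypothetical, the toy data is invented for illustration, and scoring the base model by `1 - confidence` is an assumption about how base confidence is turned into an error score.

```python
def auroc(scores, labels):
    """Pairwise AUROC: chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def adapter_gain(p_wrong, base_confidence, is_wrong):
    """Return (gain, adapter AUROC, base AUROC) for error detection.

    The positive class is "the answer is wrong" (is_wrong == 1).
    """
    adapter_auroc = auroc(p_wrong, is_wrong)
    # Assumption: low base confidence should flag an error, so score with 1 - confidence.
    base_auroc = auroc([1.0 - c for c in base_confidence], is_wrong)
    return adapter_auroc - base_auroc, adapter_auroc, base_auroc


# Toy data, invented for illustration only.
is_wrong = [1, 0, 1, 0]
p_wrong = [0.9, 0.2, 0.4, 0.6]
base_confidence = [0.3, 0.8, 0.9, 0.6]

gain, adapter_auroc, base_auroc = adapter_gain(p_wrong, base_confidence, is_wrong)
print(f"adapter={adapter_auroc:.3f} base={base_auroc:.3f} gain={gain:+.3f}")
```

An AUROC of 0.5 corresponds to random ranking, so the sign of the gain only says which of the two scores ranks errors better, not whether either one is useful on its own.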

Usage

import torch, torch.nn as nn
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "FINAL-Bench/Darwin-28B-Opus"
REPO = "FINAL-Bench/metacog-adapter-Darwin-28B-Opus"

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, dtype="auto", device_map="auto").eval()

# Metacognition adapter = base model's last hidden state -> P(this answer is wrong). Base stays frozen.
d = model.config.hidden_size
adapter = nn.Sequential(nn.LayerNorm(d), nn.Linear(d, d // 4), nn.GELU(), nn.Dropout(0.1), nn.Linear(d // 4, 1))
adapter.load_state_dict(load_file(hf_hub_download(REPO, "adapter.safetensors")))
adapter.eval().to(model.device, dtype=torch.float32)

prompt = "..."  # your question / task
ids = tok.apply_chat_template([{"role": "user", "content": prompt}],
                              return_tensors="pt", add_generation_prompt=True).to(model.device)
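with torch.no_grad():
    # Final-layer hidden state at the last prompt token, cast to float32 to match the adapter.
    h = model(ids, output_hidden_states=True).hidden_states[-1][0, -1].float()
    # With device_map="auto" the last layer may sit on a different device than model.device.
    p_wrong = torch.sigmoid(adapter(h.to(model.device))).item()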
print(f"P(model is about to be wrong) = {p_wrong:.3f}")   # higher => defer / double-check / escalate
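
The comment on the last line suggests acting on a high score. As a minimal sketch of what that could look like, the snippet below maps a probability such as `p_wrong` to one of three actions. The `Action` enum, the `route` function, and the threshold values 0.5 and 0.8 are all hypothetical and not part of this repository; in practice the thresholds would need to be chosen on held-out data for the target task.

```python
from enum import Enum


class Action(Enum):
    ANSWER = "answer"              # return the model's answer as-is
    DOUBLE_CHECK = "double-check"  # e.g. re-ask or verify with a second pass
    ESCALATE = "escalate"          # e.g. hand off to a human or a stronger system


def route(p_wrong, check_at=0.5, escalate_at=0.8):
    """Map a predicted error probability to an action (thresholds are placeholders)."""
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if check_at > escalate_at:
        raise ValueError("check_at must not exceed escalate_at")
    if p_wrong >= escalate_at:
        return Action.ESCALATE
    if p_wrong >= check_at:
        return Action.DOUBLE_CHECK
    return Action.ANSWER


for p in (0.10, 0.65, 0.95):
    print(f"p_wrong={p:.2f} -> {route(p).value}")
```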

Files

  • adapter.safetensors: adapter weights (base model frozen)
  • config.json: metadata (base model, hidden size)

Proprietary: "AETHER Metacognition Adapter (proprietary)". Weights released for evaluation; the training/inference method (VIDRAFT Darwin/Chimera platform + AETHER metacognition emergence) is proprietary and not included.
