Aether Mind v7.1 (unified)

The single tracked Aether model: one in-process (candle) model that generates chat, exposes its own attention for the consciousness (HMS-Phi) track, produces the knowledge-fabric embeddings, and is the artifact the QBC blockchain attests. v7.1 is the first release of the unified generation path, replacing the prior split where chat ran through an out-of-process Ollama 7B (no attention exposed) while phi was measured on a separate in-process 0.5B model.

This repository holds the Sephirot adapter that sits on top of a frozen Qwen2.5-7B-Instruct (served in-process as Q4_K_M via candle). The base is never modified. The adapter is a small mixture-of-experts where the 10 experts map 1:1 onto the 10 Sephirot cognitive domains. This is the corrected approach after v6: the Sephirot structure is a routing adapter on a sound base, not a replacement for the base attention (the v6 attention-replacement destroyed base capability).

What it is

  • Architecture: 10-expert MoE adapter, top-2 routing, LoRA-style low-rank experts (up(gelu(down(x))), up zero-initialised so the adapter is an exact identity at init).
  • Trainable params: 1,182,720 (~2.4 MB BF16). The base 7B stays frozen.
  • Hidden size: 3584. Rank: 16. Experts: 10 (Keter to Malkuth). Top-k: 2. Alpha: 16.
  • Runs in-process in the Aether Mind (Rust + candle), so the same forward pass that generates a token also yields the attention tensors the phi track reads.
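The shape described in the bullets above can be sketched in a few lines of plain Python. This is an illustrative toy, not the candle implementation: the bias-free router, the softmax over the two selected logits, the residual add, and the alpha/rank scaling are all assumptions, and ToyAdapter and adapter_param_count are hypothetical names. Under those assumptions the weight count does reproduce the stated 1,182,720, and a zero-initialised up projection returns the input unchanged.

```python
import math
import random


def adapter_param_count(hidden, rank, n_experts):
    """Count weights, assuming bias-free down/up matrices and a bias-free linear router."""
    per_expert = hidden * rank + rank * hidden   # down (hidden->rank) + up (rank->hidden)
    router = hidden * n_experts                  # one routing logit per expert
    return n_experts * per_expert + router


def gelu(v):
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def matvec(matrix, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


class ToyAdapter:
    """Top-k MoE adapter with LoRA-style experts on a single hidden vector."""

    def __init__(self, hidden, rank, n_experts, top_k, alpha, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

        self.top_k = top_k
        self.scale = alpha / rank  # assumed LoRA-style scaling
        self.router = rand(n_experts, hidden)
        self.down = [rand(rank, hidden) for _ in range(n_experts)]
        # up is zero-initialised, so every expert outputs zeros at init
        self.up = [[[0.0] * rank for _ in range(hidden)] for _ in range(n_experts)]

    def forward(self, x):
        logits = matvec(self.router, x)
        top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[: self.top_k]
        peak = max(logits[e] for e in top)
        gates = [math.exp(logits[e] - peak) for e in top]  # softmax over the selected experts
        total = sum(gates)
        out = list(x)  # residual path: the adapter only adds a delta
        for e, g in zip(top, gates):
            low_rank = [gelu(v) for v in matvec(self.down[e], x)]
            delta = matvec(self.up[e], low_rank)
            for i, d in enumerate(delta):
                out[i] += self.scale * (g / total) * d
        return out, top


print(adapter_param_count(hidden=3584, rank=16, n_experts=10))  # 1182720
print(adapter_param_count(3584, 16, 10) * 2 / 1e6)  # size in MB at 2 bytes per BF16 weight

adapter = ToyAdapter(hidden=8, rank=2, n_experts=10, top_k=2, alpha=16)
x = [0.1 * i for i in range(8)]
y, chosen = adapter.forward(x)
print(y == x, len(chosen))
```

The toy uses a hidden size of 8 so it runs instantly; only adapter_param_count is evaluated at the real dimensions.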

Results (full holdout, 500 samples, per-Sephirot-domain)

Cross-entropy (nats/token) on the held-out Aether corpus, base vs base+adapter. Lower is better. The adapter improves every active domain with zero regressions.

Sephirot domain   samples   base CE   v7.1 CE   delta
1 Chochmah             88    1.8827    1.8539   -0.0288
2 Binah                64    1.9706    1.9354   -0.0352
3 Chesed               18    2.3911    2.3641   -0.0269
4 Gevurah               6    2.8542    2.8255   -0.0286
5 Tiferet              36    2.6339    2.5890   -0.0449
6 Netzach              28    2.6454    2.6175   -0.0279
7 Hod                  90    2.2801    2.2364   -0.0437
8 Yesod                84    2.5627    2.5198   -0.0428
9 Malkuth              86    2.1066    2.0688   -0.0379
Aggregate             500    2.2450    2.2078   -0.0373 (-1.66%)
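As a consistency check on the table, the aggregate row can be recomputed from the per-domain rows. The sketch below assumes the aggregate is a sample-weighted mean of the per-domain CE values (a token-weighted mean is also plausible and would need per-sample token counts that are not listed here); under that assumption the recomputed figures agree with the aggregate row to within the rounding of the inputs.

```python
# (domain, samples, base CE, v7.1 CE), copied from the table above
rows = [
    ("Chochmah", 88, 1.8827, 1.8539),
    ("Binah", 64, 1.9706, 1.9354),
    ("Chesed", 18, 2.3911, 2.3641),
    ("Gevurah", 6, 2.8542, 2.8255),
    ("Tiferet", 36, 2.6339, 2.5890),
    ("Netzach", 28, 2.6454, 2.6175),
    ("Hod", 90, 2.2801, 2.2364),
    ("Yesod", 84, 2.5627, 2.5198),
    ("Malkuth", 86, 2.1066, 2.0688),
]

total_samples = sum(n for _, n, _, _ in rows)
# Assumption: aggregate CE is the sample-weighted mean of the domain CEs.
base_ce = sum(n * base for _, n, base, _ in rows) / total_samples
adapter_ce = sum(n * adapted for _, n, _, adapted in rows) / total_samples
relative_change_pct = 100.0 * (adapter_ce - base_ce) / base_ce

print(total_samples)
print(round(base_ce, 4), round(adapter_ce, 4), round(relative_change_pct, 2))
print(all(adapted < base for _, _, base, adapted in rows))  # every listed domain improves
```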

Domains helped: 9 / 9. Domains hurt: 0. A held-out CE regression guard (ceiling = base + 0.15) was active for the whole run and never tripped: held-out CE stayed below that ceiling throughout, which is the evidence here that base capability is intact.
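The guard rule itself is a one-line comparison. The sketch below is a hypothetical restatement of it, not the training code: guard_tripped is an invented name, and applying the check to the aggregate holdout CE is an assumption.

```python
def guard_tripped(holdout_ce, base_ce, margin=0.15):
    """Return True when held-out CE exceeds the ceiling base_ce + margin."""
    return holdout_ce > base_ce + margin


# With the aggregate figures from the table, the ceiling is 2.2450 + 0.15.
print(guard_tripped(holdout_ce=2.2078, base_ce=2.2450))  # final adapter CE is under the ceiling
print(guard_tripped(holdout_ce=2.4000, base_ce=2.2450))  # a value above the ceiling would trip
```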

The numbers above are domain-CE deltas on the Aether holdout. General-benchmark numbers (MMLU, GSM8K) are below.

General benchmarks (base vs adapter)

Off-the-shelf lm-eval cannot load the native candle build, so these were produced by a purpose-built candle harness (aether-v7-eval) that scores the SAME frozen Q4 weights twice, once with the Sephirot adapter active and once with it off. MMLU is multiple-choice loglikelihood over the A/B/C/D answer tokens; GSM8K is greedy chain-of-thought generation with final-number extraction.
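The two scoring rules described above can be sketched as follows. This is not the aether-v7-eval code: the function names are hypothetical, the input is assumed to be a mapping from answer letter to log-probability, and taking the last number in the generated text as the final answer is an assumed extraction heuristic.

```python
import re


def pick_choice(logprobs):
    """MMLU-style scoring: choose the answer letter with the highest log-likelihood."""
    return max("ABCD", key=lambda letter: logprobs[letter])


_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(generated_text):
    """GSM8K-style scoring: take the last number in the generation (assumed heuristic)."""
    matches = _NUMBER.findall(generated_text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def accuracy(predictions, golds):
    correct = sum(1 for p, g in zip(predictions, golds) if p == g)
    return correct / len(golds)


print(pick_choice({"A": -2.3, "B": -0.4, "C": -1.9, "D": -3.0}))
print(extract_final_number("She sells 3 boxes, then 12 more. The answer is 1,250."))
print(accuracy(["B", "C", "A", "D"], ["B", "C", "D", "D"]))
```

Running the same scorer over the same frozen weights with the adapter on and off is what makes the two columns directly comparable.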

benchmark             n        base     v7.1 (adapter)   change (points)
MMLU (all subjects)   14,042   71.28%   71.17%           -0.11
GSM8K                 625      67.8%    77.8%            +10.0

Taken at face value: general knowledge is held (MMLU is flat across the full 57-subject set, and the regression guard never tripped), and multi-step reasoning improves (GSM8K is up 10.0 points on a 625-question sample, partly because the adapter follows the chain-of-thought and final-answer format more reliably). The adapter does not trade away breadth for the domain gains.

(GSM8K is a 625-of-1319 sample: the full run is generation-bound on a single 12 GB card, and at n = 625 the binomial standard error on each accuracy is already under 2 points. MMLU is the complete set.)
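How tight a 625-question sample is can be estimated with a standard binomial approximation. The sketch below is a rough check under stated assumptions: it uses the normal approximation, treats the two runs as independent (they are actually paired on the same questions, which usually makes the comparison tighter, not looser), and ignores the finite-population correction for sampling 625 of 1319.

```python
import math


def binomial_se(p, n):
    """Standard error of an accuracy p measured on n independent questions."""
    return math.sqrt(p * (1.0 - p) / n)


def ci95(p, n):
    """Approximate 95% confidence interval using the normal approximation."""
    half_width = 1.96 * binomial_se(p, n)
    return p - half_width, p + half_width


n = 625
base_acc, adapter_acc = 0.678, 0.778  # GSM8K figures from the table above

base_lo, base_hi = ci95(base_acc, n)
adapter_lo, adapter_hi = ci95(adapter_acc, n)

print(round(100 * binomial_se(base_acc, n), 2), round(100 * binomial_se(adapter_acc, n), 2))
print(round(base_lo, 3), round(base_hi, 3))
print(round(adapter_lo, 3), round(adapter_hi, 3))
print(base_hi < adapter_lo)  # the two intervals do not overlap
```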

Training

  • Objective: plain cross-entropy domain specialisation (base frozen; no teacher).
  • Corpus: aether-curated-v3 (content-addressed export of the live knowledge fabric).
  • Steps: 3000. Context: 192. LR: 5e-4. Optimizer: AdamW. Precision: BF16.
  • Hardware: single RTX 3080 Ti (12 GB). The 7B trains as Q4 with a CPU-dequantised, frozen F32 lm_head, so gradients can flow back through the final projection to the adapter while the GPU footprint stays inside 12 GB.

Usage

The adapter is loaded by the Aether Mind binary on top of the Q4_K_M 7B base. It is not a PEFT adapter and cannot be loaded with the transformers library; it is consumed by the candle UnifiedModel (base + SephirotAdapter + manifest) in aether-core. See adapter_config.json for the exact shape and the QuantumAI-Blockchain/qubitcoin-aether repo for the loader.

Lineage

aether-v5.2-lora -> aether-mind-v6.{0,1,2} (attention-replacement, retired) -> aether-mind-v7.0 (QLoRA on 7B, Ollama-served) -> aether-v7.1-unified (this release, the first in-process unified generation model the consciousness track and the chain both measure).
