SYNAXIM

Magnus — SYNAXIM .symb Format (INT4)

Powered by SYNAXIM — Symbiotic Native Axiom Inference Machine
Framework-Free LLM Inference | Attention ≡ Memory | O(1) State


The first production model converted to the SYNAXIM proprietary .symb inference format.

This is Magnus — a custom fine-tuned Mistral-7B-Instruct-v0.3 with axim-alignment and tool-calling support — converted to SYNAXIM's framework-free .symb binary format with INT4 per-group quantization. It runs entirely through the SYNAXIM Symbiotic State Engine — no PyTorch, no Transformers library, no KV-Cache.

Quick Start

1. Install SYNAXIM

pip install git+https://github.com/GRRN-MAKER/SYNAXIM.git

2. Download This Model

pip install huggingface-hub
huggingface-cli download GRRNMAKE/Magnus-SYMB --local-dir ./magnus-symb

3. Run Inference

from grrn_inference import GRRNModel

model = GRRNModel.from_pretrained("./magnus-symb")

result = model.generate("The meaning of life is", max_tokens=50, temperature=0.7)
print(result.text)
print(f"Speed: {result.tokens_per_second} tok/s")

4. Chat (OpenAI-Style)

result = model.chat([
    {"role": "system", "content": "You are Magnus, an AI assistant by GRRNMAKER."},
    {"role": "user", "content": "What can you help me with?"}
], max_tokens=200)

print(result.choices[0].message["content"])

5. Streaming

for chunk in model.stream("Once upon a time", max_tokens=100):
    print(chunk.text, end="", flush=True)

6. Serve as OpenAI API

from grrn_inference import serve
serve("./magnus-symb", port=8000, api_key="my-secret-key")
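
Once the server is running, any HTTP client can talk to it. The sketch below only builds the request, using the Python standard library. It assumes the server exposes the usual OpenAI-compatible route /v1/chat/completions and accepts the key as a Bearer token; neither detail is confirmed by this card, and the model name "magnus-symb" and the helper build_chat_request are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, messages, max_tokens=200):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": "magnus-symb",  # hypothetical model name, not taken from the card
        "messages": messages,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",  # assumed OpenAI-compatible route
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_chat_request(
    "http://localhost:8000",
    "my-secret-key",
    [{"role": "user", "content": "What can you help me with?"}],
)

# To send it while the server from step 6 is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Building the request separately from sending it makes it easy to inspect the payload and headers before pointing the client at a live server.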

Model Details

| Property | Value |
|---|---|
| Name | Magnus |
| Base Model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tune | Axim-alignment + tool-calling |
| Architecture | MistralForCausalLM (Dense, GQA) |
| Parameters | ~7.24B |
| Hidden Size | 4096 |
| Layers | 32 |
| Attention | 32 Q / 8 KV (GQA 4:1) |
| Head Dim | 128 |
| Vocabulary | 32,768 tokens |
| Intermediate Size | 14,336 |
| Activation | SiLU |
| RoPE θ | 1,000,000 |
| Context | 32K |
| Format | .symb (SYNAXIM proprietary binary) |
| Quantization | INT4, group_size=128 |
| Compression | 3.8× vs FP16 |
| Total Size | ~3.85 GB |
| Adapter | 8-layer Symbiotic Gate adapter (trained) |
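
To make the quantization row concrete, here is a minimal NumPy sketch of INT4 per-group quantization with group_size=128. It is a generic asymmetric scheme (one FP16 scale and one FP16 minimum per group, two 4-bit codes packed per byte), not the actual .symb layout, which is proprietary and not described in this card. The function names are hypothetical.

```python
import numpy as np

GROUP_SIZE = 128  # matches the group_size listed in the table


def quantize_int4(weights):
    """Quantize a 1-D float array to packed 4-bit codes, one scale/min per group."""
    groups = weights.astype(np.float32).reshape(-1, GROUP_SIZE)
    # Store the per-group minimum and scale as FP16, then quantize against
    # the stored (rounded) values so dequantization sees the same numbers.
    w_min = groups.min(axis=1, keepdims=True).astype(np.float16)
    span = groups.max(axis=1, keepdims=True) - w_min.astype(np.float32)
    scale = (span / 15.0).astype(np.float16)
    scale[scale == 0] = 1.0  # avoid division by zero for constant groups
    codes = np.round((groups - w_min.astype(np.float32)) / scale.astype(np.float32))
    flat = np.clip(codes, 0, 15).astype(np.uint8).reshape(-1)
    packed = flat[0::2] | (flat[1::2] << 4)  # two 4-bit codes per byte
    return packed, scale, w_min


def dequantize_int4(packed, scale, w_min):
    """Unpack 4-bit codes and map them back to float32 values."""
    codes = np.empty(packed.size * 2, dtype=np.uint8)
    codes[0::2] = packed & 0x0F
    codes[1::2] = packed >> 4
    groups = codes.reshape(-1, GROUP_SIZE).astype(np.float32)
    return (groups * scale.astype(np.float32) + w_min.astype(np.float32)).reshape(-1)


rng = np.random.default_rng(0)
w = rng.standard_normal(4096 * 4).astype(np.float32)
packed, scale, w_min = quantize_int4(w)
w_hat = dequantize_int4(packed, scale, w_min)

max_err = float(np.abs(w - w_hat).max())
stored_bytes = packed.nbytes + scale.nbytes + w_min.nbytes
ratio = (w.size * 2) / stored_bytes  # FP16 bytes / stored bytes = 256 / 68
```

Under these assumptions each group of 128 weights takes 64 bytes of packed codes plus 4 bytes of metadata, so the ratio against FP16 is 256 / 68, or about 3.76×. That is in the same range as the 3.8× figure in the table, though the real format may store its metadata differently.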

Symbiotic Gate Adapter

This model includes a trained Symbiotic Gate Adapter (symbiotic_adapter/). The adapter is trained by knowledge distillation from the original attention mechanism, so that the first 8 of the 32 layers produce coherent output when run through SYNAXIM's O(1) M-matrix paradigm.

| Adapter Detail | Value |
|---|---|
| Trained Layers | 8 of 32 |
| Parameters | ~65K |
| Method | Knowledge distillation (MSE loss vs teacher) |
| Learned per layer | gate_bias, gate_scale, output_scale, mix_alpha |
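
The idea of distilling against a teacher with an MSE loss can be shown on a toy problem. The sketch below fits a single scalar, named output_scale after the adapter parameter, by gradient descent so that a scaled student signal matches a synthetic teacher signal. The data, the learning rate, and the assumption that output_scale is one scalar are all illustrative; the card does not describe the real training code or the parameter shapes.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))


rng = np.random.default_rng(0)

# Synthetic stand-ins: the "teacher" is the student signal scaled by 2.5
# plus a little noise. Real training would use actual layer outputs.
student_raw = rng.standard_normal((256, 64))
teacher_out = 2.5 * student_raw + 0.01 * rng.standard_normal((256, 64))

output_scale = 1.0  # hypothetical single learnable scalar
learning_rate = 0.1
initial_loss = mse(output_scale * student_raw, teacher_out)

for _ in range(200):
    residual = output_scale * student_raw - teacher_out
    # d/ds mean((s*x - t)^2) = 2 * mean((s*x - t) * x)
    grad = 2.0 * float(np.mean(residual * student_raw))
    output_scale -= learning_rate * grad

final_loss = mse(output_scale * student_raw, teacher_out)
```

After training, output_scale settles close to the factor used to build the teacher signal and the loss drops to roughly the noise level, which is the behaviour an MSE distillation objective is meant to produce.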

How SYNAXIM Works

SYNAXIM replaces the standard Transformer inference paradigm:

| Feature | Standard Transformer | SYNAXIM |
|---|---|---|
| Memory Model | KV-Cache (grows with context) | O(1) M matrix (fixed size) |
| Attention | softmax(Q·K^T)·V with stored K,V pairs | Sigmoid-gated associative memory |
| Runtime | PyTorch + CUDA | NumPy only (zero framework) |
| Weight Format | safetensors (open) | .symb (proprietary INT4 bitpacked) |
| Install Size | ~2 GB (PyTorch + deps) | < 5 MB |
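
The memory-model row can be illustrated with a little arithmetic and a toy state. The sketch below first computes how large an FP16 KV-Cache would be for the dimensions listed under Model Details, then shows a fixed-size, sigmoid-gated associative memory whose footprint does not change as tokens arrive. The update rule is a generic gated outer-product form chosen for illustration; SYNAXIM's actual rule is not specified in this card, and the class GatedMemory is hypothetical.

```python
import numpy as np

LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128  # from Model Details
FP16_BYTES = 2


def kv_cache_bytes(num_tokens):
    """FP16 KV-Cache size: K and V for every layer and KV head, per token."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * FP16_BYTES * num_tokens


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GatedMemory:
    """Toy fixed-size associative memory (illustrative, not SYNAXIM's rule)."""

    def __init__(self, dim):
        self.M = np.zeros((dim, dim))

    def write(self, k, v, gate_logit):
        g = sigmoid(gate_logit)  # how much of the old state to keep
        self.M = g * self.M + (1.0 - g) * np.outer(k, v)

    def read(self, q):
        return q @ self.M


per_token = kv_cache_bytes(1)        # 131,072 bytes = 128 KiB per token
full_context = kv_cache_bytes(32768)  # 4 GiB at a 32,768-token context

rng = np.random.default_rng(0)
mem = GatedMemory(HEAD_DIM)
state_bytes_before = mem.M.nbytes
for _ in range(1000):
    k, v = rng.standard_normal(HEAD_DIM), rng.standard_normal(HEAD_DIM)
    mem.write(k, v, gate_logit=2.0)
state_bytes_after = mem.M.nbytes
out = mem.read(rng.standard_normal(HEAD_DIM))
```

The KV-Cache grows linearly with the number of tokens, while the toy memory holds the same 128 × 128 matrix after 1,000 writes as it did before the first one. The trade-off is that a fixed-size state has to compress older tokens rather than store them exactly.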

Citation

@software{synaxim,
  title={SYNAXIM: Symbiotic Native Axiom Inference Machine},
  author={GRRNMAKER},
  year={2026},
  url={https://github.com/GRRN-MAKER/SYNAXIM}
}

SYNAXIM — Because inference should be a machine, not a framework.
Built by GRRNMAKER
