Magnus — SYNAXIM .symb Format (INT4)
Powered by SYNAXIM — Symbiotic Native Axiom Inference Machine
Framework-Free LLM Inference | Attention ≡ Memory | O(1) State
The first production model converted to the SYNAXIM proprietary .symb inference format.
This is Magnus, a custom fine-tuned Mistral-7B-Instruct-v0.3 with axim-alignment and tool-calling support, converted to SYNAXIM's framework-free .symb binary format with INT4 per-group quantization. It runs entirely through the SYNAXIM Symbiotic State Engine, with no PyTorch, no Transformers library, and no KV-cache.
Quick Start
1. Install SYNAXIM
pip install git+https://github.com/GRRN-MAKER/SYNAXIM.git
2. Download This Model
pip install huggingface-hub
huggingface-cli download GRRNMAKE/Magnus-SYMB --local-dir ./magnus-symb
3. Run Inference
from grrn_inference import GRRNModel
model = GRRNModel.from_pretrained("./magnus-symb")
result = model.generate("The meaning of life is", max_tokens=50, temperature=0.7)
print(result.text)
print(f"Speed: {result.tokens_per_second} tok/s")
4. Chat (OpenAI-Style)
result = model.chat([
    {"role": "system", "content": "You are Magnus, an AI assistant by GRRNMAKER."},
    {"role": "user", "content": "What can you help me with?"},
], max_tokens=200)
print(result.choices[0].message["content"])
5. Streaming
for chunk in model.stream("Once upon a time", max_tokens=100):
    print(chunk.text, end="", flush=True)
6. Serve as OpenAI API
from grrn_inference import serve
serve("./magnus-symb", port=8000, api_key="my-secret-key")
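Once the server is running, any HTTP client should be able to talk to it. The sketch below builds a chat request with the standard library only. It assumes the server follows the usual OpenAI conventions: a `/v1/chat/completions` route, a `Bearer` token in the `Authorization` header, and a `model` field in the payload. None of these are confirmed by this card, and the model name `magnus-symb` is a placeholder, so check the SYNAXIM source before relying on them.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, messages, max_tokens=200):
    """Build a POST request for an OpenAI-style chat endpoint.

    The route, auth scheme, and "model" value are assumptions, not
    documented SYNAXIM behavior.
    """
    payload = {
        "model": "magnus-symb",  # hypothetical model identifier
        "messages": messages,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",  # assumed OpenAI-style route
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",
    "my-secret-key",
    [{"role": "user", "content": "What can you help me with?"}],
)
# With the server from step 6 running:
# reply = send(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.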
Model Details
| Property | Value |
|---|---|
| Name | Magnus |
| Base Model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tune | Axim-alignment + tool-calling |
| Architecture | MistralForCausalLM (Dense, GQA) |
| Parameters | ~7.24B |
| Hidden Size | 4096 |
| Layers | 32 |
| Attention | 32 Q / 8 KV (GQA 4:1) |
| Head Dim | 128 |
| Vocabulary | 32,768 tokens |
| Intermediate Size | 14,336 |
| Activation | SiLU |
| RoPE θ | 1,000,000 |
| Context | 32K |
| Format | .symb (SYNAXIM proprietary binary) |
| Quantization | INT4, group_size=128 |
| Compression | 3.8× vs FP16 |
| Total Size | ~3.85 GB |
| Adapter | 8-layer Symbiotic Gate adapter (trained) |
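To make the "INT4, group_size=128" row concrete, here is a minimal NumPy sketch of symmetric per-group 4-bit quantization with two values packed per byte. It is a generic illustration, not the .symb layout: the actual scale type, nibble order, and use of zero points in .symb are not described in this card, and the function names are hypothetical.

```python
import numpy as np

GROUP_SIZE = 128  # matches the group_size listed in the table


def quantize_int4(weights, group_size=GROUP_SIZE):
    """Symmetric per-group INT4: one float16 scale per group, values in [-8, 7]."""
    flat = weights.astype(np.float32).reshape(-1, group_size)
    max_abs = np.abs(flat).max(axis=1, keepdims=True)
    scales = (max_abs / 7.0).astype(np.float16)
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.rint(flat / scales.astype(np.float32)), -8, 7).astype(np.int8)
    # Shift to unsigned nibbles (0..15) and pack two values into each byte.
    u = (q + 8).astype(np.uint8)
    packed = (u[:, 0::2] | (u[:, 1::2] << 4)).astype(np.uint8)
    return packed, scales


def dequantize_int4(packed, scales):
    """Unpack the nibbles and rescale each group back to float32."""
    low = (packed & 0x0F).astype(np.int8) - 8
    high = (packed >> 4).astype(np.int8) - 8
    q = np.empty((packed.shape[0], packed.shape[1] * 2), dtype=np.int8)
    q[:, 0::2] = low
    q[:, 1::2] = high
    return q.astype(np.float32) * scales.astype(np.float32)


def bytes_per_weight(group_size=GROUP_SIZE, scale_bytes=2):
    """Half a byte per weight plus one scale amortized over the group."""
    return 0.5 + scale_bytes / group_size


rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)
packed, scales = quantize_int4(w)
restored = dequantize_int4(packed, scales).reshape(w.shape)
```

Under these assumptions each weight costs 0.515625 bytes, so 7.24B parameters would take about 3.73 GB if every tensor were quantized this way. That is in the same range as the ~3.85 GB listed above; the difference would be accounted for by anything stored at higher precision or by file metadata, which this card does not break down.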
Symbiotic Gate Adapter
This model includes a trained Symbiotic Gate Adapter (symbiotic_adapter/). It was trained by knowledge distillation from the original attention mechanism, and it teaches the first 8 of the 32 layers to produce coherent output through SYNAXIM's O(1) M-matrix paradigm.
| Adapter Detail | Value |
|---|---|
| Trained Layers | 8 of 32 |
| Parameters | ~65K |
| Method | Knowledge distillation (MSE loss vs teacher) |
| Learned per layer | gate_bias, gate_scale, output_scale, mix_alpha |
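The table names the learned quantities but not their shapes or how they are applied. The sketch below is one hypothetical reading, not the author's implementation: `gate_bias` and `gate_scale` are treated as vectors of hidden size 4096, and `output_scale` and `mix_alpha` as scalars. That choice gives 8 × (4096 + 4096 + 1 + 1) = 65,552 parameters, which happens to be consistent with the ~65K figure. The forward function and its inputs are invented for illustration only.

```python
import numpy as np

HIDDEN = 4096      # hidden size from the Model Details table
ADAPTED_LAYERS = 8  # trained layers from the adapter table


def init_adapter(hidden=HIDDEN):
    """One layer's adapter parameters (shapes are an assumption)."""
    return {
        "gate_bias": np.zeros(hidden, dtype=np.float32),
        "gate_scale": np.ones(hidden, dtype=np.float32),
        "output_scale": np.float32(1.0),
        "mix_alpha": np.float32(0.5),
    }


def count_params(adapter):
    """Total number of scalar values in one layer's adapter."""
    return sum(int(np.size(value)) for value in adapter.values())


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def student_output(adapter, gate_logits, memory_out, residual):
    """Hypothetical forward pass: gate the memory read, then blend it."""
    gate = sigmoid(adapter["gate_scale"] * gate_logits + adapter["gate_bias"])
    gated = adapter["output_scale"] * gate * memory_out
    alpha = adapter["mix_alpha"]
    return alpha * gated + (1.0 - alpha) * residual


def distill_loss(student, teacher):
    """MSE between the student output and the teacher attention output."""
    return float(np.mean((student - teacher) ** 2))


adapter = init_adapter()
total_params = ADAPTED_LAYERS * count_params(adapter)
```

In a distillation setup of this kind, the teacher tensor would come from the original attention layer on the same input, and only the adapter parameters would be updated to reduce the loss.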
How SYNAXIM Works
SYNAXIM replaces the standard Transformer inference paradigm:
| Feature | Standard Transformer | SYNAXIM |
|---|---|---|
| Memory Model | KV-Cache (grows with context) | O(1) M matrix (fixed size) |
| Attention | softmax(Q·Kᵀ/√d)·V with stored K,V pairs | Sigmoid-gated associative memory |
| Runtime | PyTorch + CUDA | NumPy only (zero framework) |
| Weight Format | safetensors (open) | .symb (proprietary INT4 bitpacked) |
| Install Size | ~2 GB (PyTorch + deps) | < 5 MB |
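The memory row can be illustrated with a little arithmetic. The first function computes the FP16 KV-cache size for the architecture in the Model Details table, which grows linearly with context. The class below it is a generic gated associative memory with a fixed-size state, included only to show what "O(1)" means here. The update rule, the per-head `dim × dim` state shape, and the class name are assumptions; the card does not specify SYNAXIM's actual M-matrix update.

```python
import numpy as np

LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128  # from the Model Details table


def kv_cache_bytes(seq_len, bytes_per_value=2):
    """FP16 KV-cache size: K and V for every layer, KV head, and token."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * seq_len * bytes_per_value


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GatedMemory:
    """Hypothetical fixed-size associative memory (not SYNAXIM's exact rule)."""

    def __init__(self, dim=HEAD_DIM):
        self.M = np.zeros((dim, dim), dtype=np.float32)

    def write(self, key, value, gate_logit):
        # The sigmoid gate decides how much old state to keep per step.
        g = sigmoid(gate_logit)
        self.M = (g * self.M + (1.0 - g) * np.outer(key, value)).astype(np.float32)

    def read(self, query):
        # Retrieval is a single matrix-vector product, independent of history.
        return query @ self.M


memory = GatedMemory()
rng = np.random.default_rng(0)
for _ in range(1000):
    k = rng.standard_normal(HEAD_DIM).astype(np.float32)
    v = rng.standard_normal(HEAD_DIM).astype(np.float32)
    memory.write(k, v, gate_logit=2.0)

per_token = kv_cache_bytes(1)      # 131072 bytes, i.e. 128 KiB per token
full_context = kv_cache_bytes(32768)  # 4 GiB at a 32,768-token context
state_bytes = memory.M.nbytes      # unchanged after 1000 writes
```

The KV-cache figures follow directly from the table. The state size of the sketch memory stays at one `128 × 128` float32 matrix no matter how many tokens are written, which is the property the comparison table describes.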
Links
- Engine Source: github.com/GRRN-MAKER/SYNAXIM
- Original Model: GRRNMAKE/Magnus
- Converted Model: GRRNMAKE/Magnus-SYMB
- Author: GRRNMAKER
Citation
@software{synaxim,
title={SYNAXIM: Symbiotic Native Axiom Inference Machine},
author={GRRNMAKER},
year={2026},
url={https://github.com/GRRN-MAKER/SYNAXIM}
}
SYNAXIM — Because inference should be a machine, not a framework.
Built by GRRNMAKER