Supra-1.6-50M-Instruct-Ultra-exp by LogicvexAI

Experimental Chat Tune • Ultra • 50M Parameters • 5K Context

Supra-1.6-50M-Instruct-Ultra-exp is an experimental 50-million-parameter instruction-tuned language model. Built on top of SupraLabs/Supra-1.5-50M-Base-exp, it aims to push what ultra-small language models (SLMs) can do at this scale without changing the base architecture.

By using curated conversational datasets calibrated to the model's size, together with heavily regularized, orthogonal low-rank parameter updates, this tune aims to deliver cleaner grammatical structure, more stable ChatML turn-taking, and improved factual reasoning. The result is still a standard merged checkpoint that needs no adapter at load time and is ready for edge deployment.

🛑 INDEPENDENT RELEASE & NON-AFFILIATION DISCLAIMER: This model was independently fine-tuned, optimized, and published by MultivexAI / LogicvexAI. We are not affiliated with, endorsed by, or members of SupraLabs. We are independent researchers and huge fans of their open-source pre-training work!

⚠️ QUICK WARNING: This is an experimental model. It can make mistakes, hallucinate facts, or produce inaccurate details. Do not use this model in production or decision-making environments!

Architecture

The model keeps the original Supra-1.5-Base parameter structure and tokenizer. All custom training adapters have been merged back into the base weights, so inference runs on a standard checkpoint with no added adapter latency.
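To illustrate why a merged adapter adds no inference cost, here is a minimal sketch that assumes a LoRA-style update of the form W + (alpha / r) * B @ A. The card only says the updates are low-rank; the exact adapter type, rank, and scaling are assumptions, and the matrices below are toy values.

```python
# Sketch of merging a LoRA-style low-rank update into a base weight.
# Assumption: the adapter has the form delta_W = (alpha / r) * B @ A; the
# card does not state the exact adapter type, rank, or scaling.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_adapter(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A as a new matrix."""
    scale = alpha / r
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Toy shapes: W is 2x3 (out x in), rank r = 1, so B is 2x1 and A is 1x3.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.0]]
B = [[1.0],
     [2.0]]
ALPHA, R = 2.0, 1

W_merged = merge_adapter(W, A, B, ALPHA, R)

# A column-vector input x (3x1).
x = [[1.0], [2.0], [3.0]]

# Unmerged path: base output plus the scaled adapter output.
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
unmerged = [[bo[0] + (ALPHA / R) * ao[0]] for bo, ao in zip(base_out, adapter_out)]

# Merged path: a single matrix multiply with the same result.
merged = matmul(W_merged, x)
print(merged, unmerged)
```

Both paths produce the same output, but the merged path needs only one matrix multiply per layer, which is why a merged checkpoint behaves like an ordinary model at inference time.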

Specification	Value
Architecture	`LlamaForCausalLM`
Parameters	~50M
Vocabulary Size	32,000
Hidden Size	512
Layers	12
Attention Heads	8
KV Heads	4
Context Length	5,120 tokens
Tokenizer	Original Supra byte-level BPE tokenizer (Formatted with ChatML)
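As a rough sanity check on the ~50M figure, the sketch below counts parameters for a Llama-style decoder using the values from the table above. The MLP intermediate size and whether the input and output embeddings are tied are not stated in this card, so both are hypothetical inputs here; only the embedding and attention terms follow directly from the table.

```python
# Rough parameter count for a Llama-style decoder from the table above.
# Assumptions (not stated in the card): intermediate_size and whether the
# input embedding and output head share weights.

def llama_param_count(vocab, hidden, layers, heads, kv_heads,
                      intermediate_size, tie_embeddings):
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim
    embed = vocab * hidden
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim
    # gate_proj, up_proj, and down_proj in the gated MLP.
    mlp = 3 * hidden * intermediate_size
    # Two RMSNorm weight vectors per layer, plus one final norm.
    norms = layers * 2 * hidden + hidden
    head = 0 if tie_embeddings else vocab * hidden
    return {
        "embedding": embed,
        "attention_per_layer": attn,
        "mlp_per_layer": mlp,
        "total": embed + layers * (attn + mlp) + norms + head,
    }

counts = llama_param_count(
    vocab=32_000, hidden=512, layers=12, heads=8, kv_heads=4,
    intermediate_size=1_344,   # hypothetical value, chosen only for illustration
    tie_embeddings=True,       # hypothetical
)
print(counts)
```

With 8 attention heads and 4 KV heads, the key and value projections are half the width of the query projection, which is where grouped-query attention saves parameters. The embedding matrix alone accounts for about 16.4M parameters, roughly a third of the model.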

Model Benchmarks

We evaluated the model on the official test splits of six standard datasets using log-likelihood scoring in lm-evaluation-harness.

Below is the verified comparison across both raw accuracy (acc) and length-normalized accuracy (acc_norm) against the v1.5 baseline.

Benchmark Task	Metric	Supra-1.5-Instruct (Baseline)*	Supra-1.6-Instruct-Ultra-exp (v1.6)	Delta (percentage points)
SciQ	`acc`	`60.90%`	`72.70%`	`+11.80`
SciQ	`acc_norm`	`57.40%`	`66.00%`	`+8.60`
PIQA	`acc`	`59.60%`	`60.61%`	`+1.01`
PIQA	`acc_norm`	`59.30%`	`59.41%`	`+0.11`
HellaSwag	`acc`	`27.90%`	`28.11%`	`+0.21`
HellaSwag	`acc_norm`	`29.30%`	`29.66%`	`+0.36`
OpenBookQA	`acc`	`17.80%`	`17.40%`	`-0.40`
OpenBookQA	`acc_norm`	`26.60%`	`27.20%`	`+0.60`
ARC-Easy	`acc`	`45.90%`	`46.76%`	`+0.86`
ARC-Easy	`acc_norm`	`44.10%`	`43.18%`	`-0.92`
ARC-Challenge	`acc`	`22.90%`	`22.35%`	`-0.55`
ARC-Challenge	`acc_norm`	`25.90%`	`25.60%`	`-0.30`

*Supra-1.5-Instruct (Baseline) metrics are sourced directly from the official SupraLabs/Supra-1.5-50M-Instruct-exp repository.

💡 Benchmarking Notes & Parity Disclaimer: These scores were measured using standard benchmarking configurations. Although the evaluation harness is standardized, results can vary slightly with the local system setup, exact prompt formatting, and tokenizer defaults. They are shared to give a relative comparison between the two model iterations under the same benchmarking conditions.
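For readers unfamiliar with the two metrics in the table above, the following sketch shows how a multiple-choice answer can be picked from per-choice log-likelihoods. It assumes that `acc` takes the choice with the highest summed log-likelihood and that `acc_norm` first divides each score by the character length of the candidate answer. The helper name and the log-likelihood values are made up for illustration.

```python
# Sketch of multiple-choice scoring from per-choice log-likelihoods.
# Assumption: acc_norm divides each log-likelihood by the character length
# of the candidate answer; the log-likelihood values below are made up.

def pick_answer(choices, loglikelihoods, normalize):
    """Return the index of the best-scoring choice."""
    scores = []
    for text, ll in zip(choices, loglikelihoods):
        scores.append(ll / len(text) if normalize else ll)
    return max(range(len(choices)), key=lambda i: scores[i])

choices = ["ice", "a large body of frozen water"]
loglikelihoods = [-4.0, -9.0]   # hypothetical summed token log-probs

acc_pick = pick_answer(choices, loglikelihoods, normalize=False)
acc_norm_pick = pick_answer(choices, loglikelihoods, normalize=True)
print(acc_pick, acc_norm_pick)
```

In this toy case the raw score favors the short answer, while the length-normalized score favors the longer one. This is why `acc` and `acc_norm` can move in opposite directions on the same benchmark, as they do for OpenBookQA and ARC-Easy.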

Intended Use & Chat Template

This model is intended for experimental research, lightweight conversational prototyping, and low-latency edge deployment. It expects prompts in standard ChatML syntax:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
[Your Question]<|im_end|>
<|im_start|>assistant
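The template above can be assembled programmatically. Below is a minimal sketch in plain Python; `format_chatml` is a hypothetical helper written for this example, not a function from any library, and the user question is a placeholder.

```python
# Minimal ChatML prompt builder matching the layout shown above.
# format_chatml is a hypothetical helper, not part of any library.

def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the boiling point of water?"},
])
print(prompt)
```

If the tokenizer shipped with the model includes a chat template, using that template through the tokenizer is likely the safer choice, since it should match the training format exactly.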

Safety & Limitation Disclaimer

1. Factual Inaccuracy & Hallucinations

At approximately 50 million parameters, this model operates under severe capacity limits. It cannot hold a reliable or accurate encyclopedic knowledge base of world facts, history, science, or advanced mathematics. Consequently, it is highly prone to producing confident-sounding but completely fabricated, incorrect, or nonsensical details (hallucinations).

2. Bias, Toxicity, & Sensitive Content

This model was trained on synthetic conversational mixtures and open-source instruction corpora. It may inadvertently mirror, amplify, or generate biased, stereotypical, offensive, or otherwise sensitive content. Users must implement external guardrails and content filters if deploying this model in interactive environments.

3. Non-Production Use Warning

This repository is strictly an experimental research release. Under no circumstances should this model be deployed in production, commercial, clinical, legal, or safety-critical applications where incorrect, biased, or hallucinated outputs could result in physical, financial, emotional, or operational harm.

4. Code & Symbolic Computation Limits

While the model can structure basic code snippets, it lacks the deep state-tracking needed to generate valid, compilable, or secure code, or to perform accurate mathematical calculations. Do not execute any code generated by this model without thorough, expert human validation.

By downloading, utilizing, or fine-tuning this model, you acknowledge these severe architectural limitations and agree to use the weights responsibly and entirely at your own risk.

Acknowledgements & Special Thanks

This release is built upon the foundation laid by the SupraLabs team. We are grateful for their work in pre-training and releasing the base model.

Base Model: Supra-1.5-50M-Base-exp
Training Scale: 3.0B+ diverse continued-pretraining tokens
SupraLabs Contributors:
- @AxionLab-official, @LH-Tech-AI, @LyJonathan
- @MMorgan-ML, @QyrouNnet-AI, @Jamessl, @User01110

MultivexAI is proud to contribute this experimental tune to the open-source community as a continuation of their foundational work.

Feedback and Support

Feedback and community contributions are very welcome. Please open an issue or discussion on the repository if you want to share your evaluations, fine-tunes, or GGUF conversions!