Supra-1.6-50M-Instruct-Ultra-exp by LogicvexAI
Experimental Chat Tune • Ultra • 50M Parameters • 5K Context
Supra-1.6-50M-Instruct-Ultra-exp is an experimental, highly optimized 50-million-parameter instruction-tuned language model. Built on top of the excellent SupraLabs/Supra-1.5-50M-Base-exp, this model aims to push the architectural limits of ultra-small language models (SLMs).
By combining size-calibrated conversational datasets with heavily regularized, orthogonal low-rank parameter updates, this tune aims to deliver cleaner grammatical structure, more stable ChatML turn-taking, and improved factual reasoning, while remaining deployable on edge devices as a single set of merged weights.
🛑 INDEPENDENT RELEASE & NON-AFFILIATION DISCLAIMER: This model was independently fine-tuned, optimized, and published by MultivexAI / LogicvexAI. We are not affiliated with, endorsed by, or members of SupraLabs. We are independent researchers and huge fans of their open-source pre-training work!
⚠️ QUICK WARNING: This is an experimental model. It can make mistakes, hallucinate facts, or produce inaccurate details. Do not use this model in production or decision-making environments!
Architecture
The model keeps the original Supra-1.5-Base parameter structure and tokenizer. All custom training adapters have been merged back into the base weights, so inference runs on plain weights with no adapter overhead.
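To illustrate what merging an adapter means, here is a minimal sketch assuming a standard LoRA-style update, `W_merged = W + (alpha / r) * B @ A`. The exact adapter method, rank, and scaling used for this model are not stated in this card, so the function names, matrices, and the `alpha / r` scaling below are hypothetical toy values.

```python
# Toy LoRA-style merge in pure Python. All values are illustrative assumptions,
# not the actual adapter configuration of this model.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b, element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def merge_lora(w, a, b, alpha, r):
    """Fold the low-rank update B @ A into the base weight W."""
    return add(w, matmul(b, a), scale=alpha / r)

W = [[1.0, 0.0], [0.0, 1.0]]   # base weight (2 x 2)
A = [[1.0, 2.0]]               # down-projection, r x in  (1 x 2)
B = [[0.5], [-0.5]]            # up-projection,  out x r (2 x 1)
alpha, r = 2.0, 1

W_merged = merge_lora(W, A, B, alpha, r)

x = [[3.0], [4.0]]             # input as a column vector

# Unmerged path: base output plus the scaled adapter output.
y_adapter = add(matmul(W, x), matmul(B, matmul(A, x)), scale=alpha / r)
# Merged path: a single matrix multiply.
y_merged = matmul(W_merged, x)

print(W_merged)   # [[2.0, 2.0], [-1.0, -1.0]]
print(y_merged)   # [[14.0], [-7.0]]
```

Both paths give the same output, which is why a merged checkpoint needs no adapter code at inference time.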
| Specification | Value |
|---|---|
| Architecture | LlamaForCausalLM |
| Parameters | ~50M |
| Vocabulary Size | 32,000 |
| Hidden Size | 512 |
| Layers | 12 |
| Attention Heads | 8 |
| KV Heads | 4 |
| Context Length | 5,120 tokens |
| Tokenizer | Original Supra byte-level BPE tokenizer (Formatted with ChatML) |
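As a rough worked example of what these numbers imply for edge deployment, the sketch below estimates the key/value cache size from the table. It assumes the usual Llama convention that the head dimension is `hidden_size / attention_heads`, and a 2-byte (fp16 or bf16) cache; neither is stated in this card.

```python
# Values taken from the specification table.
HIDDEN_SIZE = 512
NUM_LAYERS = 12
NUM_ATTENTION_HEADS = 8
NUM_KV_HEADS = 4
CONTEXT_LENGTH = 5120

# Assumptions, not stated in the model card:
BYTES_PER_VALUE = 2                              # fp16 / bf16 cache
head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS    # default Llama convention

# Grouped-query attention: how many query heads share each KV head.
queries_per_kv_head = NUM_ATTENTION_HEADS // NUM_KV_HEADS

def kv_cache_bytes(num_tokens):
    """Bytes needed to cache keys and values for num_tokens tokens."""
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * NUM_LAYERS * NUM_KV_HEADS * head_dim * BYTES_PER_VALUE * num_tokens

per_token = kv_cache_bytes(1)
full_context = kv_cache_bytes(CONTEXT_LENGTH)

print(head_dim)                 # 64
print(queries_per_kv_head)      # 2
print(per_token)                # 12288 bytes per token
print(full_context / 2**20)     # 60.0 MiB at the full 5,120-token context
```

With 4 KV heads instead of 8, the cache is half the size it would be under full multi-head attention with the same head dimension.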
Model Benchmarks
We evaluated the model on the official test splits of six standard datasets using lm-evaluation-harness, which scores each candidate answer by its log-likelihood under the model.
Below is the verified comparison across both raw accuracy (acc) and length-normalized accuracy (acc_norm) against the v1.5 baseline.
| Benchmark Task | Metric | Supra-1.5-Instruct (Baseline)* | Supra-1.6-Instruct-Ultra-exp (v1.6) | Performance Delta |
|---|---|---|---|---|
| SciQ | acc | 60.90% | 72.70% | +11.80% |
| SciQ | acc_norm | 57.40% | 66.00% | +8.60% |
| PIQA | acc | 59.60% | 60.61% | +1.01% |
| PIQA | acc_norm | 59.30% | 59.41% | +0.11% |
| HellaSwag | acc | 27.90% | 28.11% | +0.21% |
| HellaSwag | acc_norm | 29.30% | 29.66% | +0.36% |
| OpenBookQA | acc | 17.80% | 17.40% | -0.40% |
| OpenBookQA | acc_norm | 26.60% | 27.20% | +0.60% |
| ARC-Easy | acc | 45.90% | 46.76% | +0.86% |
| ARC-Easy | acc_norm | 44.10% | 43.18% | -0.92% |
| ARC-Challenge | acc | 22.90% | 22.35% | -0.55% |
| ARC-Challenge | acc_norm | 25.90% | 25.60% | -0.30% |
*Supra-1.5-Instruct (Baseline) metrics are sourced directly from the official SupraLabs/Supra-1.5-50M-Instruct-exp repository.
💡 Benchmarking Notes & Parity Disclaimer: These evaluation scores are measured using standard benchmarking configurations. While the evaluation harness is mathematically standardized, minor variations in output can occur based on localized system setups, exact prompt formatting, and tokenizer defaults. These results are shared to provide an objective relative comparison between the model iterations under identical benchmarking conditions.
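For readers unfamiliar with the two metrics in the benchmark table, the sketch below shows how `acc` and `acc_norm` can disagree on a single multiple-choice item. It assumes, as we understand the harness, that `acc` picks the answer with the highest total log-likelihood and `acc_norm` picks the highest log-likelihood divided by answer length in characters. The item and its log-likelihood values are made up for illustration.

```python
def pick_answers(choices, loglikelihoods):
    """Return (acc_pick, acc_norm_pick) as indices into choices."""
    indices = range(len(choices))
    # acc: highest raw log-likelihood wins (tends to favor short answers).
    raw_pick = max(indices, key=lambda i: loglikelihoods[i])
    # acc_norm: log-likelihood divided by the answer's character length.
    norm_pick = max(indices, key=lambda i: loglikelihoods[i] / len(choices[i]))
    return raw_pick, norm_pick

# Hypothetical item with made-up scores.
choices = ["ice", "liquid water at room temperature"]
loglikelihoods = [-6.0, -16.0]

raw_pick, norm_pick = pick_answers(choices, loglikelihoods)
print(choices[raw_pick])    # ice
print(choices[norm_pick])   # liquid water at room temperature
```

This is why the two columns can move in opposite directions for the same task, as they do for OpenBookQA and ARC-Easy.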
Intended Use & Chat Template
This model is intended for experimental research, lightweight conversational prototyping, and low-latency edge deployment. It is formatted natively to understand standard ChatML syntax:
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
[Your Question]<|im_end|>
<|im_start|>assistant
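A minimal sketch of building this prompt from a list of messages is shown below. The helper name `build_chatml_prompt` and the newline placement after each `<|im_end|>` and after the final `assistant` tag are assumptions based on common ChatML usage; if the repository ships a chat template, the tokenizer's `apply_chat_template` method in Transformers is the safer way to format prompts.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Format a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
print(prompt)
```

When generating, stopping at `<|im_end|>` keeps the model from running on into a new turn.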
Safety & Limitation Disclaimer
1. Factual Inaccuracy & Hallucinations
At approximately 50 million parameters, this model has severely limited capacity. It cannot hold a reliable or accurate knowledge base of world facts, history, science, or advanced mathematics. Consequently, the model is highly susceptible to producing confident-sounding but completely fabricated, incorrect, or nonsensical details (hallucinations).
2. Bias, Toxicity, & Sensitive Content
This model was trained on synthetic conversational mixtures and open-source instruction corpora. It may inadvertently mirror, amplify, or generate biased, stereotypical, offensive, or otherwise sensitive content. Users must implement external guardrails and content filters if deploying this model in interactive environments.
3. Non-Production Use Warning
This repository is strictly an experimental research release. Under no circumstances should this model be deployed in production, commercial, clinical, legal, or safety-critical applications where incorrect, biased, or hallucinated outputs could result in physical, financial, emotional, or operational harm.
4. Code & Symbolic Computation Limits
While the model can structure basic code snippets, it lacks the deep state-tracking needed to generate valid, compiler-ready, or secure code, or to perform accurate mathematical calculations. Do not execute any code generated by this model without thorough, expert human validation.
By downloading, utilizing, or fine-tuning this model, you acknowledge these severe architectural limitations and agree to use the weights responsibly and entirely at your own risk.
Acknowledgements & Special Thanks
This release is built upon the foundation laid by the SupraLabs team. We are grateful for their work in pre-training and releasing the base model.
- Base Model: Supra-1.5-50M-Base-exp
- Training Scale: 3.0B+ diverse continued-pretraining tokens
- SupraLabs Contributors: @AxionLab-official, @LH-Tech-AI, @LyJonathan, @MMorgan-ML, @QyrouNnet-AI, @Jamessl, @User01110
MultivexAI is proud to contribute this experimental tune to the open-source community as a continuation of their foundational work.
Feedback and Support
Feedback and community contributions are very welcome. Please open an issue or discussion on the repository if you want to share your evaluations, fine-tunes, or GGUF conversions!