Qovaryx scratch-base lineage
A from-scratch own-base checkpoint in the Qovaryx pretrain lineage (50M / 350M / 1B / 3B). Published for research reproducibility.
- Read the research: https://qovaryx.jehorizon.com/research
- Main site: https://qovaryx.jehorizon.com
Shipped inside the Qovaryx app
This is a component of the Qovaryx Options Decoder cluster. It is published here for transparency and research reproducibility; the runtime is bundled in the desktop app, not installed from Hugging Face. Installer links have been removed from this card.
Download the signed beta: https://qovaryx.jehorizon.com/download.html
New flagship: Qovaryx Options Decoder, Full Community Runtime
The latest, most capable Qovaryx release is live as a single drop-in package. Six functional HGB specialists + eight vaulted torch heads in one runtime. 15-of-15 internal benchmark cells closed at strict bootstrap CI lower bound. Drop-in replacement for FrankenB / V3.7 / Qwen-VPA. Sub-millisecond inference. Offline. No license email required.
Qovaryx/qovaryx-options-decoder-full-community
Join the community
Discord: https://discord.gg/PtuHZDv5ju (builders training their own trading/finance/generalist models). Engineering, no signals. Get install help, share work, follow the Qovaryx research devlog.
Ko-fi: https://ko-fi.com/tjarvis91 (every coffee literally buys GPU time for the next training cycle).
Qovaryx 3B: Scratch Base (random-init)
Compact AI is not small AI. A 3.07B-parameter trainable substrate, the largest in the Qovaryx scratch-base lineage, sized for serious solo pretraining on a single A100 80GB or a 5070 Ti with gradient checkpointing. Random-init: bring your own corpus and train it from scratch. MTP-K=4, GQA 4:1, pluggable FFN backends (dense SwiGLU / ternary BitNet-style / sparse low-rank MoE), optional task-specific heads. Apache-2.0.
Read the public research: github.com/thron-j/qovaryx-ai-research. It covers the philosophy and a devlog series (AI without big data centers, legacy brain crystallization, shell-governed cognition, EVO20 training genome, compact frontier architectures, sovereign compact cognition deployed, and more). The architecture choices in this checkpoint are described there. Implementation internals are intentionally withheld.
Compact ≠ small
This is the 3B-parameter sibling in a four-base lineage:
- 50M: proxy / smoke
- 350M: solo training target
- 1B: full consumer-GPU target
- 3B: this checkpoint. Trainable on 1× A100 80GB without gradient checkpointing (GC); fits on a 5070 Ti 16 GB with GC enabled. Measured pretrain throughput: ~8,500 tok/s on 1× A100 at batch 8 (bf16, no GC), ~190-660 tok/s on RTX 5070 Ti (GC on, batch 1, grad-accum 8).
Compact in this lineage means engineered for what an individual operator can actually pretrain, not "small enough to fit on a phone." 3B is the sweet spot before frontier-only territory starts.
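To put the measured throughput figures in perspective, the sketch below converts tokens per second into tokens per day and into wall-clock days for a token budget. The 10-billion-token budget is a hypothetical chosen for illustration only, not a figure from this card, and the calculation assumes the measured rate holds steady for the whole run.

```python
SECONDS_PER_DAY = 86_400

def tokens_per_day(tok_per_s):
    """Tokens processed in 24 hours at a constant rate."""
    return tok_per_s * SECONDS_PER_DAY

def days_for_budget(token_budget, tok_per_s):
    """Wall-clock days needed to process token_budget tokens."""
    return token_budget / tokens_per_day(tok_per_s)

# Throughput figures quoted on this card (tokens per second).
MEASURED = {
    "A100 80GB (bf16, batch 8, no GC)": 8_500,
    "RTX 5070 Ti, low end (GC on)": 190,
    "RTX 5070 Ti, high end (GC on)": 660,
}

# Hypothetical budget, for illustration only.
HYPOTHETICAL_BUDGET = 10e9

for name, rate in MEASURED.items():
    per_day_m = tokens_per_day(rate) / 1e6
    days = days_for_budget(HYPOTHETICAL_BUDGET, rate)
    print(f"{name}: {per_day_m:,.1f}M tok/day, {days:,.1f} days for 10B tokens")
```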
Architecture
| Field | Value |
|---|---|
| Parameters | 3,066,688,256 (3.067 B) |
| d_model | 2944 |
| n_layer | 32 |
| n_head | 16 (query) |
| n_kv_head | 4 (GQA 4:1) |
| d_ff | 7680 |
| vocab_size | 32000 |
| max_seq_len | 4096 |
| mtp_k | 4 (multi-token prediction, MLP heads) |
| Tokenizer | english_v1 BPE (in-house, 32K vocab) |
| Positional | RoPE (base 10,000) |
| Norm | RMSNorm |
| FFN default | SwiGLU (swappable to ternary or routed low-rank at the config level) |
| Init | random (torch seed 17) |
| Precision in this checkpoint | bf16 (6.13 GB on disk) |
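A few of the table's figures can be cross-checked with simple arithmetic. The sketch below assumes the on-disk size is quoted in decimal gigabytes and that the per-head dimension is d_model / n_head; the card states neither explicitly.

```python
PARAMS = 3_066_688_256   # from the table
D_MODEL = 2944
N_HEAD = 16
N_KV_HEAD = 4
BYTES_PER_BF16 = 2       # bf16 stores each weight in 2 bytes

size_bytes = PARAMS * BYTES_PER_BF16
size_gb = size_bytes / 1e9        # decimal gigabytes
size_gib = size_bytes / 2**30     # binary gibibytes, for comparison

# Assumption: the model dimension is split evenly across query heads.
head_dim = D_MODEL // N_HEAD
# Query heads that share each KV head under GQA.
gqa_group = N_HEAD // N_KV_HEAD

print(f"bf16 weights: {size_gb:.2f} GB ({size_gib:.2f} GiB)")
print(f"head_dim: {head_dim}, query heads per KV head: {gqa_group}")
```

The decimal-gigabyte figure matches the 6.13 GB quoted in the table, which suggests the checkpoint file holds little beyond the bf16 weights.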
Implementation: `bleeding_edge.model.decoder.FinanceDecoder` (the class name is legacy; the architecture is task-agnostic). The trunk is a standard pre-norm decoder; multi-token prediction (K=4 with MLP heads) and a routed-decision head are wired in for downstream training but inactive in the random-init weights.
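As an illustration of what multi-token prediction with K=4 asks of the model, the sketch below builds the training targets for each head: head k at position t is trained to predict the token at position t+k. This is the generic MTP target layout, not the withheld Qovaryx implementation; `mtp_targets` and the ignore value `-100` are assumptions made for this example.

```python
IGNORE_INDEX = -100  # assumption: marker for positions that have no target

def mtp_targets(tokens, k_max=4):
    """Return {k: targets} with targets[t] = tokens[t + k].

    Positions whose target would fall past the end of the sequence
    are filled with IGNORE_INDEX so a loss function can skip them.
    """
    n = len(tokens)
    return {
        k: [tokens[t + k] if t + k < n else IGNORE_INDEX for t in range(n)]
        for k in range(1, k_max + 1)
    }

tokens = [10, 11, 12, 13, 14, 15]
for k, targets in mtp_targets(tokens).items():
    print(f"head k={k}: {targets}")
```

Head k=1 is ordinary next-token prediction; the higher heads supply the auxiliary signal, and each loses one more trailing position than the head before it.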
How to load
```python
import torch
from tokenizers import Tokenizer
from bleeding_edge.model.decoder import FinanceDecoder, DecoderConfig

tok = Tokenizer.from_file("tokenizer.json")

# weights_only=False unpickles arbitrary objects; only load checkpoints you trust.
ckpt = torch.load("pytorch_model.pt", map_location="cpu", weights_only=False)

# Keep only the keys that DecoderConfig actually declares.
cfg = DecoderConfig(**{k: v for k, v in ckpt["model_cfg"].items()
                       if k in DecoderConfig.__dataclass_fields__})
cfg.vocab_size = tok.get_vocab_size()

model = FinanceDecoder(cfg)
# Strip the prefix that torch.compile adds to state-dict keys.
state = {k.removeprefix("_orig_mod."): v for k, v in ckpt["model_state"].items()}
model.load_state_dict(state, strict=False)
model.eval()  # outputs noise until you train; this is the point
```
The bleeding_edge package source ships with the Qovaryx Q-Office-Suite runtime; architecture notes are public at the research devlog.
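The `_orig_mod.` prefix stripped in the loading snippet is what `torch.compile` adds to the state-dict keys of a compiled module. Because `strict=False` silently tolerates mismatched keys, it can be worth checking which keys failed to line up. The sketch below does that check on plain lists of key names; `diff_keys` is a hypothetical helper and the key names are made up for illustration.

```python
PREFIX = "_orig_mod."

def normalize_keys(keys):
    """Drop the torch.compile prefix from each key, if present."""
    return [k.removeprefix(PREFIX) for k in keys]

def diff_keys(expected, loaded):
    """Return (missing, unexpected) key lists after prefix normalization."""
    expected_set = set(expected)
    loaded_set = set(normalize_keys(loaded))
    missing = sorted(expected_set - loaded_set)      # model wants, checkpoint lacks
    unexpected = sorted(loaded_set - expected_set)   # checkpoint has, model ignores
    return missing, unexpected

# Made-up key names, for illustration only.
expected = ["embed.weight", "blocks.0.attn.q.weight", "lm_head.weight"]
loaded = ["_orig_mod.embed.weight", "_orig_mod.blocks.0.attn.q.weight", "extra.bias"]

missing, unexpected = diff_keys(expected, loaded)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real module, the object returned by `model.load_state_dict(state, strict=False)` exposes the same information as `missing_keys` and `unexpected_keys`.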
What this is
A random-init Qovaryx 3B substrate. Same architecture as the sibling scratch bases, same tokenizer, same expected training pipeline, just at 3B parameters this time. Drop it into the Qovaryx training loop and pretrain on your own corpus.
The deliberate choices in this design that make 3B-class solo training viable:
- GQA 4:1 instead of MHA: cuts KV-cache memory by 4× at inference and cuts attention cost during training.
- SwiGLU FFN, swappable: research-friendly; drop in ternary (BitNet-style) or routed low-rank (sparse MoE) without retraining the trunk.
- MTP-K=4 with MLP heads: auxiliary multi-token-prediction loss baked into pretraining. The K=2 head can serve as a speculative-decode draft at inference.
- No chart encoder by default: disabled in this checkpoint (`chart_patch_encoder_enabled=False`). Re-enable in `DecoderConfig` if your downstream task uses vision tokens.
- bf16-native weights: no fp16 underflow drama at 3B scale.
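The 4× KV-cache saving from GQA 4:1 follows directly from the head counts in the architecture table. A rough sketch, assuming a head dimension of d_model / n_head = 184 (not stated explicitly on this card), bf16 cache entries, and a single sequence at the full 4096-token context; `kv_cache_bytes` is a hypothetical helper:

```python
def kv_cache_bytes(n_layer, n_kv_head, head_dim, seq_len,
                   bytes_per_elem=2, batch=1):
    """Bytes needed to cache keys and values for every layer."""
    # Factor 2: one K tensor and one V tensor per layer.
    return 2 * n_layer * n_kv_head * head_dim * seq_len * bytes_per_elem * batch

HEAD_DIM = 2944 // 16  # assumption: d_model split evenly across query heads

gqa = kv_cache_bytes(n_layer=32, n_kv_head=4, head_dim=HEAD_DIM, seq_len=4096)
mha = kv_cache_bytes(n_layer=32, n_kv_head=16, head_dim=HEAD_DIM, seq_len=4096)

print(f"GQA (4 KV heads):  {gqa / 1e9:.2f} GB")
print(f"MHA (16 KV heads): {mha / 1e9:.2f} GB")
print(f"ratio: {mha / gqa:.0f}x")
```

The ratio depends only on the number of KV heads, so it holds for any sequence length or batch size.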
What this is NOT
- Not a pretrained model. Out-of-the-box outputs are noise. Random initialization is the entire point.
- Not finance-specific despite the legacy class name `FinanceDecoder`. The architecture is task-agnostic; the BPE tokenizer leans toward English-text merges and works on any English corpus.
- Not a drop-in replacement for Llama / Qwen / Mistral. The component set is different (MTP-K heads in particular need their own training term).
- Not adversarially robust. It's a substrate.
- Not the largest Qovaryx base. This is currently the largest open scratch-base in the lineage. Anything bigger requires a different training conversation.
License
Apache-2.0. Use it for research, commercial work, hobby projects, whatever. Attribution appreciated but not legally required.
Research notes
Qovaryx is part of a broader local-sovereign-AI research program. Higher-level framings, architectural rationale, and ablation studies are published progressively at:
Research index: https://github.com/thron-j/qovaryx-ai-research
Implementation details, training corpora, and certain ablation specifics are intentionally withheld in the public devlog. The framings are publishable; the internals are not. Collaboration inquiries: jeherizonllc@gmail.com.
Real training runs on this architecture: Cluster Shell V1 audit
The smaller scratch bases in this lineage are the trainable substrate for the Cluster Shell committee architecture described in the Qovaryx research devlog. The V1 readiness gate, run on the actual trained specialist heads (built on the 50M and 350M siblings), looked like this:
| Specialist | Train rows | Majority baseline | Linear baseline | GBDT baseline | Gate verdict |
|---|---|---|---|---|---|
| Q-Penny | 150K | 52.90% | 73.03% | 73.84% | PASS |
| Q-Veto | 150K | 57.57% | 72.23% | 79.93% | PASS |
| Q-Router | 150K | 24.00% | 76.54% | 84.62% | PASS |
| Q-2yr | 300K | 50.04% | 75.38% | 75.93% | PASS |
| Q-180d | 300K | 50.09% | 74.46% | 74.95% | PASS |
Five specialists, deterministic 5% holdouts, each at least +20pp over the majority-class floor. The architecture clears its falsifiability gate on fresh data; what makes that gate honest is documented in "evaluation discipline" and "when the proxy breaks".
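The +20pp margin can be checked against the table. The sketch below assumes the margin is measured from the GBDT column to the majority-class column; the card does not say which column the claim refers to, and the linear column for Q-Veto sits less than 20pp above its floor.

```python
# (majority, linear, GBDT) accuracy in percent, copied from the table above.
ROWS = {
    "Q-Penny":  (52.90, 73.03, 73.84),
    "Q-Veto":   (57.57, 72.23, 79.93),
    "Q-Router": (24.00, 76.54, 84.62),
    "Q-2yr":    (50.04, 75.38, 75.93),
    "Q-180d":   (50.09, 74.46, 74.95),
}

def lift_pp(majority, score):
    """Percentage-point margin of score over the majority-class floor."""
    return round(score - majority, 2)

gbdt_lift = {name: lift_pp(maj, gbdt) for name, (maj, _lin, gbdt) in ROWS.items()}
linear_lift = {name: lift_pp(maj, lin) for name, (maj, lin, _gbdt) in ROWS.items()}

for name in ROWS:
    print(f"{name}: GBDT +{gbdt_lift[name]:.2f}pp, linear +{linear_lift[name]:.2f}pp")
```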
The Qovaryx Q-Office-Suite (released 2026-06-02) extended the cluster-shell pattern to nine sovereign 50M specialists, all at 100% on their held-out audits and all full-fine-tuned from qovaryx-50m-scratch-base. See the eight (now nine) sovereign specialists release.
This 3B substrate is what we use for the next generation: a sovereign 3B base that the cluster shell delegates to (instead of competing with frontier 3B alone).
Sibling models in this lineage
- `tjarvis91/qovaryx-50m-scratch-base`: 53.5M params, 12 layers, proxy / smoke
- `tjarvis91/qovaryx-350m-scratch-base`: 352M params, 24 layers, for serious solo training
- `tjarvis91/qovaryx-1b-scratch-base`: 1.05B params, 22 layers, the full consumer-GPU target
- `tjarvis91/qovaryx-3b-scratch-base` (you are here): 3.07B params, 32 layers, single A100 80GB or 5070 Ti with GC
- `tjarvis91/vfaix-vpa-options-trader`: a separate, trained 9B vision-language model that uses the same training disciplines on Qwen3.5-VL (not the same architecture; shown here for lineage context)
Watermark / provenance
This release records a SHA256 fingerprint of the random-init state dict inside config.json (model_state_sha256) plus a tokenizer SHA256 (tokenizer_sha256) for tamper-detection and downstream attribution.
```json
{
  "init": "random",
  "seed": 17,
  "params_b": 3.067,
  "model_state_sha256": "<see config.json>",
  "tokenizer_sha256": "d226a02a00dfab5c3fb58aadb13a3afe2f3635ce0795f73f2857ec3b4fce3704"
}
```
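A minimal sketch of checking the tokenizer fingerprint, assuming `tokenizer_sha256` is a top-level key in `config.json` and is the SHA-256 of the raw bytes of `tokenizer.json` (the card does not state how the hash was computed), and that both files sit in the current directory. `file_sha256` and `verify_tokenizer` are hypothetical helpers. The state-dict fingerprint is not covered because its hashing procedure is not documented here.

```python
import hashlib
import json
from pathlib import Path

def file_sha256(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file's raw bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_tokenizer(config_path="config.json", tokenizer_path="tokenizer.json"):
    """True if the tokenizer file matches the hash recorded in the config."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # Assumption: the fingerprint is stored as a top-level key.
    return file_sha256(tokenizer_path) == config["tokenizer_sha256"]

if __name__ == "__main__":
    # Only run the check when both files are actually present.
    if Path("config.json").exists() and Path("tokenizer.json").exists():
        print("tokenizer fingerprint matches:", verify_tokenizer())
```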
Support
If this base helps you build something, support continued development:
- ko-fi.com/tjarvis91: every contribution funds A100 time for the next training cycle and the next-generation Qovaryx scratch bases.
- discord.gg/PtuHZDv5ju: the Qovaryx builder community. Engineering, no signals.
Citation
```bibtex
@misc{qovaryx-3b-scratch-base-2026,
  title     = {Qovaryx-3B Scratch Base: A 3-Billion-Parameter Compact Decoder Substrate for Solo Pretraining},
  author    = {Jarvis, Thomas},
  year      = {2026},
  month     = {June},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/tjarvis91/qovaryx-3b-scratch-base}
}
```
Status
Random-init checkpoint as of 2026-06-04, tagged v1.0 on publication. Future updates will add trained sibling repos (qovaryx-3b-finance-base, qovaryx-3b-instruct) once the first downstream training cycles complete. Watch the org page and the research devlog for new releases.