
Qovaryx scratch-base lineage

A from-scratch own-base checkpoint in the Qovaryx pretrain lineage (50M / 350M / 1B / 3B). Published for research reproducibility.

Read the research: https://qovaryx.jehorizon.com/research

Main site: https://qovaryx.jehorizon.com

📦 Shipped inside the Qovaryx app

This is a component of the Qovaryx Options Decoder cluster. It is published here for transparency + research reproducibility — the runtime is bundled in the desktop app, not installed from Hugging Face. Installer links have been removed from this card.

👉 Download the signed beta: https://qovaryx.jehorizon.com/download.html

📖 Read the research: https://qovaryx.jehorizon.com/research

🚀 New flagship: Qovaryx Options Decoder — Full Community Runtime

The latest, most capable Qovaryx release is live as a single drop-in package. Six functional HGB specialists + eight vaulted torch heads in one runtime. 15-of-15 internal benchmark cells closed at strict bootstrap CI lower bound. Drop-in replacement for FrankenB / V3.7 / Qwen-VPA. Sub-millisecond inference. Offline. No license email required.

👉 Qovaryx/qovaryx-options-decoder-full-community

💬 Join the community

Discord: https://discord.gg/PtuHZDv5ju — builders training their own trading/finance/generalist models. Engineering, no signals. Get install help, share work, follow the Qovaryx research devlog.

Ko-fi: https://ko-fi.com/tjarvis91 — every coffee literally buys GPU time for the next training cycle.

Qovaryx 3B — Scratch Base (random-init)

Compact AI is not small AI. A 3.07B-parameter trainable substrate, the largest in the Qovaryx scratch-base lineage, sized for serious solo pretraining on a single A100 80GB or a 5070 Ti with gradient checkpointing. Random-init — bring your own corpus, train it from scratch. MTP-K=4, GQA 4:1, pluggable FFN backends (dense SwiGLU / ternary BitNet-style / sparse low-rank MoE), optional task-specific heads. Apache-2.0.

📖 Read the public research: github.com/thron-j/qovaryx-ai-research — philosophy, devlog series (AI without big data centers, legacy brain crystallization, shell-governed cognition, EVO20 training genome, compact frontier architectures, sovereign compact cognition deployed, more). The architecture choices in this checkpoint are described there. Implementation internals are intentionally withheld.

Compact ≠ small

This is the 3B-parameter sibling in a four-base lineage:

50M — proxy / smoke
350M — solo training target
1B — full consumer-GPU target
3B — this checkpoint. Trainable on 1× A100 80GB without gradient checkpointing; fits on a 5070 Ti 16 GB with GC enabled. Pretrain throughput we have measured: ~8,500 tok/s on 1×A100 at batch 8 (bf16, no GC), ~190-660 tok/s on RTX 5070 Ti (GC on, batch 1, grad-accum 8).
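
To put the measured throughput figures in perspective, the short calculation below converts them into tokens per day and tokens per optimizer step. It is a back-of-the-envelope sketch, assuming the quoted rates are sustained and that every sequence is a full 4096 tokens long; the function names are illustrative and not part of the Qovaryx training loop.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_day(tok_per_s: float) -> int:
    """Tokens processed in 24 hours at a sustained throughput."""
    return int(tok_per_s * SECONDS_PER_DAY)


def tokens_per_optimizer_step(batch_size: int, grad_accum: int, seq_len: int) -> int:
    """Tokens seen per optimizer update (assumes full-length sequences)."""
    return batch_size * grad_accum * seq_len


# Throughput figures quoted in the card.
a100_day = tokens_per_day(8_500)    # 1x A100, batch 8, bf16, no GC
ti_low_day = tokens_per_day(190)    # RTX 5070 Ti, lower bound
ti_high_day = tokens_per_day(660)   # RTX 5070 Ti, upper bound

# 5070 Ti setting from the card: batch 1, grad-accum 8, max_seq_len 4096.
ti_step_tokens = tokens_per_optimizer_step(batch_size=1, grad_accum=8, seq_len=4096)

print(f"A100:    {a100_day:,} tokens/day")
print(f"5070 Ti: {ti_low_day:,} to {ti_high_day:,} tokens/day")
print(f"5070 Ti: {ti_step_tokens:,} tokens per optimizer step")
```

At these rates a day on the A100 covers roughly 0.73B tokens, against roughly 16M to 57M tokens on the 5070 Ti.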

Compact in this lineage means engineered for what an individual operator can actually pretrain, not "small enough to fit on a phone." 3B is the sweet spot before frontier-only territory starts.

Architecture


Parameters	3,066,688,256 (3.067 B)
d_model	2944
n_layer	32
n_head	16 (query)
n_kv_head	4 (GQA 4:1)
d_ff	7680
vocab_size	32000
max_seq_len	4096
mtp_k	4 (multi-token prediction, MLP heads)
Tokenizer	`english_v1` BPE (in-house, 32K vocab)
Positional	RoPE (base 10,000)
Norm	RMSNorm
FFN default	SwiGLU (swappable to ternary or routed low-rank at the config level)
Init	random (torch seed 17)
Precision in this checkpoint	bf16 (6.13 GB on disk)
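
The table's figures can be sanity-checked with a little arithmetic. The sketch below estimates the trunk parameter count from the listed dimensions and confirms the bf16 on-disk size. Several things here are assumptions the card does not confirm: head_dim = d_model / n_head, no bias terms, three SwiGLU projection matrices, two RMSNorm weight vectors per layer plus one final norm, and untied input and output embeddings. The estimate covers the trunk only, so it is expected to fall short of the reported total, which also includes the MTP and task heads.

```python
D_MODEL, N_LAYER, N_HEAD, N_KV_HEAD = 2944, 32, 16, 4
D_FF, VOCAB = 7680, 32000
REPORTED_PARAMS = 3_066_688_256  # from the architecture table

head_dim = D_MODEL // N_HEAD     # assumption: 184
kv_dim = N_KV_HEAD * head_dim    # width of the shared K/V projections under GQA

# Attention: Q and O are d_model x d_model; K and V are d_model x kv_dim.
attn = 2 * D_MODEL * D_MODEL + 2 * D_MODEL * kv_dim
# SwiGLU FFN: gate, up and down projections (assumed bias-free).
ffn = 3 * D_MODEL * D_FF
# Two RMSNorm weight vectors per pre-norm block (assumed).
norms = 2 * D_MODEL

per_layer = attn + ffn + norms
# Assumed untied input/output embeddings plus one final RMSNorm.
trunk_estimate = N_LAYER * per_layer + 2 * VOCAB * D_MODEL + D_MODEL

remainder = REPORTED_PARAMS - trunk_estimate  # heads and anything not modeled here
bf16_bytes = REPORTED_PARAMS * 2              # 2 bytes per bf16 parameter

print(f"trunk estimate: {trunk_estimate:,}")
print(f"remainder:      {remainder:,}")
print(f"bf16 size:      {bf16_bytes / 1e9:.2f} GB")
```

Under these assumptions the trunk comes to about 3.053B parameters, leaving roughly 14M unaccounted for, and the bf16 size matches the 6.13 GB quoted in the table.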

Implementation: bleeding_edge.model.decoder.FinanceDecoder (the class name is legacy; the architecture is task-agnostic). The trunk is standard pre-norm decoder; multi-token prediction (K=4 with MLP heads) and a routed-decision head are wired in for downstream training but inactive in the random-init weights.
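
For readers new to multi-token prediction, the sketch below shows how training targets are laid out in the generic formulation: at position t, head i (for i = 1..K) is trained against the token at position t + i. This is a plain-Python illustration of that general idea only. The function name `mtp_targets` is hypothetical, and nothing here reflects the withheld Qovaryx implementation.

```python
from typing import List


def mtp_targets(token_ids: List[int], k: int) -> List[List[int]]:
    """For each position t that has k future tokens, return tokens t+1 .. t+k.

    Entry [t][i-1] is the target for prediction head i at position t.
    Positions too close to the end of the sequence are dropped.
    """
    n = len(token_ids)
    return [token_ids[t + 1 : t + 1 + k] for t in range(n - k)]


# Toy sequence of six token ids with K=4, matching mtp_k in the table.
targets = mtp_targets([10, 11, 12, 13, 14, 15], k=4)
for t, row in enumerate(targets):
    print(f"position {t}: heads 1..4 predict {row}")
```

With K=4, only positions that have four tokens ahead of them contribute a full set of targets, which is why a six-token sequence yields two rows.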

How to load

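import torch
from tokenizers import Tokenizer
from bleeding_edge.model.decoder import FinanceDecoder, DecoderConfig

# Load the BPE tokenizer and the raw checkpoint (kept on CPU).
tok = Tokenizer.from_file("tokenizer.json")
# weights_only=False unpickles arbitrary objects; only use it on files you trust.
ckpt = torch.load("pytorch_model.pt", map_location="cpu", weights_only=False)

# Rebuild the config, keeping only the keys that DecoderConfig declares.
cfg = DecoderConfig(**{k: v for k, v in ckpt["model_cfg"].items()
                       if k in DecoderConfig.__dataclass_fields__})
cfg.vocab_size = tok.get_vocab_size()

model = FinanceDecoder(cfg)
# Strip the "_orig_mod." prefix left by torch.compile (str.removeprefix needs Python 3.9+).
state = {k.removeprefix("_orig_mod."): v for k, v in ckpt["model_state"].items()}
# strict=False tolerates key mismatches, so report them rather than ignoring them.
missing, unexpected = model.load_state_dict(state, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
model.eval()  # outputs noise until you train; this is the point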

The bleeding_edge package source ships with the Qovaryx Q-Office-Suite runtime; architecture notes are public at the research devlog.

What this is

A random-init Qovaryx 3B substrate. Same architecture as the sibling scratch bases, same tokenizer, same training pipeline expected — just 3B parameters this time. Drop it into the Qovaryx training loop and pretrain on your own corpus.

The deliberate choices in this design that make 3B-class solo training viable:

GQA 4:1 instead of MHA — cuts KV-cache memory by 4× at inference, cuts attention cost during training.
SwiGLU FFN, swappable — research-friendly: drop in ternary (BitNet-style) or routed low-rank (sparse MoE) without retraining the trunk.
MTP-K=4 with MLP heads — auxiliary multi-token-prediction loss baked into pretraining. The K=2 head can serve as a speculative-decode draft at inference.
No chart encoder by default — disabled in this checkpoint (chart_patch_encoder_enabled=False). Re-enable in DecoderConfig if your downstream task uses vision tokens.
bf16-native weights — no fp16 underflow drama at 3B scale.
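
The 4× KV-cache saving from GQA in the list above follows directly from the head counts. The sketch below works it out for a single sequence at max_seq_len, assuming head_dim = d_model / n_head = 184 and a bf16 cache (2 bytes per element). The function name is illustrative, and real runtimes add their own overhead on top.

```python
def kv_cache_bytes(n_layer: int, n_kv_head: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Size of the KV cache for one sequence: keys plus values for every layer."""
    return 2 * n_layer * n_kv_head * head_dim * seq_len * bytes_per_elem


D_MODEL, N_LAYER, N_HEAD, N_KV_HEAD, MAX_SEQ_LEN = 2944, 32, 16, 4, 4096
HEAD_DIM = D_MODEL // N_HEAD  # assumption: 184

# GQA keeps 4 KV heads; a hypothetical MHA variant would keep all 16.
gqa_bytes = kv_cache_bytes(N_LAYER, N_KV_HEAD, HEAD_DIM, MAX_SEQ_LEN)
mha_bytes = kv_cache_bytes(N_LAYER, N_HEAD, HEAD_DIM, MAX_SEQ_LEN)

print(f"GQA 4:1 : {gqa_bytes / 1e9:.2f} GB per 4096-token sequence")
print(f"MHA     : {mha_bytes / 1e9:.2f} GB per 4096-token sequence")
print(f"ratio   : {mha_bytes // gqa_bytes}x")
```

Under these assumptions the cache is about 0.39 GB per full-length sequence with GQA, versus about 1.54 GB for an equivalent MHA layout.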

What this is NOT

Not a pretrained model. Out-of-the-box outputs are noise. Random initialization is the entire point.
Not finance-specific despite the legacy class name FinanceDecoder. The architecture is task-agnostic; the BPE tokenizer leans toward English-text merges and works on any English corpus.
Not a drop-in replacement for Llama / Qwen / Mistral. The component set is different (MTP-K heads in particular need their own training term).
Not adversarially robust. It's a substrate.
Not an upper bound on the lineage. This is currently the largest open scratch base in the lineage; anything bigger requires a different training conversation.

License

Apache-2.0. Use it for research, commercial work, hobby projects — whatever. Attribution appreciated but not legally required.

Research notes

Qovaryx is part of a broader local-sovereign-AI research program. Higher-level framings, architectural rationale, and ablation studies are published progressively at:

Research index: https://github.com/thron-j/qovaryx-ai-research

Implementation details, training corpora, and certain ablation specifics are intentionally withheld in the public devlog. The framings are publishable; the internals are not. Collaboration inquiries: jeherizonllc@gmail.com.

Real training runs on this architecture — Cluster Shell V1 audit

The smaller scratch bases in this lineage are the trainable substrate for the Cluster Shell committee architecture described in the Qovaryx research devlog. The V1 readiness gate, run on the actual trained specialist heads (built on the 50M and 350M siblings), looked like this:

Specialist	Train rows	Majority baseline	Linear baseline	GBDT baseline	Gate verdict
Q-Penny	150K	52.90%	73.03%	73.84%	PASS
Q-Veto	150K	57.57%	72.23%	79.93%	PASS
Q-Router	150K	24.00%	76.54%	84.62%	PASS
Q-2yr	300K	50.04%	75.38%	75.93%	PASS
Q-180d	300K	50.09%	74.46%	74.95%	PASS

Five specialists, deterministic 5% holdouts, and in every row the GBDT baseline sits at least +20pp above the majority-class floor (the linear baseline for Q-Veto is lower, at +14.66pp). The architecture clears its falsifiability gate on fresh data; what makes that gate honest is documented in "evaluation discipline" and "when the proxy breaks".
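
The margin over the majority-class floor can be recomputed directly from the table. The sketch below does so for both the linear and the GBDT baselines; the tuple layout and variable names are illustrative, and the numbers are copied from the table rows.

```python
# (specialist, majority %, linear %, GBDT %) as listed in the audit table.
ROWS = [
    ("Q-Penny",  52.90, 73.03, 73.84),
    ("Q-Veto",   57.57, 72.23, 79.93),
    ("Q-Router", 24.00, 76.54, 84.62),
    ("Q-2yr",    50.04, 75.38, 75.93),
    ("Q-180d",   50.09, 74.46, 74.95),
]

# Lift in percentage points over the majority-class baseline.
linear_lift = {name: round(lin - maj, 2) for name, maj, lin, _ in ROWS}
gbdt_lift = {name: round(gbdt - maj, 2) for name, maj, _, gbdt in ROWS}

for name in linear_lift:
    print(f"{name:9s} linear +{linear_lift[name]:.2f}pp   GBDT +{gbdt_lift[name]:.2f}pp")

# True when every GBDT baseline clears the floor by 20pp or more.
all_gbdt_clear_20pp = all(lift >= 20.0 for lift in gbdt_lift.values())
print("all GBDT lifts >= 20pp:", all_gbdt_clear_20pp)
```

The GBDT lift is at least 20pp for all five specialists, with Q-Penny the closest at +20.94pp, while the linear lift for Q-Veto is +14.66pp.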

The Qovaryx Q-Office-Suite (released 2026-06-02) extended the cluster-shell pattern to nine sovereign 50M specialists — all at 100% on their held-out audits, all full-fine-tuned from qovaryx-50m-scratch-base. See the eight (now nine) sovereign specialists release.

This 3B substrate is what we use for the next generation: a sovereign 3B base that the cluster shell delegates to (instead of competing with frontier 3B alone).

Sibling models in this lineage

tjarvis91/qovaryx-50m-scratch-base — 53.5M params, 12 layers, proxy / smoke
tjarvis91/qovaryx-350m-scratch-base — 352M params, 24 layers, for serious solo training
tjarvis91/qovaryx-1b-scratch-base — 1.05B params, 22 layers, the full consumer-GPU target
tjarvis91/qovaryx-3b-scratch-base ← you are here — 3.07B params, 32 layers, single-A100 80GB or 5070 Ti with GC
tjarvis91/vfaix-vpa-options-trader — a separate, trained 9B vision-language model that uses the same training disciplines on Qwen3.5-VL (not the same architecture; shown here for lineage context)

Watermark / provenance

This release records a SHA256 fingerprint of the random-init state dict inside config.json (model_state_sha256) plus a tokenizer SHA256 (tokenizer_sha256) for tamper-detection and downstream attribution.

{
  "init": "random",
  "seed": 17,
  "params_b": 3.067,
  "model_state_sha256": "<see config.json>",
  "tokenizer_sha256": "d226a02a00dfab5c3fb58aadb13a3afe2f3635ce0795f73f2857ec3b4fce3704"
}
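
A tamper check against the recorded tokenizer fingerprint could look like the sketch below. It assumes that tokenizer_sha256 is the SHA256 of the raw bytes of tokenizer.json, which the card does not state explicitly, and the helper names are hypothetical. The state-dict fingerprint is left out because the card does not describe how the tensors are serialized before hashing.

```python
import hashlib

# Value recorded in the provenance block above.
EXPECTED_TOKENIZER_SHA256 = (
    "d226a02a00dfab5c3fb58aadb13a3afe2f3635ce0795f73f2857ec3b4fce3704"
)


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tokenizer_matches(path: str = "tokenizer.json") -> bool:
    """True if the file hash equals the recorded fingerprint (assumed raw-byte hash)."""
    return sha256_of_file(path) == EXPECTED_TOKENIZER_SHA256
```

Calling `tokenizer_matches()` in the directory that holds the downloaded tokenizer.json returns False if the file differs from the published one, or if the fingerprint was computed some other way than assumed here.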

Support

If this base helps you build something, support continued development:

☕ ko-fi.com/tjarvis91 — every contribution funds A100 time for the next training cycle and the next-generation Qovaryx scratch bases.

💬 discord.gg/PtuHZDv5ju — the Qovaryx builder community. Engineering, no signals.

Citation

@misc{qovaryx-3b-scratch-base-2026,
  title     = {Qovaryx-3B Scratch Base: A 3-Billion-Parameter Compact Decoder Substrate for Solo Pretraining},
  author    = {Jarvis, Thomas},
  year      = {2026},
  month     = {June},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/tjarvis91/qovaryx-3b-scratch-base}
}

Status

Random-init checkpoint as of 2026-06-04, tagged v1.0 on publication. Future updates will add trained sibling repos (qovaryx-3b-finance-base, qovaryx-3b-instruct) once the first downstream training cycles complete. Watch the org page and the research devlog for new releases.
