Q-DocCite-50M-Sovereign: Document citation (extract facts with page anchors)

Find the fact. Cite the page. Refuse when the evidence isn't there.

What this model does, in one sentence

Given a documentation excerpt and a question, returns the answer with an inline citation tag like [page 24] pointing to the source page. Refuses to answer when the evidence is not in the provided excerpt.

Honest performance

  • Task: document citation
  • Metric: citation (predicted answer contains gold content AND gold citation tag)
  • Holdout: n=60 rows, never seen in training, scored row-by-row
  • Score: 1.000 mean (100.0%)
  • Bootstrap 95% CI lower bound: 1.000
  • Gate threshold: 0.90
  • Verdict: PASS at point estimate AND at bootstrap CI lower bound
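
The metric and the bootstrap lower bound listed above can be sketched as follows. This is a minimal illustration, not the Qovaryx eval harness: the names `citation_hit` and `bootstrap_lower_bound`, the case-insensitive substring rule, the resample count, and the percentile method are all assumptions.

```python
import random

def citation_hit(prediction: str, gold_content: str, gold_tag: str) -> float:
    """Return 1.0 if the prediction contains both the gold content and the gold tag."""
    # Assumption: "contains" means a case-insensitive substring match.
    pred = prediction.lower()
    return float(gold_content.lower() in pred and gold_tag.lower() in pred)

def bootstrap_lower_bound(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Lower end of a two-sided (1 - alpha) percentile-bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample rows with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    return means[int((alpha / 2) * n_resamples)]

# Sixty rows that all score 1.0, mirroring the holdout result reported above.
scores = [citation_hit("Free cash flow was $30B. [page 18]", "$30B", "[page 18]")] * 60
print(sum(scores) / len(scores), bootstrap_lower_bound(scores))
```

When every row scores 1.0, every resample also has mean 1.0, so the lower bound coincides with the point estimate.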

What it's used for: real workflows

  • Internal documentation Q&A: Wrap a Q-DocCite call around your docs retrieval layer. Every answer carries a verifiable [page N] tag. No more 'where did the bot get that?'
  • Annual report fact extraction: Feed in pages of a 10-K or annual report and ask for the revenue, the headcount, the segment breakdown. Every fact comes with a page anchor your auditor can check.
  • Compliance review evidence: Policy or SOP citation. Ask "What is the data-retention window per page 47?" and get the answer and the citation, side by side.
  • Refuse-when-no-evidence pattern: If the excerpt doesn't contain the fact, Q-DocCite says so explicitly instead of hallucinating a number. That's the actual hard part.
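
Because every answer carries a [page N] tag, a caller can check mechanically that the cited page appears in the excerpt it supplied. A minimal sketch, assuming tags always have the exact form `[page N]` used in this card; `cited_pages` and `citation_is_grounded` are hypothetical helper names, not part of the release.

```python
import re

# Matches tags of the form "[page 18]" and captures the page number.
PAGE_TAG = re.compile(r"\[page (\d+)\]")

def cited_pages(text: str) -> set[int]:
    """Collect every page number that appears in a [page N] tag."""
    return {int(n) for n in PAGE_TAG.findall(text)}

def citation_is_grounded(excerpt: str, answer: str) -> bool:
    """True if the answer cites at least one page and every cited page is in the excerpt."""
    cited = cited_pages(answer)
    return bool(cited) and cited <= cited_pages(excerpt)

excerpt = "[page 18] Free cash flow was $30B."
print(citation_is_grounded(excerpt, "Free cash flow was $30B. [page 18]"))  # True
print(citation_is_grounded(excerpt, "Free cash flow was $30B. [page 24]"))  # False
```

An answer with no tag at all returns False here. The card does not specify the wording of a refusal, so a caller would need to handle refusals separately from ungrounded citations.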

What problem this actually solves

RAG pipelines that emit fluent answers without a verifiable citation create real legal and audit risk. Q-DocCite trains the citation tag as a first-class output. The page anchor isn't decoration; it's the contract. If the model can't find the fact in the excerpt, the trained behavior is to refuse. That's the hardest pattern to teach a general LM, and the audit shows we got it.

Integration paths

  • Step in a RAG pipeline: Replace the 'answer + cite' step in your existing retrieval flow. Q-DocCite consumes the retrieved excerpts and emits the cited answer.
  • Q-Office-Suite runtime: POST /run/q-doccite, bundling multiple specialists behind one binary.
  • On-device legal/medical review: CPU-only inference keeps sensitive documents inside the org perimeter.
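
For the RAG step, retrieved pages have to be rendered into the prompt layout this card uses in its example: an `Excerpt:` block of `[page N]` lines, a blank line, then a `Q:` line. A minimal sketch; `build_prompt` and its `(page, text)` tuple input are hypothetical, and since the card only shows a single-page excerpt, putting one line per page for multi-page input is an assumption.

```python
def build_prompt(pages: list[tuple[int, str]], question: str) -> str:
    """Render retrieved (page_number, text) pairs and a question into one prompt string."""
    # Collapse internal whitespace so each page occupies exactly one line.
    lines = [f"[page {num}] {' '.join(text.split())}" for num, text in pages]
    return "Excerpt:\n" + "\n".join(lines) + f"\n\nQ: {question.strip()}"

prompt = build_prompt([(18, "Free cash flow was $30B.")], "What was free cash flow?")
print(prompt)
```

For the single-page case this reproduces the prompt string used in the loading example on this card.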

Example

Input:

Excerpt:
[page 18] Free cash flow was $30B.

Q: What was free cash flow?

Output:

Free cash flow was $30B. [page 18]

What this is NOT

  • Not a general-purpose chatbot. This head does one job and does it consistently. Free-text generation outside the trained task surface will degrade.
  • Not a replacement for a verifier. This is one component in the Qovaryx cluster-shell architecture. The decision-acceptance discipline lives in the wrapper, not in the head.
  • Not reproducible from this card. Weights and audit are public; the crystal corpus, eval gate constants, and training hyperparameters are not.

Proprietary Qovaryx technology: built on our own scratch base

This is a 53.5M-parameter sovereign specialist in the Qovaryx Compact Specialist Suite. It is full-fine-tuned from tjarvis91/qovaryx-50m-scratch-base, our own scratch-trained base, not a borrowed foundation model.

  • Base: Qovaryx 50M scratch base. Pretrained from random initialization on 491.5M tokens. Not SmolLM2. Not Qwen. Not Llama. Not Mistral. Not Phi. No HuggingFace foundation. No closed-source weights. Every parameter traces back to a Qovaryx training run on Qovaryx hardware.
  • Tokenizer: Qovaryx english_v1 BPE (vocab 32000), built in-house against our own pretraining corpus.
  • Architecture: Qovaryx FinanceDecoder with 12 decoder blocks, GQA, RoPE, SwiGLU FFN, RMSNorm, MTP heads, decision head.
  • Recipe: Qovaryx crystallization discipline (train the law before replaying the noise).
  • Runs on CPU. No GPU required at inference.

Architecture (Qovaryx proprietary)

  • 53.5M parameters
  • 12 decoder blocks, d_model=512, n_head=8, GQA n_kv_head=2
  • SwiGLU FFN, RoPE positional, RMSNorm
  • Multi-token prediction (MTP) auxiliary heads
  • Decision head for routed-decision tasks
  • Tokenizer: Qovaryx english_v1 BPE, vocab 32000 (in-house build)
  • Initialized from qovaryx-50m-scratch-base at step 60000 (491.5M pretraining tokens)
  • Full fine-tune (no LoRA, no QLoRA, no adapter): every parameter was updated on the Qovaryx crystal corpus for this specialist
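
The attention shapes implied by these numbers can be worked out directly. The sketch below uses only the values listed above (d_model=512, n_head=8, n_kv_head=2, vocab 32000). It assumes bias-free projections and the conventional head_dim = d_model / n_head, neither of which the card states, and it does not attempt a full parameter count because the FFN width is not given.

```python
d_model, n_head, n_kv_head, vocab = 512, 8, 2, 32000

head_dim = d_model // n_head      # width of each attention head
kv_dim = n_kv_head * head_dim     # total width of the shared K (or V) projection
q_per_kv = n_head // n_kv_head    # query heads sharing each K/V head

# GQA: Q and output projections are d_model x d_model; K and V are d_model x kv_dim.
attn_params_gqa = 2 * d_model * d_model + 2 * d_model * kv_dim
# Full multi-head attention for comparison: four d_model x d_model projections.
attn_params_mha = 4 * d_model * d_model
# Token embedding table.
embedding_params = vocab * d_model

print(head_dim, kv_dim, q_per_kv)        # 64 128 4
print(attn_params_gqa, attn_params_mha)  # 655360 1048576
print(embedding_params)                  # 16384000
```

Under these assumptions, GQA with 2 KV heads stores a quarter of the keys and values that full multi-head attention would, which plausibly helps memory use during CPU inference.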

How to load it (Python)

import torch
from tokenizers import Tokenizer
from bleeding_edge.model.decoder import FinanceDecoder, DecoderConfig

# Load the tokenizer and checkpoint shipped with this release.
tok = Tokenizer.from_file("tokenizer.json")
# weights_only=False unpickles arbitrary objects; only load checkpoints you trust.
ckpt = torch.load("pytorch_model.pt", map_location="cpu", weights_only=False)

# Keep only the config keys that DecoderConfig actually defines.
cfg = DecoderConfig(**{k: v for k, v in ckpt["model_cfg"].items() if k in DecoderConfig.__dataclass_fields__})
cfg.vocab_size = tok.get_vocab_size()
model = FinanceDecoder(cfg).eval()

# Strip the "_orig_mod." prefix that torch.compile adds to parameter names.
state = {k.removeprefix("_orig_mod."): v for k, v in ckpt["model_state"].items()}
model.load_state_dict(state, strict=False)

prompt = "Excerpt:\n[page 18] Free cash flow was $30B.\n\nQ: What was free cash flow?"
ids = tok.encode(prompt).ids
cur = torch.tensor([ids], dtype=torch.long)

# Greedy decoding for up to 120 new tokens; token id 0 is treated as end-of-sequence.
with torch.no_grad():
    for _ in range(120):
        logits = model(cur, return_decision=False).logits[:, -1, :]
        nxt = int(torch.argmax(logits, dim=-1))
        if nxt == 0:
            break
        cur = torch.cat([cur, torch.tensor([[nxt]])], dim=1)

# Decode only the generated continuation, not the prompt.
print(tok.decode(cur[0].tolist()[len(ids):]))

License & posture

Apache 2.0 for the published weights, model card, and example code.

The Qovaryx scratch base build pipeline, the crystallization corpus, the eval gate constants, the cluster routing policy, and the protected runtime entrypoint are Qovaryx proprietary technology and are not included in this release. Same posture as every previous Qovaryx public release: ship the weights and the audit, not the recipe.

Sibling specialists in the Qovaryx Q-Office-Suite

All nine specialists share the qovaryx-50m-scratch-base and the same audit discipline. Use one directly; use all nine through the cluster shell.

Watermark

This release carries a SHA256 issue fingerprint inside release.json for tamper-detection and attribution.
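
A generic way to compute the SHA256 digest of a downloaded file is sketched below. The layout of release.json and which bytes the issue fingerprint covers are not documented in this card, so this shows only the hashing step. `sha256_of_file` is a hypothetical helper, and comparing its output against a value in release.json is an assumption about how the fingerprint might be used.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("pytorch_model.pt")` would return a 64-character hex string suitable for comparison against a published digest.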

Community & support

If you find a failure mode this card doesn't cover, open a discussion on this repo or come to the Discord. That's how the next crystal corpus gets written.
