Ornstein 3.5 9B — V2

Ornstein 3.5 9B — V2 · NVFP4

NVFP4 4-bit quantization (Blackwell-optimized) with an FP8 KV cache, produced by post-training quantization (PTQ) with NVIDIA TensorRT Model Optimizer from GestaltLabs/Ornstein-3.5-9B-V2, the reinforcement-learning post-trained (V2) release of Ornstein 3.5 9B. This checkpoint contains text weights only; pair it with the full base model for the vision tower.

Usage

These ModelOpt checkpoints carry an hf_quant_config.json and are intended for ModelOpt-aware runtimes (TensorRT-LLM or vLLM). Pass the checkpoint directory, or the Hub model id, to the matching backend as you would for any standard HF model.
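
Before pointing a runtime at a downloaded checkpoint, it can help to confirm what hf_quant_config.json actually declares. The sketch below is a minimal, standard-library-only helper. It assumes the file holds a top-level "quantization" object with "quant_algo" and "kv_cache_quant_algo" keys; that layout is an assumption, not something stated in this card, and the function name read_quant_summary is hypothetical. Missing keys are returned as None rather than raising.

```python
import json
from pathlib import Path


def read_quant_summary(checkpoint_dir):
    """Return the quantization fields recorded in hf_quant_config.json.

    Assumes (not confirmed by the model card) that the file contains a
    top-level "quantization" object. Absent keys are reported as None.
    """
    config_path = Path(checkpoint_dir) / "hf_quant_config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)

    # Fall back to an empty dict so a differently shaped file does not raise.
    quant = config.get("quantization", {})
    return {
        "quant_algo": quant.get("quant_algo"),
        "kv_cache_quant_algo": quant.get("kv_cache_quant_algo"),
    }
```

Calling read_quant_summary on the local checkpoint directory returns a small dict that can be compared against the card's description (NVFP4 weights, FP8 KV cache) before choosing a backend. If both values come back as None, the file most likely uses a different layout than the one assumed here.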

Support This Work

I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. All training compute is self-funded — balancing GPU costs against a student budget. If my uploads have been useful to you, consider buying a PhD student a coffee. It goes a long way toward keeping these experiments running.

Support on Ko-fi

License

Apache 2.0 — inherited from the Qwen 3.5 9B base release.

Safetensors · Model size: 6B params · Tensor types: BF16, F8_E4M3, U8