Cosmos3-Nano: NF4 4-bit Pre-Quantized Transformer

Pre-quantized NF4 (4-bit, double-quantized) version of NVIDIA's nvidia/Cosmos3-Nano omnimodal world model, created with bitsandbytes. Only the large Cosmos3OmniTransformer is quantized; the VAE and the text/sound tokenizers are bundled unchanged at bf16, so the repo is self-contained and fully drop-in.

This repo loads in seconds because no quantization pass runs at load time. Quantizing the bf16 original to NF4 on the fly takes minutes on every load; here that work is done once, ahead of time.
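
For readers unfamiliar with NF4, the following is a minimal pure-Python sketch of blockwise 4-bit quantization: each block is scaled by its absolute maximum, and every value is mapped to the nearest of 16 fixed levels. It is an illustration only, not the bitsandbytes kernel. The level values are rounded to four decimals from the QLoRA paper and should be checked against bitsandbytes before being relied on; the function names and the 8-element block are hypothetical.

```python
# Illustrative blockwise NF4 quantization (NOT the bitsandbytes implementation).
# Level values are rounded; names and block size are assumptions for this sketch.

NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def quantize_block(values):
    """Return (absmax, codes): one scale per block and one 4-bit code per value."""
    absmax = max(abs(v) for v in values) or 1.0  # an all-zero block gets scale 1.0
    codes = []
    for v in values:
        x = v / absmax  # normalize into [-1, 1]
        # Pick the index of the nearest NF4 level.
        codes.append(min(range(16), key=lambda i: abs(NF4_LEVELS[i] - x)))
    return absmax, codes


def dequantize_block(absmax, codes):
    """Map 4-bit codes back to approximate float values."""
    return [NF4_LEVELS[c] * absmax for c in codes]


block = [0.02, -0.5, 0.31, 0.0, -0.07, 0.12, 0.5, -0.25]
absmax, codes = quantize_block(block)
restored = dequantize_block(absmax, codes)
print(codes)
print([round(r, 4) for r in restored])
```

Double quantization applies the same idea a second time, to the per-block scales themselves, which is where the extra storage savings come from.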

Key Details

| Property | Value |
| --- | --- |
| Repo size | 11 GB (vs ~34 GB bf16) |
| Quantized component | transformer: 8.3 GB NF4 (vs ~32 GB bf16) |
| Quantization | NF4 (bitsandbytes), double quantization, bnb_4bit_compute_dtype=bfloat16 |
| Modes | text-to-image, text-to-video, image-to-video (+ optional sound) |
| Base params | 16B (omnimodal) |
| VRAM (loaded) | ~11 GB |
| Source weights | nvidia/Cosmos3-Nano (bf16) |
| Tested on | NVIDIA GB10 (DGX Spark) |
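
The transformer sizes above can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes the block sizes described in the QLoRA paper (64 weights per quantization block, 8-bit second-level scales in groups of 256); these are assumptions, not values read from this repo, and the helper name is hypothetical.

```python
# Rough storage estimate for a 16B-parameter model.
# Block sizes are assumed defaults, not values taken from this checkpoint.

def nf4_bits_per_param(block_size=64, double_quant=True, dq_block_size=256):
    """Effective bits per weight: 4-bit code plus amortized scale overhead."""
    if double_quant:
        # 8-bit scale per block, plus one fp32 scale per group of dq_block_size blocks.
        overhead = 8 / block_size + 32 / (block_size * dq_block_size)
    else:
        # One fp32 scale per block.
        overhead = 32 / block_size
    return 4 + overhead


PARAMS = 16e9  # "Base params" from the table

bf16_gb = PARAMS * 16 / 8 / 1e9
nf4_dq_gb = PARAMS * nf4_bits_per_param() / 8 / 1e9
nf4_plain_gb = PARAMS * nf4_bits_per_param(double_quant=False) / 8 / 1e9

print(f"bf16: {bf16_gb:.2f} GB")                        # bf16: 32.00 GB
print(f"NF4 + double quant: {nf4_dq_gb:.2f} GB")        # NF4 + double quant: 8.25 GB
print(f"NF4, no double quant: {nf4_plain_gb:.2f} GB")   # NF4, no double quant: 9.00 GB
```

The estimate of about 8.25 GB is close to the 8.3 GB listed; the remaining gap is plausibly due to layers that stay in bf16, though that has not been verified here.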

Usage

Requires a diffusers build with Cosmos 3 support (currently only available by installing from source) plus bitsandbytes. The NF4 config is embedded in the checkpoint, so do not pass a quantization_config, and do not call .to(dtype) on a 4-bit model.

pip install "git+https://github.com/huggingface/diffusers.git" bitsandbytes accelerate

import torch
from diffusers import Cosmos3OmniPipeline

pipe = Cosmos3OmniPipeline.from_pretrained(
    "SanDiegoDude/Cosmos3-Nano-nf4",
    torch_dtype=torch.bfloat16,
    enable_safety_checker=False,  # skips the optional cosmos_guardrail dependency
).to("cuda")

result = pipe("A small warehouse robot beside a blue box, clean studio lighting.")
frames = result.video[0]          # text-to-image returns a single frame
frames[0].save("out.png")
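
To confirm that a local copy of a checkpoint carries an embedded 4-bit config (and therefore needs no quantization_config at load time), one option is to inspect the component's config.json. This is a sketch under assumptions: it presumes the quantization settings are stored under a "quantization_config" key with a "load_in_4bit" flag, as diffusers and transformers typically serialize bitsandbytes settings. The helper names and the example path are hypothetical.

```python
import json
from pathlib import Path


def embedded_quant_config(config_path):
    """Return the 'quantization_config' dict from a config.json, or None if absent."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return config.get("quantization_config")


def is_prequantized_4bit(config_path):
    """True when the config declares an embedded 4-bit quantization (assumed key names)."""
    quant = embedded_quant_config(config_path)
    return bool(quant) and bool(quant.get("load_in_4bit"))


# Hypothetical usage against a local snapshot of the repo:
# is_prequantized_4bit("Cosmos3-Nano-nf4/transformer/config.json")
```

If the check returns False for a component, that component is stored unquantized, which is expected for the VAE and tokenizers in this repo.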

ComfyUI

A turnkey loader and T2I / T2V / I2V nodes are available in scg-Cosmos3. The loader auto-detects this pre-quantized layout and skips the re-quantization pass.

Related Repos

License

Released under NVIDIA's OpenMDW 1.1 License, inherited from the base model. Quantization only changes the weight encoding.
