T5Gemma-b-b-ul2 Encoder + Tokenizer - GGUF (for sa3.cpp)

The shared text encoder + tokenizer for sa3.cpp. GGUF conversion of the frozen google/t5gemma-b-b-ul2 encoder (encoder-only at inference) that Stable Audio 3 uses to embed text prompts.

This component is identical across all three SA3 variants (medium, small-music, small-sfx), so it lives in its own repo and is fetched once; the per-variant conditioner ships separately in each model repo. Validated against the PyTorch reference at cosine similarity ~1.0.
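
For readers who want to repeat that kind of check, here is a minimal sketch of a cosine-similarity comparison between two embedding vectors. It is not the validation script used for this repo. The function name and the sample numbers are illustrative assumptions, and real encoder outputs would first be flattened to 1-D sequences of floats.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in data (assumption): in practice `reference` would hold the PyTorch
# encoder output and `converted` the GGUF encoder output for the same prompt.
reference = [0.12, -0.50, 0.33, 0.90]
converted = [x + 1e-6 for x in reference]

similarity = cosine_similarity(reference, converted)
print(f"cosine similarity: {similarity:.9f}")
```

A value very close to 1.0 means the two outputs point in the same direction. Cosine similarity ignores overall scale, so a maximum absolute difference is a useful companion check.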

Files

| component | file | notes |
| --- | --- | --- |
| text encoder | t5gemma-b-b-ul2-encoder-0.3B-v1.0-F32.gguf | encoder weights only (no conditioner) |
| tokenizer | t5gemma-b-b-ul2-v1.0-vocab.gguf | Gemma byte-fallback BPE |
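
After downloading, a quick sanity check is to read the fixed-size GGUF header of each file. The sketch below assumes a little-endian GGUF file of version 2 or later, where the header is the 4-byte magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The function name `read_gguf_header` is illustrative and is not part of sa3.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (KV count)


def read_gguf_header(path):
    """Return the version and counts from a GGUF header (v2+, little-endian assumed)."""
    with open(path, "rb") as f:
        head = f.read(HEADER_SIZE)
    if len(head) < HEADER_SIZE:
        raise ValueError(f"{path}: file too short to contain a GGUF header")
    if head[:4] != GGUF_MAGIC:
        raise ValueError(f"{path}: missing GGUF magic bytes")
    # "<IQQ": little-endian uint32, uint64, uint64 (no padding)
    version, tensor_count, kv_count = struct.unpack("<IQQ", head[4:HEADER_SIZE])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling `read_gguf_header("t5gemma-b-b-ul2-v1.0-vocab.gguf")` on a downloaded file should return a small dictionary. A `ValueError` suggests a truncated or mislabeled download.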

Pair these with any SA3 variant repo's DiT + SAME + conditioner: medium · small-music · small-sfx. tools/download_models.py fetches this repo automatically alongside whichever variant you pick.
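
If you would rather fetch the two files by hand, the sketch below uses the third-party `huggingface_hub` package. It is not a description of what tools/download_models.py does internally. The repo id is taken from this page's model tree heading, and the `local_dir` default is a hypothetical path. The directory layout sa3.cpp expects is not stated here.

```python
REPO_ID = "thepatch/t5gemma-b-b-ul2-GGUF"  # as shown on this model page
ENCODER_FILES = [
    "t5gemma-b-b-ul2-encoder-0.3B-v1.0-F32.gguf",  # text encoder weights
    "t5gemma-b-b-ul2-v1.0-vocab.gguf",             # tokenizer vocabulary
]


def fetch_encoder_files(local_dir="models/t5gemma"):
    """Download both GGUF files and return their local paths.

    `local_dir` is a hypothetical destination, not a path required by sa3.cpp.
    """
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir)
        for name in ENCODER_FILES
    ]
```

Calling `fetch_encoder_files()` requires network access and `pip install huggingface_hub`. Files already present in `local_dir` are generally reused rather than downloaded again.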

License

This is a format conversion of google/t5gemma-b-b-ul2, released under the Gemma Terms of Use (including the use restrictions in Section 3.2). Those terms carry over to this converted encoder + tokenizer.

Relationship to the original

A format conversion (weights → GGUF) for inference in sa3.cpp; no retraining. See sa3.cpp/docs/DISTRIBUTION.md.

Format: GGUF. Model size: 0.3B params. Architecture: sa3-t5gemma.