T5Gemma-b-b-ul2 Encoder + Tokenizer: GGUF (for sa3.cpp)
The shared text encoder + tokenizer for sa3.cpp. GGUF conversion of the frozen google/t5gemma-b-b-ul2 encoder (encoder-only at inference) that Stable Audio 3 uses to embed text prompts.
This component is identical across all three SA3 variants (medium, small-music, small-sfx), so it lives in its own repo and is fetched once; the per-variant conditioner ships separately in each model repo. Validated against the PyTorch reference at cosine similarity ~1.0.
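The card does not say how the cosine-similarity check was run. As a minimal sketch of that kind of comparison, assuming the encoder outputs from the PyTorch reference and from the GGUF build have each been flattened to a plain list of floats, it might look like the following. The `reference` and `converted` values are hypothetical stand-ins, not real encoder outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length flat vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for flattened encoder outputs (assumption:
# real outputs would come from the PyTorch model and from sa3.cpp).
reference = [0.12, -0.53, 0.98, 0.07]
converted = [0.12, -0.53, 0.98, 0.07000001]

similarity = cosine_similarity(reference, converted)
print(f"cosine similarity: {similarity:.6f}")
```

A value very close to 1.0 means the two outputs point in nearly the same direction. It does not by itself bound the element-wise error, so a maximum absolute difference is often reported alongside it.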
Files
| component | file | notes |
|---|---|---|
| text encoder | t5gemma-b-b-ul2-encoder-0.3B-v1.0-F32.gguf | encoder weights only (no conditioner) |
| tokenizer | t5gemma-b-b-ul2-v1.0-vocab.gguf | Gemma byte-fallback BPE |
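After downloading, a quick sanity check is to read the fixed-size GGUF header of each file. The sketch below is not part of sa3.cpp. It assumes the little-endian GGUF layout used by format versions 2 and later: a 4-byte `GGUF` magic, a `uint32` version, then `uint64` tensor and metadata key/value counts. The helper name `read_gguf_header` is hypothetical, and the demo writes a synthetic header to a temporary file so that it runs without the real downloads.

```python
import os
import struct
import tempfile

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the little-endian header layout of GGUF version 2 or later.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:HEADER_SIZE])
    return version, tensor_count, kv_count


# Demo on a synthetic header (version 3, 2 tensors, 5 metadata entries).
# These numbers are made up and say nothing about the real files.
with tempfile.NamedTemporaryFile(delete=False, suffix=".gguf") as tmp:
    tmp.write(GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5))
    demo_path = tmp.name

demo_header = read_gguf_header(demo_path)
os.remove(demo_path)
print(demo_header)
```

To check the real files, call `read_gguf_header` with the local path of each `.gguf` file from the table. A `ValueError` usually points to a truncated or mislabeled download.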
Pair these with any SA3 variant repo's DiT + SAME + conditioner:
medium · small-music · small-sfx.
tools/download_models.py fetches this repo automatically alongside whichever variant you pick.
License
This is a format conversion of google/t5gemma-b-b-ul2, released under the Gemma Terms of Use (including the use restrictions in Section 3.2). Those terms carry over to this converted encoder + tokenizer.
Relationship to the original
A format conversion (weights → GGUF) for inference in sa3.cpp, with no retraining. See sa3.cpp/docs/DISTRIBUTION.md.