You can now Run DiffusionGemma in Unsloth Studio UI ✨

#3
by danielhanchen - opened
Unsloth AI org

Hey guys, you can now run and train DiffusionGemma in Unsloth Studio. GitHub
Recommended inference settings are automatically set. Guide

Example of DiffusionGemma (4-bit GGUF) running in Unsloth Studio with executable code:

[Image: DiffusionGemma in Unsloth Studio]
danielhanchen pinned discussion

I have a 5090, and after loading the DiffusionGemma (Q4) model, I only have a max context window of 8192 tokens. My GPU isn't even full; it's sitting at 28/32 GB used. In addition, I have 128 GB of RAM which is not being used. I'm not sure what is happening, but every time I try to set the context beyond 8k and reset, it reverts back to 8k.

I'm on a 5090: https://gist.github.com/johndpope/5aa9af3edd22cbe3dd28ab2719081916. If you just want to play, you can use vLLM + Docker; this works on the 5090. On a 3090, I couldn't get it working...
(It will fill up your local HF cache, which you can see if you monitor it. The first boot takes a very long time while the model is downloaded and cached.)
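
For anyone who wants to watch the cache growth mentioned above during that first boot, here is a minimal sketch in plain Python. It assumes the default Hugging Face hub cache location (`~/.cache/huggingface/hub`, or whatever `HF_HUB_CACHE` / `HF_HOME` point to) and that the Docker container mounts the host cache directory, which may not match the gist's setup. The helper names `hf_hub_cache_dir` and `dir_size_bytes` are made up for this example, and the sketch ignores `XDG_CACHE_HOME` for simplicity.

```python
import os
from pathlib import Path


def hf_hub_cache_dir() -> Path:
    """Resolve the Hugging Face hub cache directory from the environment.

    Assumption: only HF_HUB_CACHE and HF_HOME are considered here;
    otherwise the default ~/.cache/huggingface/hub is used.
    """
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    default_home = Path.home() / ".cache" / "huggingface"
    hf_home = os.environ.get("HF_HOME", str(default_home))
    return Path(hf_home) / "hub"


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of regular files under root.

    Symlinks are skipped so that files linked from snapshot folders
    are not counted a second time.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    cache = hf_hub_cache_dir()
    size_gib = dir_size_bytes(cache) / 1024**3
    print(f"{cache}: {size_gib:.2f} GiB")
```

Running it a few times while the model downloads should show the number climbing; once it stops changing, the weights are most likely fully cached and later boots should be faster.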
