https://huggingface.co/RedHatAI/gemma-4-31B-it-speculator.eagle3

#2555

by Laetilia - opened 11 days ago

Support for EAGLE3 speculative decoding models was recently added to llama.cpp. There are quite a few of these EAGLE3 models, and I think it is reasonable to start GGUF'ing with this one, since it is directly mentioned in the merged PR and it targets a fresh, good non-MoE model (which, to my understanding, gets more speedup).

Thank you for the wonderful quantization work you do!

And in case you are curious to know more about these EAGLE3 things...
Here's the PR ~ https://github.com/ggml-org/llama.cpp/pull/18039
And here's a reddit discussion (I am happy that the old design of reddit is still an option) ~ https://old.reddit.com/r/LocalLLaMA/comments/1u3on4u/eagle3_has_landed_in_llamacpp/

RichardErkhov

11 days ago

It's queued! I'm not a public Reddit user, but I updated llama.cpp, and if it works you can share the link for our quant =)

You can check for progress at http://hf.tst.eu/status.html or regularly check the model summary page at https://hf.tst.eu/model#gemma-4-31B-it-speculator.eagle3-GGUF for quants to appear.
