Better & Faster Large Language Models via Multi-token Prediction • Paper • 2404.19737 • Published 29 days ago • 61
Saiga GGUF • Collection • LLaMA-based Russian chat model in the GGUF format, compatible with llama.cpp • 5 items • Updated Apr 19 • 12