Collections

Discover the best community collections!

Collections trending this week
TokenButler
TokenButler -- Predict token importance for all heads across the transformer in the first layer itself. Enable fine-grained token sparsity!
Instella ✨
Announcing Instella, a series of 3 billion parameter language models developed by AMD, trained from scratch on 128 Instinct MI300X GPUs.