Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 1 day ago • 41
NousResearch/DeepHermes-3-Llama-3-3B-Preview Text Generation • Updated about 22 hours ago • 769 • 13
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 2 days ago • 232
Q-Filters Collection Pre-computed Q-Filters for efficient KV cache compression. • 15 items • Updated 11 days ago • 6