PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters Paper • 2504.08791 • Published 9 days ago • 106
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 4 days ago • 59
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond Paper • 2503.21614 • Published 20 days ago • 39
TxGemma Release Collection Collection of open models to accelerate the development of therapeutics. • 5 items • Updated 13 days ago • 46
Running • 68 • LLM Embeddings Explained: A Visual and Intuitive Guide 🚀 How Language Models Turn Text into Meaning, From Traditional
mlx-community/QwQ-DeepSeek-R1-SkyT1-Flash-Lightest-32B-mlx-4Bit Text Generation • Updated Mar 14 • 936 • 8