NousResearch/DeepHermes-3-Llama-3-8B-Preview Text Generation • Updated 2 days ago • 21.5k • 296
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 20 days ago • 398
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models Paper • 2501.11873 • Published Jan 21 • 63