Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published 19 days ago • 43
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published 19 days ago • 60
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated 4 days ago • 177
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 74
Josiefied and Abliterated Collection Abliterated, and further fine-tuned to be the most uncensored models available. • 11 items • Updated 7 days ago • 3
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 10 days ago • 274
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 20 items • Updated 4 days ago • 40
Mamba Collection Mamba is a new LLM architecture that integrates the Structured State Space sequence model to manage lengthy data sequences. • 11 items • Updated Oct 12 • 1
Transformers compatible Mamba Collection This release includes the `mamba` repositories compatible with the `transformers` library • 5 items • Updated Mar 6 • 36
Qwen2.5 Collection The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12 • 4
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 383
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 218