How far can we go with ImageNet for Text-to-Image generation? Paper • 2502.21318 • Published 10 days ago • 25
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published 26 days ago • 54
FoNE: Precise Single-Token Number Embeddings via Fourier Features Paper • 2502.09741 • Published 24 days ago • 11
Aira Collection Aira is a series of chatbots developed as an experimentation playground for value alignment. • 27 items • Updated Jun 20, 2024 • 1
Loxa Collection Loxa family models are the best models for running on CPU and GPU with high quality (≥92% accuracy) • 5 items • Updated Feb 3 • 2
Quadrifoglio 🍀 Collection Small text2text models finetuned on Italian machine translation tasks. • 6 items • Updated Jan 12 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 134
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 53
FluidML: Fast and Memory Efficient Inference Optimization Paper • 2411.09242 • Published Nov 14, 2024 • 1
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 59
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 18 days ago • 245
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 574