Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 4 days ago • 89
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 3 days ago • 240
Article HuggingFace, IISc partner to supercharge model building on India's diverse languages 16 days ago • 14
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality 11 days ago • 65
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 8 days ago • 79
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published 7 days ago • 72
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 70
Article Illustrating Reinforcement Learning from Human Feedback (RLHF) Dec 9, 2022 • 198
SimpleRL Collection The collection for the Project "Simple Reinforcement Learning for Reasoning" • 2 items • Updated 24 days ago • 5
CodeI/O Collection Collection for CodeI/O @ https://codei-o.github.io/ • 15 items • Updated 30 days ago • 6
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10 • 76
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 Paper • 2502.03544 • Published Feb 5 • 43
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 124
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published Feb 10 • 60
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10 • 142