Gemini Embedding: Generalizable Embeddings from Gemini • Paper • 2503.07891 • Published 8 days ago • 31
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers • Paper • 2503.00865 • Published 16 days ago • 58
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper • 2502.14786 • Published 26 days ago • 130
MLGym: A New Framework and Benchmark for Advancing AI Research Agents • Paper • 2502.14499 • Published 26 days ago • 182
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published 26 days ago • 97
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs • Paper • 2502.12982 • Published 28 days ago • 14
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper • 2502.08910 • Published Feb 13 • 143
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 • Paper • 2502.03544 • Published Feb 5 • 43
🧠 Reasoning datasets • Collection • Datasets with reasoning traces for math and code released by the community • 14 items • Updated 7 days ago • 108
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs • Paper • 2501.18585 • Published Jan 30 • 56