Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 4 days ago • 58
ZClip: Adaptive Spike Mitigation for LLM Pre-Training Paper • 2504.02507 • Published 13 days ago • 74
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published 16 days ago • 241
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published 15 days ago • 78
view article Article Guiding Text Generation with Constrained Beam Search in 🤗 Transformers Mar 11, 2022 • 11
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 29 days ago • 117
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7 • 117
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 228
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published 23 days ago • 114
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published Mar 7 • 76
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 103
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 142
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 190
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 224