LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10 • 65
Instella ✨ Collection Announcing Instella, a series of 3 billion parameter language models developed by AMD, trained from scratch on 128 Instinct MI300X GPUs. • 5 items • Updated 27 days ago • 7
Temporal Consistency for LLM Reasoning Process Error Identification Paper • 2503.14495 • Published 14 days ago • 9
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era Paper • 2503.12329 • Published 17 days ago • 24
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 14 days ago • 110
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion Paper • 2503.16212 • Published 12 days ago • 22
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds Paper • 2503.10625 • Published 19 days ago • 28
One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation Paper • 2503.13358 • Published 15 days ago • 90
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts? Paper • 2503.18018 • Published 9 days ago • 6
Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering Paper • 2503.15879 • Published 13 days ago • 6
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models Paper • 2503.18923 • Published 8 days ago • 11
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published 8 days ago • 110
SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially? Paper • 2503.12349 • Published 17 days ago • 41
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 26 days ago • 88