- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 145
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 114
- OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
  Paper • 2402.07456 • Published • 41
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 28
Collections
Collections including paper arxiv:2412.14689

- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
  Paper • 2412.14922 • Published • 80
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 39
- OpenAI o1 System Card
  Paper • 2412.16720 • Published • 28
- Revisiting In-Context Learning with Long Context Language Models
  Paper • 2412.16926 • Published • 23

- How to Synthesize Text Data without Model Collapse?
  Paper • 2412.14689 • Published • 48
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
  Paper • 2412.14161 • Published • 47
- Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
  Paper • 2412.14171 • Published • 23
- The Open Source Advantage in Large Language Models (LLMs)
  Paper • 2412.12004 • Published • 9

- MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
  Paper • 2412.14475 • Published • 52
- How to Synthesize Text Data without Model Collapse?
  Paper • 2412.14689 • Published • 48
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 39
- WavePulse: Real-time Content Analytics of Radio Livestreams
  Paper • 2412.17998 • Published • 9

- MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation
  Paper • 2412.07147 • Published • 5
- Grounding Descriptions in Images informs Zero-Shot Visual Recognition
  Paper • 2412.04429 • Published
- Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models
  Paper • 2412.05939 • Published • 13
- Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions
  Paper • 2412.08737 • Published • 52

- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
  Paper • 2411.11504 • Published • 19
- Top-nσ: Not All Logits Are You Need
  Paper • 2411.07641 • Published • 18
- Adaptive Decoding via Latent Preference Optimization
  Paper • 2411.09661 • Published • 10
- When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
  Paper • 2411.13476 • Published • 15