OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens • Paper • 2504.07096 • Published 14 days ago • 73
Efficient Model Development through Fine-tuning Transfer • Paper • 2503.20110 • Published 29 days ago • 4
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback • Paper • 2503.22230 • Published 27 days ago • 43
Efficient Inference for Large Reasoning Models: A Survey • Paper • 2503.23077 • Published 25 days ago • 46
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models • Paper • 2503.24235 • Published 23 days ago • 53
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model • Paper • 2503.24290 • Published 23 days ago • 62
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning • Paper • 2503.16252 • Published Mar 20 • 27
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated 15 days ago • 622k • 1.32k
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training • Paper • 2501.11425 • Published Jan 20 • 105