DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting Paper • 2503.00784 • Published Mar 2 • 12
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning Paper • 2402.05808 • Published Feb 8, 2024
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 18 days ago • 607k • 1.33k
CritiQ: Mining Data Quality Criteria from Human Preferences Paper • 2502.19279 • Published Feb 26 • 9
Qwen2.5 Collection Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). • 46 items • Updated Feb 26 • 601
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments Paper • 2406.04151 • Published Jun 6, 2024 • 20