Kuwain 1.5B: An Arabic SLM via Language Injection Paper • 2504.15120 • Published 2 days ago • 89 • 3
Could Thinking Multilingually Empower LLM Reasoning? Paper • 2504.11833 • Published 7 days ago • 25
FlowReasoner: Reinforcing Query-Level Meta-Agents Paper • 2504.15257 • Published 2 days ago • 41
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Paper • 2504.13055 • Published 6 days ago • 18
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Paper • 2504.13055 • Published 6 days ago • 18 • 2
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types Paper • 2412.11757 • Published Dec 16, 2024
Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published 9 days ago • 13
Active PRM Collection Efficient Process Reward Model Training via Active Learning. • 4 items • Updated 8 days ago • 3
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published 28 days ago • 45
Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published 9 days ago • 13 • 2