Can GRPO Boost Complex Multimodal Table Understanding? Paper • 2509.16889 • Published Sep 21, 2025
Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning Paper • 2606.13106 • Published 14 days ago