papers
Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation
Paper • 2606.18844 • Published 8 days ago • 15
APPO: Agentic Procedural Policy Optimization
Paper • 2606.12384 • Published 14 days ago • 77