Stop Regressing: Training Value Functions via Classification for Scalable Deep RL Paper • 2403.03950 • Published Mar 6, 2024 • 11
GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models Paper • 2306.13649 • Published Jun 23, 2023 • 10