SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling Paper • 2410.12481 • Published Oct 16, 2024
MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces Paper • 2502.07709 • Published Feb 11
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting Paper • 2410.19920 • Published Oct 25, 2024
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 3 days ago • 139
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 2.44k
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15, 2024 • 21
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Article • By qgallouedec and 3 others • Apr 22, 2024 • 80