ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent Paper • 2312.10003 • Published Dec 15, 2023 • 36
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision Paper • 2312.09390 • Published Dec 14, 2023 • 32
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 47