Kimina Prover Preview Collection State-of-the-Art Models for Formal Mathematical Reasoning • 4 items • Updated 9 days ago • 26
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 230
🔍 Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 107 items • Updated 10 days ago • 99
SmartPlay : A Benchmark for LLMs as Intelligent Agents Paper • 2310.01557 • Published Oct 2, 2023 • 13
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning Paper • 2309.15091 • Published Sep 26, 2023 • 33
SCREWS: A Modular Framework for Reasoning with Revisions Paper • 2309.13075 • Published Sep 20, 2023 • 17
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 78