101 Mixture-of-Depths: Dynamically allocating compute in transformer-based language models · 6 authors 5
46 Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models · 11 authors 9
19 ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline · 12 authors 2