Large Language Monkeys: Scaling Inference Compute with Repeated Sampling Paper • 2407.21787 • Published Jul 31 • 12
Solving math word problems with process- and outcome-based feedback Paper • 2211.14275 • Published Nov 25, 2022 • 7
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6 • 51
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 13 days ago • 76
EXAONE-3.5 Collection EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B. • 10 items • Updated 17 days ago • 81
Critical Tokens Matter: Token-Level Contrastive Estimation Enhence LLM's Reasoning Capability Paper • 2411.19943 • Published 27 days ago • 55
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published about 1 month ago • 47
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published Nov 21 • 58
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15 • 68
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19 • 47
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7 • 111
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 126
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse Paper • 2410.21333 • Published Oct 27 • 10
Counting Ability of Large Language Models and Impact of Tokenization Paper • 2410.19730 • Published Oct 25 • 10
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published Oct 18 • 36