Collections
Collections including paper arxiv:2309.10668
Collection 1:
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 90
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 33
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 35

Collection 2:
- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
  Paper • 2311.00430 • Published • 53
- SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
  Paper • 2307.01952 • Published • 73
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 79
- Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  Paper • 2311.00871 • Published • 2

Collection 3:
- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  Paper • 2309.08532 • Published • 50
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 37
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 69
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 79

Collection 4:
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 77
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 57
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 79