Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset Paper • 2403.09029 • Published Mar 14 • 52
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Paper • 2403.12968 • Published Mar 19 • 20
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 54
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13 • 48
Language models scale reliably with over-training and on downstream tasks Paper • 2403.08540 • Published Mar 13 • 13
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8 • 49
Larimar: Large Language Models with Episodic Memory Control Paper • 2403.11901 • Published Mar 18 • 30
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Paper • 2403.14624 • Published Mar 21 • 50
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions Paper • 2403.15246 • Published Mar 22 • 8
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text Paper • 2403.18421 • Published Mar 27 • 20
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models Paper • 2404.03543 • Published Apr 4 • 15
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Paper • 2404.02575 • Published Apr 3 • 46
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 2024 • 49
Flamingo: a Visual Language Model for Few-Shot Learning Paper • 2204.14198 • Published Apr 29, 2022 • 13