Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 20 days ago • 61
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29 • 46
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers Paper • 2311.09180 • Published Nov 15, 2023 • 7
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster Paper • 2311.08263 • Published Nov 14, 2023 • 14
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion Paper • 2310.03502 • Published Oct 5, 2023 • 74
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation Paper • 2308.07286 • Published Aug 14, 2023 • 5
Platypus: Quick, Cheap, and Powerful Refinement of LLMs Paper • 2308.07317 • Published Aug 14, 2023 • 22
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models Paper • 2308.06721 • Published Aug 13, 2023 • 24
OctoPack: Instruction Tuning Code Large Language Models Paper • 2308.07124 • Published Aug 14, 2023 • 27
Ambient Adventures: Teaching ChatGPT on Developing Complex Stories Paper • 2308.01734 • Published Aug 3, 2023 • 6
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models Paper • 2308.01390 • Published Aug 2, 2023 • 30
LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance Paper • 2307.00522 • Published Jul 2, 2023 • 27
Guiding Language Models of Code with Global Context using Monitors Paper • 2306.10763 • Published Jun 19, 2023 • 7
GLIMMER: generalized late-interaction memory reranker Paper • 2306.10231 • Published Jun 17, 2023 • 7
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing Paper • 2306.12929 • Published Jun 22, 2023 • 11
Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference Paper • 2306.12509 • Published Jun 21, 2023 • 14
From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought Paper • 2306.12672 • Published Jun 22, 2023 • 24
GPT-Calls: Enhancing Call Segmentation and Tagging by Generating Synthetic Conversations via Large Language Models Paper • 2306.07941 • Published Jun 9, 2023 • 3