Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 101
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 175
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12, 2024 • 58
Meta Llama 3 Collection This collection hosts the Transformers-format and original repositories for the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18, 2024 • 549
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9, 2024 • 62
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29, 2024 • 50
StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows Paper • 2403.11322 • Published Mar 17, 2024 • 1
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 73
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting Paper • 2309.04269 • Published Sep 8, 2023 • 28
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning Paper • 2307.08691 • Published Jul 17, 2023 • 6
Meta-Transformer: A Unified Framework for Multimodal Learning Paper • 2307.10802 • Published Jul 20, 2023 • 40
Challenges and Applications of Large Language Models Paper • 2307.10169 • Published Jul 19, 2023 • 46
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Paper • 2205.14135 • Published May 27, 2022 • 8
System 2 Attention (is something you might need too) Paper • 2311.11829 • Published Nov 20, 2023 • 38
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Paper • 2201.11903 • Published Jan 28, 2022 • 7
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks Paper • 1901.00032 • Published Dec 31, 2018 • 1
SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking Paper • 2107.05720 • Published Jul 12, 2021 • 1
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 37
Prefix-Tuning: Optimizing Continuous Prompts for Generation Paper • 2101.00190 • Published Jan 1, 2021 • 3
Memory-assisted prompt editing to improve GPT-3 after deployment Paper • 2201.06009 • Published Jan 16, 2022 • 1
RAGAS: Automated Evaluation of Retrieval Augmented Generation Paper • 2309.15217 • Published Sep 26, 2023 • 3
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7, 2024 • 43
MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning Paper • 2205.00445 • Published May 1, 2022 • 1
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? Paper • 2202.12837 • Published Feb 25, 2022 • 1
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 567