- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 61
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 15
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 35
- Attention Is All You Need
  Paper • 1706.03762 • Published • 41
Collections including paper arxiv:2307.03172
- Attention Is All You Need
  Paper • 1706.03762 • Published • 41
- You Only Look Once: Unified, Real-Time Object Detection
  Paper • 1506.02640 • Published • 1
- HEp-2 Cell Image Classification with Deep Convolutional Neural Networks
  Paper • 1504.02531 • Published
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  Paper • 2401.05566 • Published • 25

- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 29
- Attention Is All You Need
  Paper • 1706.03762 • Published • 41
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 45
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 35

- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 35
- Efficient Estimation of Word Representations in Vector Space
  Paper • 1301.3781 • Published • 6
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 14
- Attention Is All You Need
  Paper • 1706.03762 • Published • 41

- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47

- Attention Is All You Need
  Paper • 1706.03762 • Published • 41
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 29
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 45
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 35

- TRAMS: Training-free Memory Selection for Long-range Language Modeling
  Paper • 2310.15494 • Published • 1
- A Long Way to Go: Investigating Length Correlations in RLHF
  Paper • 2310.03716 • Published • 9
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 65
- Giraffe: Adventures in Expanding Context Lengths in LLMs
  Paper • 2308.10882 • Published • 1

- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 82
- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 53
- Chain-of-Verification Reduces Hallucination in Large Language Models
  Paper • 2309.11495 • Published • 38
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 86