-
  - LoRA: Low-Rank Adaptation of Large Language Models
    Paper • 2106.09685 • Published • 25
  - Attention Is All You Need
    Paper • 1706.03762 • Published • 36
  - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    Paper • 2305.18290 • Published • 38
  - Lost in the Middle: How Language Models Use Long Contexts
    Paper • 2307.03172 • Published • 33
Collections
Collections including paper arxiv:2305.18290
-
  - LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
    Paper • 2309.12307 • Published • 83
  - NEFTune: Noisy Embeddings Improve Instruction Finetuning
    Paper • 2310.05914 • Published • 13
  - SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
    Paper • 2312.15166 • Published • 55
  - Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
    Paper • 2401.03462 • Published • 25
-
  - SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
    Paper • 2312.15166 • Published • 55
  - Llama 2: Open Foundation and Fine-Tuned Chat Models
    Paper • 2307.09288 • Published • 235
  - LoRA: Low-Rank Adaptation of Large Language Models
    Paper • 2106.09685 • Published • 25
  - QLoRA: Efficient Finetuning of Quantized LLMs
    Paper • 2305.14314 • Published • 41
-
  - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    Paper • 2305.18290 • Published • 38
  - Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
    Paper • 2306.01693 • Published • 2
  - Self-Rewarding Language Models
    Paper • 2401.10020 • Published • 135
  - Secrets of RLHF in Large Language Models Part II: Reward Modeling
    Paper • 2401.06080 • Published • 23
-
  - Cognitive Architectures for Language Agents
    Paper • 2309.02427 • Published • 2
  - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    Paper • 2305.18290 • Published • 38
  - Orca 2: Teaching Small Language Models How to Reason
    Paper • 2311.11045 • Published • 69
  - Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
    Paper • 2311.00871 • Published • 2
-
  - RA-DIT: Retrieval-Augmented Dual Instruction Tuning
    Paper • 2310.01352 • Published • 6
  - Self-Consistency Improves Chain of Thought Reasoning in Language Models
    Paper • 2203.11171 • Published • 1
  - MemGPT: Towards LLMs as Operating Systems
    Paper • 2310.08560 • Published • 6
  - Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
    Paper • 2310.06117 • Published • 3
-
  - Zephyr: Direct Distillation of LM Alignment
    Paper • 2310.16944 • Published • 117
  - Exponentially Faster Language Modelling
    Paper • 2311.10770 • Published • 117
  - System 2 Attention (is something you might need too)
    Paper • 2311.11829 • Published • 38
  - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    Paper • 2305.18290 • Published • 38
-
  - Contrastive Chain-of-Thought Prompting
    Paper • 2311.09277 • Published • 31
  - Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Paper • 2201.11903 • Published • 8
  - Orca 2: Teaching Small Language Models How to Reason
    Paper • 2311.11045 • Published • 69
  - System 2 Attention (is something you might need too)
    Paper • 2311.11829 • Published • 38
-
  - Attention Is All You Need
    Paper • 1706.03762 • Published • 36
  - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    Paper • 1810.04805 • Published • 12
  - Universal Language Model Fine-tuning for Text Classification
    Paper • 1801.06146 • Published • 6
  - Language Models are Few-Shot Learners
    Paper • 2005.14165 • Published • 10