YaRN: Efficient Context Window Extension of Large Language Models • Paper • 2309.00071 • Published Aug 31, 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens • Paper • 2307.02486 • Published Jul 5, 2023
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 • Paper • 2306.02707 • Published Jun 5, 2023