ChessGPT: Bridging Policy Learning and Language Modeling — arXiv:2306.09200 (Jun 15, 2023)
LongNet: Scaling Transformers to 1,000,000,000 Tokens — arXiv:2307.02486 (Jul 5, 2023)
Secrets of RLHF in Large Language Models Part I: PPO — arXiv:2307.04964 (Jul 11, 2023)
Retentive Network: A Successor to Transformer for Large Language Models — arXiv:2307.08621 (Jul 17, 2023)
Llama 2: Open Foundation and Fine-Tuned Chat Models — arXiv:2307.09288 (Jul 18, 2023)
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback — arXiv:2307.15217 (Jul 27, 2023)
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs — arXiv:2307.16789 (Jul 31, 2023)
Agents: An Open-source Framework for Autonomous Language Agents — arXiv:2309.07870 (Sep 14, 2023)
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers — arXiv:2309.08532 (Sep 15, 2023)
PDFTriage: Question Answering over Long, Structured Documents — arXiv:2309.08872 (Sep 16, 2023)
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions — arXiv:2309.10150 (Sep 18, 2023)
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models — arXiv:2309.12307 (Sep 21, 2023)
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models — arXiv:2309.14717 (Sep 26, 2023)
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models — arXiv:2310.08659 (Oct 12, 2023)
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only — arXiv:2306.01116 (Jun 1, 2023)
RoFormer: Enhanced Transformer with Rotary Position Embedding — arXiv:2104.09864 (Apr 20, 2021)
The Pile: An 800GB Dataset of Diverse Text for Language Modeling — arXiv:2101.00027 (Dec 31, 2020)
Fast Transformer Decoding: One Write-Head is All You Need — arXiv:1911.02150 (Nov 6, 2019)