Reinforced Self-Training (ReST) for Language Modeling • Paper • 2308.08998 • Published Aug 17, 2023
Efficiently Modeling Long Sequences with Structured State Spaces • Paper • 2111.00396 • Published Oct 31, 2021
Hyena Hierarchy: Towards Larger Convolutional Language Models • Paper • 2302.10866 • Published Feb 21, 2023
Retentive Network: A Successor to Transformer for Large Language Models • Paper • 2307.08621 • Published Jul 17, 2023
Improving language models by retrieving from trillions of tokens • Paper • 2112.04426 • Published Dec 8, 2021
Unlimiformer: Long-Range Transformers with Unlimited Length Input • Paper • 2305.01625 • Published May 2, 2023