SMOTE: Synthetic Minority Over-sampling Technique Paper • 1106.1813 • Published Jun 9, 2011
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation Paper • 1406.1078 • Published Jun 3, 2014
Distributed Representations of Sentences and Documents Paper • 1405.4053 • Published May 16, 2014
Sequence to Sequence Learning with Neural Networks Paper • 1409.3215 • Published Sep 10, 2014
Neural Machine Translation by Jointly Learning to Align and Translate Paper • 1409.0473 • Published Sep 1, 2014
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Paper • 1804.07461 • Published Apr 20, 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Paper • 1810.04805 • Published Oct 11, 2018
RoBERTa: A Robustly Optimized BERT Pretraining Approach Paper • 1907.11692 • Published Jul 26, 2019
Energy and Policy Considerations for Deep Learning in NLP Paper • 1906.02243 • Published Jun 5, 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding Paper • 1906.08237 • Published Jun 19, 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter Paper • 1910.01108 • Published Oct 2, 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer Paper • 1910.10683 • Published Oct 23, 2019
AR-Net: A simple Auto-Regressive Neural Network for time-series Paper • 1911.12436 • Published Nov 27, 2019
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators Paper • 2003.10555 • Published Mar 23, 2020
SQuAD: 100,000+ Questions for Machine Comprehension of Text Paper • 1606.05250 • Published Jun 16, 2016
Mish: A Self Regularized Non-Monotonic Activation Function Paper • 1908.08681 • Published Aug 23, 2019
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Paper • 2101.00027 • Published Dec 31, 2020
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity Paper • 2101.03961 • Published Jan 11, 2021
Learning Transferable Visual Models From Natural Language Supervision Paper • 2103.00020 • Published Feb 26, 2021
LoRA: Low-Rank Adaptation of Large Language Models Paper • 2106.09685 • Published Jun 17, 2021
Evaluating Large Language Models Trained on Code Paper • 2107.03374 • Published Jul 7, 2021
NeuralProphet: Explainable Forecasting at Scale Paper • 2111.15397 • Published Nov 29, 2021
LLaMA: Open and Efficient Foundation Language Models Paper • 2302.13971 • Published Feb 27, 2023