Universal Language Model Fine-tuning for Text Classification Paper • 1801.06146 • Published Jan 18, 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Paper • 1810.04805 • Published Oct 11, 2018
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Paper • 2205.14135 • Published May 27, 2022
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing Paper • 1808.06226 • Published Aug 19, 2018