BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding • arXiv:1810.04805 • Published Oct 11, 2018
Universal Language Model Fine-tuning for Text Classification • arXiv:1801.06146 • Published Jan 18, 2018
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness • arXiv:2205.14135 • Published May 27, 2022
Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs • arXiv:1603.09320 • Published Mar 30, 2016