Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:1801.06146

Influential papers

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 35
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 11
Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 10

Most influential papers

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 35
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 11
Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 10

Literature review on transformer architecture and what followed.

Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 11
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Paper • 2205.14135 • Published May 27, 2022 • 8
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

Paper • 1808.06226 • Published Aug 19, 2018 • 1

Transformer Arch

Checkout: https://bbycroft.net/llm and http://nlp.seas.harvard.edu/2018/04/03/attention.html

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 35
ImageNet Large Scale Visual Recognition Challenge

Paper • 1409.0575 • Published Sep 1, 2014 • 6
Sequence to Sequence Learning with Neural Networks

Paper • 1409.3215 • Published Sep 10, 2014 • 3
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 10

language-models

Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 43
BloombergGPT: A Large Language Model for Finance

Paper • 2303.17564 • Published Mar 30, 2023 • 16
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 11
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 11

Interesting AI papers

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 35
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 11
Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 10

Most influential papers in AI

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 35
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 11
Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 10

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs