Collections

Collections including paper arxiv:1810.04805

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 1

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19
nlpaueb/legal-bert-base-uncased

Fill-Mask • Updated Apr 28, 2022 • 716k • 259

Papers I Have Read

A list of papers that have moved off my reading list

Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on Hugging Face

Paper • 2401.13822 • Published Jan 24, 2024 • 1
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 69
HuggingFace's Transformers: State-of-the-art Natural Language Processing

Paper • 1910.03771 • Published Oct 9, 2019 • 19
Model Cards for Model Reporting

Paper • 1810.03993 • Published Oct 5, 2018 • 5

Anycoder 🏢: Redesign websites with modern layouts

Space • Running • MCP • 2.13k
Qwen2.5 Coder Artifacts 🐢: Generate application code with Qwen2.5-Coder-32B

Space • Running • 274
QwQ-32B-Preview 🔍

Space • Running • 921
Open LLM Leaderboard 🏆: Track, rank and evaluate open LLMs and chatbots

Space • Running on CPU Upgrade • 13.3k

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 69
Playing Atari with Deep Reinforcement Learning

Paper • 1312.5602 • Published Dec 19, 2013
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 14

FineTuning Papers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19
Deep Residual Learning for Image Recognition

Paper • 1512.03385 • Published Dec 10, 2015 • 6

CLEAR: Character Unlearning in Textual and Visual Modalities

Paper • 2410.18057 • Published Oct 23, 2024 • 210
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Paper • 2410.23090 • Published Oct 30, 2024 • 56
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published Oct 31, 2024 • 64
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published Nov 4, 2024 • 52

This is information about BERT

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 19
jjzha/skillspan

Viewer • Updated Sep 7, 2023 • 11.5k • 122 • 1
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents

Paper • 2502.05957 • Published Feb 9 • 16
