Introduction

Welcome to the 🤗 Course!

This course will teach you about natural language processing (NLP) using libraries from the Hugging Face ecosystem — 🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate — as well as the Hugging Face Hub. It’s completely free and without ads.

What to expect?

Here is a brief overview of the course:

  • Chapters 1 to 4 provide an introduction to the main concepts of the 🤗 Transformers library. By the end of this part of the course, you will be familiar with how Transformer models work and will know how to use a model from the Hugging Face Hub, fine-tune it on a dataset, and share your results on the Hub!
  • Chapters 5 to 8 teach the basics of 🤗 Datasets and 🤗 Tokenizers before diving into classic NLP tasks. By the end of this part, you will be able to tackle the most common NLP problems by yourself.
  • Chapters 9 to 12 dive even deeper, showcasing specialized architectures (memory efficiency, long sequences, etc.) and teaching you how to write custom objects for more exotic use cases. By the end of this part, you will be ready to solve complex NLP problems and make meaningful contributions to 🤗 Transformers.

This course:

After you’ve completed this course, we recommend checking out DeepLearning.AI’s Natural Language Processing Specialization, which covers a wide range of traditional NLP models like naive Bayes and LSTMs that are well worth knowing about!

Who are we?

About the authors:

Matthew Carrigan is a Machine Learning Engineer at Hugging Face. He lives in Dublin, Ireland, and previously worked as an ML engineer at Parse.ly and, before that, as a post-doctoral researcher at Trinity College Dublin. He does not believe we’re going to get to AGI by scaling existing architectures, but has high hopes for robot immortality regardless.

Lysandre Debut is a Machine Learning Engineer at Hugging Face and has been working on the 🤗 Transformers library since the very early development stages. His aim is to make NLP accessible for everyone by developing tools with a very simple API.

Sylvain Gugger is a Research Engineer at Hugging Face and one of the core maintainers of the 🤗 Transformers library. Previously he was a Research Scientist at fast.ai, and he co-wrote Deep Learning for Coders with fastai and PyTorch with Jeremy Howard. The main focus of his research is on making deep learning more accessible, by designing and improving techniques that allow models to train fast on limited resources.

Merve Noyan is a developer advocate at Hugging Face, working on developing tools and building content around them to democratize machine learning for everyone.

Lucile Saulnier is a machine learning engineer at Hugging Face, developing and supporting the use of open source tools. She is also actively involved in many research projects in the field of Natural Language Processing, such as collaborative training and BigScience.

Lewis Tunstall is a machine learning engineer at Hugging Face, focused on developing open-source tools and making them accessible to the wider community. He is also a co-author of an upcoming O’Reilly book on Transformers.

Leandro von Werra is a machine learning engineer in the open-source team at Hugging Face and also a co-author of an upcoming O’Reilly book on Transformers. He has several years of industry experience bringing NLP projects to production by working across the whole machine learning stack.

Are you ready to roll? In this chapter, you will learn:

  • How to use the pipeline() function to solve NLP tasks such as text generation and classification
  • About the Transformer architecture
  • How to distinguish between encoder, decoder, and encoder-decoder architectures and use cases