In Chapter 3, you saw how to fine-tune a model for text classification. In this chapter, we will tackle the following common NLP tasks:

  • Token classification
  • Masked language modeling (like BERT)
  • Summarization
  • Translation
  • Causal language modeling pretraining (like GPT-2)
  • Question answering

To do this, you’ll need everything you learned about training models with the Keras API in Chapter 3, the 🤗 Datasets library in Chapter 5, and the 🤗 Tokenizers library in Chapter 6. We’ll also upload our results to the Model Hub, as we did in Chapter 4, so this is really the chapter where everything comes together!

Each section can be read independently.

If you read the sections in sequence, you will notice that they share quite a bit of code and prose. The repetition is intentional: it lets you dip into (or come back later to) any task that interests you and find a complete working example.