NLP Course documentation

Introduction

Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started

Introduction

Welcome to the Hugging Face course! This introduction will guide you through setting up a working environment. If you’re just starting the course, we recommend you first take a look at Chapter 1, then come back and set up your environment so you can try the code yourself.

All the libraries that we’ll be using in this course are available as Python packages, so here we’ll show you how to set up a Python environment and install the specific libraries you’ll need.

We’ll cover two ways of setting up your working environment, using a Colab notebook or a Python virtual environment. Feel free to choose the one that resonates with you the most. For beginners, we strongly recommend that you get started by using a Colab notebook.

Note that we will not be covering the Windows system. If you’re running on Windows, we recommend following along using a Colab notebook. If you’re using a Linux distribution or macOS, you can use either approach described here.

Most of the course relies on you having a Hugging Face account. We recommend creating one now: create an account.

Using a Google Colab notebook

Using a Colab notebook is the simplest possible setup; boot up a notebook in your browser and get straight to coding!

If you’re not familiar with Colab, we recommend you start by following the introduction. Colab allows you to use some accelerating hardware, like GPUs or TPUs, and it is free for smaller workloads.

Once you’re comfortable moving around in Colab, create a new notebook and get started with the setup:

An empty colab notebook

The next step is to install the libraries that we’ll be using in this course. We’ll use pip for the installation, which is the package manager for Python. In notebooks, you can run system commands by preceding them with the ! character, so you can install the 🤗 Transformers library as follows:

!pip install transformers

You can make sure the package was correctly installed by importing it within your Python runtime:

import transformers
A gif showing the result of the two commands above: installation and import

This installs a very light version of 🤗 Transformers. In particular, no specific machine learning frameworks (like PyTorch or TensorFlow) are installed. Since we’ll be using a lot of different features of the library, we recommend installing the development version, which comes with all the required dependencies for pretty much any imaginable use case:

!pip install transformers[sentencepiece]

This will take a bit of time, but then you’ll be ready to go for the rest of the course!

Using a Python virtual environment

If you prefer to use a Python virtual environment, the first step is to install Python on your system. We recommend following this guide to get started.

Once you have Python installed, you should be able to run Python commands in your terminal. You can start by running the following command to ensure that it is correctly installed before proceeding to the next steps: python --version. This should print out the Python version now available on your system.

When running a Python command in your terminal, such as python --version, you should think of the program running your command as the “main” Python on your system. We recommend keeping this main installation free of any packages, and using it to create separate environments for each application you work on — this way, each application can have its own dependencies and packages, and you won’t need to worry about potential compatibility issues with other applications.

In Python this is done with virtual environments, which are self-contained directory trees that each contain a Python installation with a particular Python version alongside all the packages the application needs. Creating such a virtual environment can be done with a number of different tools, but we’ll use the official Python package for that purpose, which is called venv.

First, create the directory you’d like your application to live in — for example, you might want to make a new directory called transformers-course at the root of your home directory:

mkdir ~/transformers-course
cd ~/transformers-course

From inside this directory, create a virtual environment using the Python venv module:

python -m venv .env

You should now have a directory called .env in your otherwise empty folder:

ls -a
.      ..    .env

You can jump in and out of your virtual environment with the activate and deactivate scripts:

# Activate the virtual environment
source .env/bin/activate

# Deactivate the virtual environment
source .env/bin/deactivate

You can make sure that the environment is activated by running the which python command: if it points to the virtual environment, then you have successfully activated it!

which python
/home/<user>/transformers-course/.env/bin/python

Installing dependencies

As in the previous section on using Google Colab instances, you’ll now need to install the packages required to continue. Again, you can install the development version of 🤗 Transformers using the pip package manager:

pip install "transformers[sentencepiece]"

You’re now all set up and ready to go!