Converting Tensorflow Checkpoints¶
A command-line interface is provided to convert a TensorFlow checkpoint in a PyTorch dump of the
BertForPreTraining class (for BERT) or NumPy checkpoint in a PyTorch dump of the
OpenAIGPTModel class (for OpenAI GPT).
This CLI takes as input a TensorFlow checkpoint (three files starting with
bert_model.ckpt) and the associated configuration file (
bert_config.json), and creates a PyTorch model for this configuration, loads the weights from the TensorFlow checkpoint in the PyTorch model and saves the resulting model in a standard PyTorch save file that can be imported using
torch.load() (see examples in run_bert_extract_features.py, run_bert_classifier.py and run_bert_squad.py).
You only need to run this conversion script once to get a PyTorch model. You can then disregard the TensorFlow checkpoint (the three files starting with
bert_model.ckpt) but be sure to keep the configuration file (
bert_config.json) and the vocabulary file (
vocab.txt) as these are needed for the PyTorch model too.
To run this specific conversion script you will need to have TensorFlow and PyTorch installed (
pip install tensorflow). The rest of the repository only requires PyTorch.
Here is an example of the conversion process for a pre-trained
BERT-Base Uncased model:
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12 pytorch_transformers bert \ $BERT_BASE_DIR/bert_model.ckpt \ $BERT_BASE_DIR/bert_config.json \ $BERT_BASE_DIR/pytorch_model.bin
You can download Google’s pre-trained models for the conversion here.
Here is an example of the conversion process for a pre-trained OpenAI GPT model, assuming that your NumPy checkpoint save as the same format than OpenAI pretrained model (see here)
export OPENAI_GPT_CHECKPOINT_FOLDER_PATH=/path/to/openai/pretrained/numpy/weights pytorch_transformers gpt \ $OPENAI_GPT_CHECKPOINT_FOLDER_PATH \ $PYTORCH_DUMP_OUTPUT \ [OPENAI_GPT_CONFIG]
Here is an example of the conversion process for a pre-trained Transformer-XL model (see here)
export TRANSFO_XL_CHECKPOINT_FOLDER_PATH=/path/to/transfo/xl/checkpoint pytorch_transformers transfo_xl \ $TRANSFO_XL_CHECKPOINT_FOLDER_PATH \ $PYTORCH_DUMP_OUTPUT \ [TRANSFO_XL_CONFIG]
Here is an example of the conversion process for a pre-trained OpenAI’s GPT-2 model.
export GPT2_DIR=/path/to/gpt2/checkpoint pytorch_transformers gpt2 \ $GPT2_DIR/model.ckpt \ $PYTORCH_DUMP_OUTPUT \ [GPT2_CONFIG]
Here is an example of the conversion process for a pre-trained XLNet model, fine-tuned on STS-B using the TensorFlow script:
export TRANSFO_XL_CHECKPOINT_PATH=/path/to/xlnet/checkpoint export TRANSFO_XL_CONFIG_PATH=/path/to/xlnet/config pytorch_transformers xlnet \ $TRANSFO_XL_CHECKPOINT_PATH \ $TRANSFO_XL_CONFIG_PATH \ $PYTORCH_DUMP_OUTPUT \ STS-B \