LysandreJik committed on
Commit
b214118
1 Parent(s): a41f8a7

README & Tokenizer

Files changed (4)
  1. README.md +105 -0
  2. merges.txt +0 -0
  3. tokenizer.json +0 -0
  4. vocab.json +0 -0
README.md ADDED
<!---
# ##############################################################################################
#
# Copyright (c) 2021-, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# ##############################################################################################
-->

# How to run Megatron GPT2 using Transformers

## Prerequisites

In this guide, we run all the commands from a folder called `$MYDIR`, defined as (in `bash`):

```bash
export MYDIR=$HOME
```

Feel free to change the location at your convenience.

To run some of the commands below, you will have to clone `Transformers`:

```bash
git clone https://github.com/huggingface/transformers.git $MYDIR/transformers
```

## Get the checkpoints from the NVIDIA GPU Cloud

You must create a directory called `nvidia/megatron-gpt2-345m`:

```bash
mkdir -p $MYDIR/nvidia/megatron-gpt2-345m
```

You can download the checkpoints from the NVIDIA GPU Cloud (NGC). To do so, you
have to [sign up](https://ngc.nvidia.com/signup) for and set up the NVIDIA GPU
Cloud (NGC) Registry CLI. Further documentation on downloading models can be
found in the [NGC
documentation](https://docs.nvidia.com/dgx/ngc-registry-cli-user-guide/index.html#topic_6_4_1).

Alternatively, you can download the checkpoints directly using:

```bash
wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_lm_345m/versions/v0.0/zip -O $MYDIR/nvidia/megatron-gpt2-345m/checkpoint.zip
```
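
Before converting, you may want to confirm the archive downloaded completely. A minimal sketch using Python's `zipfile` (the path mirrors the `wget` command above; nothing here is specific to Megatron):

```python
import os
import zipfile

def verify_checkpoint(path):
    """Return the number of members in the zip, raising if any member fails its CRC check."""
    with zipfile.ZipFile(path) as archive:
        bad = archive.testzip()  # None when every member's CRC checks out
        if bad is not None:
            raise RuntimeError(f"corrupt member: {bad}")
        return len(archive.namelist())

checkpoint = os.path.join(os.environ.get('MYDIR', '.'),
                          'nvidia/megatron-gpt2-345m/checkpoint.zip')
if os.path.exists(checkpoint):
    print(verify_checkpoint(checkpoint), 'members OK')
```

A truncated download typically raises `BadZipFile` on open, so this catches the most common failure mode before the (slower) conversion step.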

## Converting the checkpoint

To be loaded into `Transformers`, the checkpoint has to be converted. Run the following command for that purpose;
it will create `config.json` and `pytorch_model.bin` in `$MYDIR/nvidia/megatron-gpt2-345m`.
You can move those files to different directories if needed.

```bash
python3 $MYDIR/transformers/src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py $MYDIR/nvidia/megatron-gpt2-345m/checkpoint.zip
```
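
A quick way to confirm the conversion worked is to open the generated `config.json` and look at the architecture fields. A small sketch, assuming the GPT-2 config field names used by Transformers (`n_embd`, `n_layer`, `n_head`):

```python
import json
import os

def summarize_config(path):
    """Read a converted config.json and return a few GPT-2 architecture fields."""
    with open(path) as f:
        config = json.load(f)
    return {key: config.get(key) for key in ('n_embd', 'n_layer', 'n_head')}

config_path = os.path.join(os.environ.get('MYDIR', '.'),
                           'nvidia/megatron-gpt2-345m/config.json')
if os.path.exists(config_path):
    print(summarize_config(config_path))
```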

## Text generation

The following code shows how to use the Megatron GPT2 checkpoint and the Transformers API to generate text.

```python
import os
import torch

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# The tokenizer. Megatron was trained with the standard GPT-2 tokenizer.
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# The path to the config/checkpoint (see the conversion step above).
directory = os.path.join(os.environ['MYDIR'], 'nvidia/megatron-gpt2-345m')
# Load the model from $MYDIR/nvidia/megatron-gpt2-345m.
model = GPT2LMHeadModel.from_pretrained(directory)

# Copy the model to the GPU and switch to FP16.
assert torch.cuda.is_available()
device = torch.device("cuda")
model.to(device)
model.eval()
model.half()

# Generate a sentence. With input_ids=None, generation starts from the
# beginning-of-sequence token.
output = model.generate(input_ids=None, max_length=32, num_return_sequences=1)

# Decode and print the text.
for sentence in output:
    sentence = sentence.tolist()
    text = tokenizer.decode(sentence, clean_up_tokenization_spaces=True)
    print(text)
```
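
For intuition, `generate` with its default (greedy) settings repeatedly appends the highest-scoring next token until `max_length` is reached. A toy sketch of that loop, with a stand-in scoring function rather than the real network:

```python
def greedy_generate(next_token_scores, input_ids, max_length, eos_id=None):
    """Greedy decoding: repeatedly append the highest-scoring next token.

    next_token_scores(ids) -> {token_id: score}; a stand-in for the
    model's forward pass over the current prefix.
    """
    ids = list(input_ids)
    while len(ids) < max_length:
        scores = next_token_scores(ids)
        best = max(scores, key=scores.get)
        ids.append(best)
        if eos_id is not None and best == eos_id:
            break
    return ids

# Toy "model" over a 5-token vocabulary: always prefers (last token + 1) mod 5.
toy = lambda ids: {t: (1.0 if t == (ids[-1] + 1) % 5 else 0.0) for t in range(5)}
print(greedy_generate(toy, [0], max_length=6))  # [0, 1, 2, 3, 4, 0]
```

The real `generate` adds sampling, beam search, and repetition penalties on top of this loop; see the Transformers documentation for those options.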

## Original code

The original Megatron code can be found here: [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM).
merges.txt ADDED
The diff for this file is too large to render. See raw diff
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
vocab.json ADDED
The diff for this file is too large to render. See raw diff