stas committed
Commit 10c4153
1 Parent(s): 0efaa23

add simpler instructions on how to put this model together

Files changed (1)
  1. README.md +27 -0
README.md CHANGED
@@ -104,6 +104,33 @@ for sentence in output:
  print(text)
  ```
 
+ # To use this as a normal HuggingFace model
+
+ If you want to use this model with HF Trainer, here is a quick way to do that:
+
+ 1. Download the NVIDIA checkpoint:
+ ```
+ wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_lm_345m/versions/v0.0/zip -O megatron_lm_345m_v0.0.zip
+ ```
+
+ 2. Convert:
+ ```
+ python /src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py megatron_lm_345m_v0.0.zip
+ ```
+
+ 3. Fetch the missing files:
+ ```
+ git clone https://huggingface.co/nvidia/megatron-gpt2-345m/
+ ```
+
+ 4. Move the converted files into the cloned model dir:
+ ```
+ mv config.json pytorch_model.bin megatron-gpt2-345m/
+ ```
+
+ 5. The `megatron-gpt2-345m` dir should now contain all the files, and it can be passed to HF Trainer as `--model_name_or_path megatron-gpt2-345m`.
+
+
  # Original code
 
  The original Megatron code can be found here: [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM).
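
As a rough sketch (not part of this commit), a small Python helper could check the state that step 5 expects before the directory is handed to HF Trainer. It only looks for the two files that step 4 names, `config.json` and `pytorch_model.bin`. The function name `missing_model_files` and the constant `CONVERTED_FILES` are made up for illustration, and any other files the cloned repo is supposed to provide would have to be added to the list by the caller.

```python
from pathlib import Path

# Files that step 4 moves into the cloned directory (named in the steps above).
# Anything else the cloned repo should contain is NOT checked here (assumption:
# the caller extends `required` if they want a stricter check).
CONVERTED_FILES = ("config.json", "pytorch_model.bin")


def missing_model_files(model_dir, required=CONVERTED_FILES):
    """Return the names in `required` that are not regular files in `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical usage: run from the directory that contains the clone.
    missing = missing_model_files("megatron-gpt2-345m")
    if missing:
        print("missing files:", ", ".join(missing))
    else:
        print("converted files are in place")
```

If the helper reports nothing missing, the directory is at least consistent with step 4 having been run; it does not validate the contents of the files.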