jdemouth committed
Commit 115a6a3
1 Parent(s): 89445bf

Update README.md

Files changed (1):
  README.md (+16 -16)
README.md CHANGED
@@ -70,6 +70,22 @@ You can move those files to different directories if needed.
 python3 $MYDIR/transformers/src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py $MYDIR/nvidia/megatron-gpt2-345m/checkpoint.zip
 ```
 
+As explained in [PR #14956](https://github.com/huggingface/transformers/pull/14956), if when running this conversion
+script and you're getting an exception:
+```
+ModuleNotFoundError: No module named 'megatron.model.enums'
+```
+you need to tell python where to find the clone of Megatron-LM, e.g.:
+```
+cd /tmp
+git clone https://github.com/NVIDIA/Megatron-LM
+PYTHONPATH=/tmp/Megatron-LM python src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py ...
+```
+Or, if you already have it cloned elsewhere, simply adjust the path to the existing path.
+
+If the training was done using a Megatron-LM fork, e.g. [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/) then
+you may need to have that one in your path, i.e., /path/to/Megatron-DeepSpeed.
+
 ## Text generation
 
 The following code shows how to use the Megatron GPT2 checkpoint and the Transformers API to generate text.
@@ -118,22 +134,6 @@ wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_
 python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py megatron_lm_345m_v0.0.zip
 ```
 
-As explained in [PR #14956](https://github.com/huggingface/transformers/pull/14956), if when running this conversion
-script and you're getting an exception:
-```
-ModuleNotFoundError: No module named 'megatron.model.enums'
-```
-you need to tell python where to find the clone of Megatron-LM, e.g.:
-```
-cd /tmp
-git clone https://github.com/NVIDIA/Megatron-LM
-PYTHONPATH=/tmp/Megatron-LM python src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py ...
-```
-Or, if you already have it cloned elsewhere, simply adjust the path to the existing path.
-
-If the training was done using a Megatron-LM fork, e.g. [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/) then
-you may need to have that one in your path, i.e., /path/to/Megatron-DeepSpeed.
-
 3. Fetch missing files
 ```
 git clone https://huggingface.co/nvidia/megatron-gpt2-345m/
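The `PYTHONPATH` workaround this commit documents works because Python resolves `megatron.model.enums` by searching the directories on `sys.path`, which `PYTHONPATH` prepends to. A minimal, self-contained sketch of that mechanism, using a throwaway stand-in package (the `LAYER_TYPE` constant is an arbitrary placeholder, not the real Megatron-LM enums module):

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway stand-in mimicking the Megatron-LM package layout:
# <root>/megatron/model/enums.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, "megatron", "model")
os.makedirs(pkg)
open(os.path.join(root, "megatron", "__init__.py"), "w").close()
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "enums.py"), "w") as f:
    f.write("LAYER_TYPE = 'transformer'\n")  # placeholder content

# Without <root> on sys.path the import fails, analogous to the
# ModuleNotFoundError the conversion script raises.
try:
    import megatron.model.enums
except ModuleNotFoundError as err:
    print(err)

# Putting the clone's directory on sys.path is exactly what
# PYTHONPATH=/tmp/Megatron-LM does for the conversion script.
sys.path.insert(0, root)
enums = importlib.import_module("megatron.model.enums")
print(enums.LAYER_TYPE)  # transformer
```

With the real repository, running the converter as `PYTHONPATH=/tmp/Megatron-LM python ...` achieves the same effect as the `sys.path.insert` above, without modifying the script.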
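The "Text generation" section that this diff's context lines reference shows code that falls outside the hunk. As a rough sketch of the Transformers generation API it refers to, with the real converted checkpoint swapped for a tiny, randomly initialized GPT-2 config so the snippet runs without any downloads (the config sizes and token ids here are arbitrary assumptions, not the 345M model's):

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random GPT-2 as a stand-in for the converted Megatron checkpoint.
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100)
model = GPT2LMHeadModel(config)
model.eval()

# Arbitrary prompt token ids; a real run would come from the tokenizer.
input_ids = torch.tensor([[1, 2, 3]])
out = model.generate(input_ids, max_length=10, do_sample=False, pad_token_id=0)
print(out.shape)  # one sequence of 10 token ids
```

With the actual checkpoint you would instead call `GPT2LMHeadModel.from_pretrained(...)` on the directory produced by the conversion script, and decode the generated ids with the matching tokenizer.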