michwang committed
Commit 01daef4
1 Parent(s): 388e725

Update README.md


Add intro and reference, correct one typo

Files changed (1):
  1. README.md +7 -6
README.md CHANGED
@@ -18,6 +18,11 @@
  # ##############################################################################################
  -->

+ [Megatron](https://arxiv.org/pdf/1909.08053.pdf) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This particular Megatron model was trained from a bidirectional transformer in the style of BERT, with text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories. This model contains 345 million parameters. It is made up of 24 layers and 16 attention heads, with a hidden size of 1024.
+
+ Find more information at [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
+
+
  # How to run Megatron BERT using Transformers

  ## Prerequisites
@@ -44,7 +49,7 @@ You must create a directory called `nvidia/megatron-bert-uncased-345m`.
  mkdir -p $MYDIR/nvidia/megatron-bert-uncased-345m
  ```

- You can download the checkpoint from the NVIDIA GPU Cloud (NGC). For that you
+ You can download the checkpoint from the [NVIDIA GPU Cloud (NGC)](https://ngc.nvidia.com/catalog/models/nvidia:megatron_bert_345m). For that you
  have to [sign up](https://ngc.nvidia.com/signup) for and set up the NVIDIA GPU
  Cloud (NGC) Registry CLI. Further documentation for downloading models can be
  found in the [NGC
@@ -52,20 +57,16 @@ documentation](https://docs.nvidia.com/dgx/ngc-registry-cli-user-guide/index.html).

  Alternatively, you can directly download the checkpoint using:

- ### BERT 345M uncased
-
  ```
  wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_bert_345m/versions/v0.1_uncased/zip -O $MYDIR/nvidia/megatron-bert-uncased-345m/checkpoint.zip
  ```

  ## Converting the checkpoint

- In order to be loaded into `Transformers`, the checkpoint have to be converted. You should run the following commands for that purpose.
+ In order to be loaded into `Transformers`, the checkpoint has to be converted. You should run the following commands for that purpose.
  Those commands will create `config.json` and `pytorch_model.bin` in `$MYDIR/nvidia/megatron-bert-{cased,uncased}-345m`.
  You can move those files to different directories if needed.

- ### BERT 345M uncased
-
  ```
  python3 $MYDIR/transformers/src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py $MYDIR/nvidia/megatron-bert-uncased-345m/checkpoint.zip
  ```
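
For the NGC Registry CLI route described in the download step above, the commit itself shows no command. The sketch below is an assumption, not part of the commit: the model/version string mirrors the wget URL, but flag names can vary between NGC CLI releases, so verify with `ngc registry model download-version --help`.

```
# Sketch only (not in the commit): fetch the same checkpoint via the NGC CLI.
# Assumes `ngc` is installed and configured with `ngc config set`.
ngc registry model download-version "nvidia/megatron_bert_345m:v0.1_uncased" \
    --dest $MYDIR/nvidia/megatron-bert-uncased-345m
```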
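
Once converted, the checkpoint loads like any local `Transformers` model. Below is a minimal sanity-check sketch, not taken from the commit: `MegatronBertForMaskedLM` is the Megatron BERT class in `Transformers`, while the `bert-large-uncased` tokenizer is an assumption (the uncased Megatron checkpoints are generally trained with BERT's standard uncased WordPiece vocabulary).

```
# Sanity-check sketch (not part of the commit): load the converted checkpoint.
# Assumes config.json and pytorch_model.bin exist in the directory below and
# that the model shares BERT's uncased WordPiece vocabulary (an assumption).
import os
import torch
from transformers import BertTokenizer, MegatronBertForMaskedLM

directory = os.path.join(os.environ["MYDIR"], "nvidia/megatron-bert-uncased-345m")

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")  # assumed vocab
model = MegatronBertForMaskedLM.from_pretrained(directory)
model.eval()

# The introduction above says 24 layers, 16 attention heads, hidden size 1024.
print(model.config.num_hidden_layers,
      model.config.num_attention_heads,
      model.config.hidden_size)

# Quick masked-LM smoke test.
inputs = tokenizer("Paris is the [MASK] of France.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
print(tokenizer.decode([logits[0, mask_pos].argmax().item()]))
```

If the conversion succeeded, the printed configuration should read 24, 16, and 1024, and the predicted token should be something sensible such as "capital".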