Update README.md
Add intro and reference, correct one typo
README.md CHANGED
@@ -18,6 +18,11 @@
 # ##############################################################################################
 -->
 
+[Megatron](https://arxiv.org/pdf/1909.08053.pdf) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This particular Megatron model was trained from a bidirectional transformer in the style of BERT with text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories. This model contains 345 million parameters and is made up of 24 layers and 16 attention heads, with a hidden size of 1024.
+
+Find more information at [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
+
+
 # How to run Megatron BERT using Transformers
 
 ## Prerequisites
@@ -44,7 +49,7 @@ You must create a directory called `nvidia/megatron-bert-uncased-345m`.
 mkdir -p $MYDIR/nvidia/megatron-bert-uncased-345m
 ```
 
-You can download the checkpoint from the NVIDIA GPU Cloud (NGC). For that you
+You can download the checkpoint from the [NVIDIA GPU Cloud (NGC)](https://ngc.nvidia.com/catalog/models/nvidia:megatron_bert_345m). For that you
 have to [sign up](https://ngc.nvidia.com/signup) for and setup the NVIDIA GPU
 Cloud (NGC) Registry CLI. Further documentation for downloading models can be
 found in the [NGC
@@ -52,20 +57,16 @@ documentation](https://docs.nvidia.com/dgx/ngc-registry-cli-user-guide/index.htm
 
 Alternatively, you can directly download the checkpoint using:
 
-### BERT 345M uncased
-
 ```
 wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_bert_345m/versions/v0.1_uncased/zip -O $MYDIR/nvidia/megatron-bert-uncased-345m/checkpoint.zip
 ```
 
 ## Converting the checkpoint
 
-In order to be loaded into `Transformers`, the checkpoint
+In order to be loaded into `Transformers`, the checkpoint has to be converted. You should run the following commands for that purpose.
 Those commands will create `config.json` and `pytorch_model.bin` in `$MYDIR/nvidia/megatron-bert-{cased,uncased}-345m`.
 You can move those files to different directories if needed.
 
-### BERT 345M uncased
-
 ```
 python3 $MYDIR/transformers/src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py $MYDIR/nvidia/megatron-bert-uncased-345m/checkpoint.zip
 ```
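The converted README text above says the conversion step should leave `config.json` and `pytorch_model.bin` in the target directory. As an illustrative sketch (not part of this PR), a small stdlib-only check like the following can confirm that before handing the directory to `Transformers`; the `check_converted` helper is hypothetical, and the throwaway directory below merely stands in for the real `$MYDIR/nvidia/megatron-bert-uncased-345m` path.

```python
import json
import tempfile
from pathlib import Path


def check_converted(model_dir: str) -> bool:
    """Return True if model_dir holds the two files the converter is
    documented to emit: config.json and pytorch_model.bin."""
    d = Path(model_dir)
    config = d / "config.json"
    weights = d / "pytorch_model.bin"
    if not (config.is_file() and weights.is_file()):
        return False
    # config.json should at least parse as JSON.
    try:
        json.loads(config.read_text())
    except json.JSONDecodeError:
        return False
    return True


# Demo against a temporary directory standing in for the real target path.
with tempfile.TemporaryDirectory() as tmp:
    assert not check_converted(tmp)  # nothing converted yet
    (Path(tmp) / "config.json").write_text('{"model_type": "megatron-bert"}')
    (Path(tmp) / "pytorch_model.bin").write_bytes(b"\x00")  # placeholder weights
    assert check_converted(tmp)
```

If the check fails after running the conversion command, re-running it and inspecting its console output is usually the quickest way to find what went wrong.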