Update README.md
Add intro and reference, correct one typo
README.md CHANGED
@@ -18,6 +18,11 @@
 # ##############################################################################################
 -->
 
+[Megatron](https://arxiv.org/pdf/1909.08053.pdf) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This particular Megatron model was trained from a bidirectional transformer in the style of BERT with text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories. This model contains 345 million parameters and is made up of 24 layers and 16 attention heads, with a hidden size of 1024.
+
+Find more information at [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
+
+
 # How to run Megatron BERT using Transformers
 
 ## Prerequisites
@@ -44,7 +49,7 @@ You must create a directory called `nvidia/megatron-bert-uncased-345m`.
 mkdir -p $MYDIR/nvidia/megatron-bert-uncased-345m
 ```
 
-You can download the checkpoint from the NVIDIA GPU Cloud (NGC). For that you
+You can download the checkpoint from the [NVIDIA GPU Cloud (NGC)](https://ngc.nvidia.com/catalog/models/nvidia:megatron_bert_345m). For that you
 have to [sign up](https://ngc.nvidia.com/signup) for and setup the NVIDIA GPU
 Cloud (NGC) Registry CLI. Further documentation for downloading models can be
 found in the [NGC
@@ -52,20 +57,16 @@ documentation](https://docs.nvidia.com/dgx/ngc-registry-cli-user-guide/index.htm
 
 Alternatively, you can directly download the checkpoint using:
 
-### BERT 345M uncased
-
 ```
 wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_bert_345m/versions/v0.1_uncased/zip -O $MYDIR/nvidia/megatron-bert-uncased-345m/checkpoint.zip
 ```
 
 ## Converting the checkpoint
 
-In order to be loaded into `Transformers`, the checkpoint
+In order to be loaded into `Transformers`, the checkpoint has to be converted. You should run the following commands for that purpose.
 Those commands will create `config.json` and `pytorch_model.bin` in `$MYDIR/nvidia/megatron-bert-{cased,uncased}-345m`.
 You can move those files to different directories if needed.
 
-### BERT 345M uncased
-
 ```
 python3 $MYDIR/transformers/src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py $MYDIR/nvidia/megatron-bert-uncased-345m/checkpoint.zip
 ```
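The converted README text above says the conversion step should leave `config.json` and `pytorch_model.bin` in the target directory. As an illustrative sketch (not part of this PR), a small stdlib-only check like the following can confirm that before handing the directory to `Transformers`; the `check_converted` helper is hypothetical, and the throwaway directory below merely stands in for the real `$MYDIR/nvidia/megatron-bert-uncased-345m` path.

```python
import json
import tempfile
from pathlib import Path


def check_converted(model_dir: str) -> bool:
    """Return True if model_dir holds the two files the converter is
    documented to emit: config.json and pytorch_model.bin."""
    d = Path(model_dir)
    config = d / "config.json"
    weights = d / "pytorch_model.bin"
    if not (config.is_file() and weights.is_file()):
        return False
    # config.json should at least parse as JSON.
    try:
        json.loads(config.read_text())
    except json.JSONDecodeError:
        return False
    return True


# Demo against a temporary directory standing in for the real target path.
with tempfile.TemporaryDirectory() as tmp:
    assert not check_converted(tmp)  # nothing converted yet
    (Path(tmp) / "config.json").write_text('{"model_type": "megatron-bert"}')
    (Path(tmp) / "pytorch_model.bin").write_bytes(b"\x00")  # placeholder weights
    assert check_converted(tmp)
```

If the check fails after running the conversion command, re-running it and inspecting its console output is usually the quickest way to find what went wrong.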