Megatron-GPT 1.3B is a transformer-based language model.
This model was trained with [NeMo Megatron](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/nemo_megatron/intro.html).

## Getting started

### Step 1: Install NeMo and dependencies

You will need to install NVIDIA Apex and NeMo.

```
# Install NVIDIA Apex first (https://github.com/NVIDIA/apex), then:
pip install nemo_toolkit['nlp']==1.11.0
```

Alternatively, you can use the NeMo Megatron training Docker container, which has all dependencies pre-installed.

### Step 2: Launch eval server

**Note.** The model was trained with tensor parallelism (TP) of 1 and pipeline parallelism (PP) of 1, and should fit on a single NVIDIA GPU.

```
git clone https://github.com/NVIDIA/NeMo.git
cd NeMo/examples/nlp/language_modeling
git checkout v1.11.0
python megatron_gpt_eval.py gpt_model_file=nemo_gpt1.3B_fp16.nemo server=True tensor_model_parallel_size=1 trainer.devices=1
```
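Once the server is up, you can send it prompts over HTTP. The sketch below is a client written against assumptions: it presumes the eval server listens on `localhost:5555` and accepts a JSON payload via `PUT /generate`, as the NeMo 1.11 text-generation server does by default, and the payload parameter names follow that API. Verify the port, route, and parameters against `megatron_gpt_eval.py` for your NeMo version.

```python
import json
from urllib import request

PORT = 5555  # assumed default port of the NeMo eval server


def build_payload(prompts, tokens_to_generate=32):
    # Parameter names follow the NeMo 1.11 text-generation API;
    # treat them as assumptions and check megatron_gpt_eval.py if
    # your version differs.
    return {
        "sentences": prompts,
        "tokens_to_generate": tokens_to_generate,
        "temperature": 1.0,
        "top_k": 0,
        "top_p": 0.9,
        "greedy": False,
        "add_BOS": True,
    }


def generate(prompts, tokens_to_generate=32):
    # PUT the JSON payload to the server and return the completions.
    req = request.Request(
        f"http://localhost:{PORT}/generate",
        data=json.dumps(build_payload(prompts, tokens_to_generate)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["sentences"]
```

With the server from Step 2 running, `generate(["Deep learning is"])` should return the prompts extended with generated text.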

## Training Data

The model was trained on [the Pile dataset prepared by EleutherAI](https://pile.eleuther.ai/).