Megatron-GPT 1.3B is a transformer-based language model.
This model was trained with [NeMo Megatron](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/nemo_megatron/intro.html).

## Getting started

### Step 1: Install NeMo and dependencies

You will need to install NVIDIA Apex and NeMo.

```
# Install NVIDIA Apex first (https://github.com/NVIDIA/apex), then:
pip install nemo_toolkit['nlp']==1.11.0
```

Alternatively, you can use the NeMo Megatron training Docker container, which has all dependencies pre-installed.

### Step 2: Launch eval server

**Note.** The model was trained with tensor parallelism (TP) of 1 and pipeline parallelism (PP) of 1, and should fit on a single NVIDIA GPU.

```
git clone https://github.com/NVIDIA/NeMo.git
cd NeMo/examples/nlp/language_modeling
git checkout v1.11.0
python megatron_gpt_eval.py gpt_model_file=nemo_gpt1.3B_fp16.nemo server=True tensor_model_parallel_size=1 trainer.devices=1
```
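Once the server is up, you can send it prompts over HTTP. The sketch below is a client written against assumptions: it presumes the eval server listens on `localhost:5555` and accepts a JSON payload via `PUT /generate`, as the NeMo 1.11 text-generation server does by default, and the payload parameter names follow that API. Verify the port, route, and parameters against `megatron_gpt_eval.py` for your NeMo version.

```python
import json
from urllib import request

PORT = 5555  # assumed default port of the NeMo eval server


def build_payload(prompts, tokens_to_generate=32):
    # Parameter names follow the NeMo 1.11 text-generation API;
    # treat them as assumptions and check megatron_gpt_eval.py if
    # your version differs.
    return {
        "sentences": prompts,
        "tokens_to_generate": tokens_to_generate,
        "temperature": 1.0,
        "top_k": 0,
        "top_p": 0.9,
        "greedy": False,
        "add_BOS": True,
    }


def generate(prompts, tokens_to_generate=32):
    # PUT the JSON payload to the server and return the completions.
    req = request.Request(
        f"http://localhost:{PORT}/generate",
        data=json.dumps(build_payload(prompts, tokens_to_generate)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["sentences"]
```

With the server from Step 2 running, `generate(["Deep learning is"])` should return the prompts extended with generated text.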

## Training Data

The model was trained on [the Pile dataset prepared by EleutherAI](https://pile.eleuther.ai/).