Update README.md
README.md
CHANGED
@@ -156,7 +156,7 @@ print(response)
 
 2. Given this Python script, create a Bash script which spins up the inference server within the [NeMo container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo) (```docker pull nvcr.io/nvidia/nemo:24.01.framework```) and calls the Python script ``call_server.py``. The Bash script ``nemo_inference.sh`` is as follows,
 
-```
+```bash
 NEMO_FILE=$1
 WEB_PORT=1424
 
@@ -174,7 +174,6 @@ depends_on () {
 }
 
 
-
 /usr/bin/python3 /opt/NeMo/examples/nlp/language_modeling/megatron_gpt_eval.py \
     gpt_model_file=$NEMO_FILE \
     pipeline_model_parallel_split_rank=0 \
@@ -210,7 +209,7 @@ depends_on () {
 #!/bin/bash
 #SBATCH -A SLURM-ACCOUNT
 #SBATCH -p SLURM-PARITION
-#SBATCH -N 2
+#SBATCH -N 2
 #SBATCH -J generation
 #SBATCH --ntasks-per-node=8
 #SBATCH --gpus-per-node=8
@@ -220,8 +219,9 @@ RESULTS=<PATH_TO_YOUR_SCRIPTS_FOLDER>
 OUTFILE="${RESULTS}/slurm-%j-%n.out"
 ERRFILE="${RESULTS}/error-%j-%n.out"
 MODEL=<PATH_TO>/Nemotron-4-340B-Instruct
-
+CONTAINER="nvcr.io/nvidia/nemo:24.01.framework"
 MOUNTS="--container-mounts=<PATH_TO_YOUR_SCRIPTS_FOLDER>:/scripts,MODEL:/model"
+
 read -r -d '' cmd <<EOF
 bash /scripts/nemo_inference.sh /model
 EOF
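The last hunk stops at the `cmd` heredoc, so the diff never shows how the new `CONTAINER` variable, `MOUNTS`, and `cmd` are consumed. A minimal sketch of the likely tail of the Slurm script, assuming the pyxis/enroot `--container-image` flag that the `--container-mounts` string in `MOUNTS` already implies; the `srun` line itself is not part of this commit:

```bash
# Hypothetical tail of the Slurm script (not shown in this diff):
# run the server launcher inside the NeMo container on the allocated nodes.
# --container-image/--container-mounts require the pyxis Slurm plugin;
# MOUNTS already carries the full --container-mounts flag, so it is
# intentionally left unquoted.
srun -o "$OUTFILE" -e "$ERRFILE" \
    --container-image="$CONTAINER" \
    $MOUNTS \
    bash -c "${cmd}"
```

The job would then be submitted with `sbatch` as usual; once every rank reports the server up, requests can be sent to `WEB_PORT` (1424) on the first node.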
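Once the server is running, the ``call_server.py`` script referenced above drives it, but the endpoint can also be smoke-tested directly. A sketch assuming the `PUT /generate` endpoint and JSON fields of NeMo's text-generation server (what `megatron_gpt_eval.py` exposes in server mode); verify the exact schema against the NeMo version in the container:

```bash
# Smoke test against the server started by nemo_inference.sh.
# Port 1424 matches WEB_PORT in the script; the /generate endpoint and
# request fields follow NeMo's text-generation server and should be
# checked against your NeMo version.
curl -X PUT http://localhost:1424/generate \
    -H "Content-Type: application/json" \
    -d '{"sentences": ["Write a short poem about GPUs."], "tokens_to_generate": 64, "temperature": 1.0}'
```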