zhilinw committed on
Commit
69955e1
1 Parent(s): 9ba1a9e

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -64,7 +64,7 @@ python /opt/NeMo-Aligner/examples/nlp/gpt/serve_reward_model.py \
  inference.port=1424
  ```
 
- 2. Annotate data files using the served reward model. If you are seeking to reproduce training of [Llama2-70B-SteerLM-Chat](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat), this will be the Open Assistant train/val files.
+ 2. Annotate data files using the served reward model. If you are seeking to reproduce training of [Llama2-70B-SteerLM-Chat](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat), this will be the Open Assistant train/val files. Then follow the next step to train a SteerLM model based on the [SteerLM training user guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/modelalignment/steerlm.html#step-5-train-the-attribute-conditioned-sft-model).
 
  ```python
  python /opt/NeMo-Aligner/examples/nlp/data/steerlm/preprocess_openassistant_data.py --output_directory=data/oasst
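
For context on the annotation step referenced in the added line: after preprocessing the Open Assistant files, the SteerLM user guide has you run NeMo-Aligner's annotation script against the reward model served on port 1424. The sketch below illustrates that step; the input/output file names and flags are assumptions based on the guide's flow, not part of this diff.

```bash
# Sketch (assumption, following the SteerLM user guide flow): label the preprocessed
# Open Assistant split with attribute scores from the reward model served on port 1424.
python /opt/NeMo-Aligner/examples/nlp/data/steerlm/attribute_annotate.py \
      --input-file=data/oasst/train.jsonl \
      --output-file=data/oasst/train_labeled.jsonl \
      --port=1424
```

The labeled output (e.g. `train_labeled.jsonl`) is what the Attribute-Conditioned SFT training step linked in the commit then consumes.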