zhilinw committed on
Commit
69955e1
1 Parent(s): 9ba1a9e

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -64,7 +64,7 @@ python /opt/NeMo-Aligner/examples/nlp/gpt/serve_reward_model.py \
  inference.port=1424
  ```
 
- 2. Annotate data files using the served reward model. If you are seeking to reproduce training of [Llama2-70B-SteerLM-Chat](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat), this will be the Open Assistant train/val files.
+ 2. Annotate data files using the served reward model. If you are seeking to reproduce training of [Llama2-70B-SteerLM-Chat](https://huggingface.co/nvidia/Llama2-70B-SteerLM-Chat), this will be the Open Assistant train/val files. Then follow the next step to train a SteerLM model based on the [SteerLM training user guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/modelalignment/steerlm.html#step-5-train-the-attribute-conditioned-sft-model).
 
  ```python
  python /opt/NeMo-Aligner/examples/nlp/data/steerlm/preprocess_openassistant_data.py --output_directory=data/oasst
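
For context on the annotation step referenced in the added line: after preprocessing the Open Assistant files, the SteerLM user guide has you run NeMo-Aligner's annotation script against the reward model served on port 1424. The sketch below illustrates that step; the input/output file names and flags are assumptions based on the guide's flow, not part of this diff.

```bash
# Sketch (assumption, following the SteerLM user guide flow): label the preprocessed
# Open Assistant split with attribute scores from the reward model served on port 1424.
python /opt/NeMo-Aligner/examples/nlp/data/steerlm/attribute_annotate.py \
      --input-file=data/oasst/train.jsonl \
      --output-file=data/oasst/train_labeled.jsonl \
      --port=1424
```

The labeled output (e.g. `train_labeled.jsonl`) is what the Attribute-Conditioned SFT training step linked in the commit then consumes.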