Update README.md
README.md CHANGED
@@ -188,13 +188,17 @@ If you *really* want to use `<|im_start|>` and `<|im_end|>`, just update your `t
 
 An example for mistral-7b:
 
+*Note: I actually used my fork of [qlora](https://github.com/jondurbin/qlora)'s `train.py` for this, but I'm porting it to a minified version here, not tested yet!*
+
+*More notes: I stopped the SFT phase around 50% because of budget constraints.*
+
 ```bash
 export BASE_DIR=/workspace
 export WANDB_API_KEY=[redacted]
 export WANDB_PROJECT=bagel-7b-v0.1
 
 # Run the pretraining.
-accelerate launch
+accelerate launch bagel/tune/sft.py \
   --model_name_or_path $BASE_DIR/mistral-7b \
   --final_output_dir $BASE_DIR/$WANDB_PROJECT \
   --output_dir $BASE_DIR/$WANDB_PROJECT-workdir \
@@ -266,7 +270,7 @@ export BASE_DIR=/mnt/data
 export WANDB_API_KEY=[redacted]
 export WANDB_PROJECT=bagel-dpo-7b-v0.1
 
-accelerate launch
+accelerate launch bagel/tune/dpo.py \
   --model_name_or_path bagel-7b-v0.1 \
   --learning_rate 3e-7 \
   --per_device_train_batch_size 2 \