jondurbin committed on
Commit
6bcabff
1 Parent(s): 02ba36c

Update README.md

Files changed (1): README.md (+5 −5)
README.md CHANGED
@@ -150,7 +150,9 @@ If you *really* want to use `<|im_start|>` and `<|im_end|>`, just update your `t
 {instruction} [/INST]
 ```
 
-## Fine tune
+### Fine-tune
+
+*Note: I actually used my fork of [qlora](https://github.com/jondurbin/qlora)'s `train.py` for this, but I'm porting it to a minified version here, not tested yet!*
 
 ```bash
 export BASE_DIR=/workspace
@@ -158,7 +160,7 @@ export WANDB_API_KEY=[redacted]
 export WANDB_PROJECT=bagel-7b-v0.1
 
 # Run the pretraining.
-accelerate launch -m bagel.tune.sft \
+accelerate launch bagel/tune/sft.py \
 --model_name_or_path $BASE_DIR/mistral-7b \
 --final_output_dir $BASE_DIR/$WANDB_PROJECT \
 --output_dir $BASE_DIR/$WANDB_PROJECT-workdir \
@@ -219,6 +221,4 @@ Deepspeed configuration:
 "allgather_bucket_size": 5e8
 }
 }
-```
-
-This was done in runpod on an 8x 80gb a100 instance. I actually stopped the fine tune at around 50% due to budget constraints.
+```