jondurbin committed on
Commit
ea9bbc1
1 Parent(s): c6851de

Update README.md

Files changed (1)
  1. README.md +36 -1
README.md CHANGED
@@ -4,7 +4,7 @@ license: other
 
 # Overview
 
-This is a fine-tuned 7b parameter LlaMa model, fine tuned on nearly 100k synthetic instructions generated by my tool [airobors](https://github.com/jondurbin/airoboros)
+This is a fine-tuned 7B parameter LLaMA model, fine-tuned on nearly 100k synthetic instructions generated by [airoboros](https://github.com/jondurbin/airoboros).
 
 I used a jailbreak prompt to generate the synthetic instructions this time, which resulted in some questionable training data, such as synthesizing drugs, making homemade flamethrowers, etc. Mind you, this is all generated by ChatGPT, not me, so I won't speak for any outputs the model produces.
 
@@ -15,5 +15,40 @@ I'm still combing through the data a bit to make sure there's nothing blatantly
 The jailbreak prompt I used is the default prompt in the Python code when using the `--uncensored` flag:
 (https://github.com/jondurbin/airoboros/blob/main/airoboros/self_instruct.py#L39)
 
+### Fine-tuning method
+
+I used the excellent [FastChat](https://github.com/lm-sys/FastChat) module, running with:
+
+```
+torchrun --nproc_per_node=8 --master_port=20001 /workspace/FastChat/fastchat/train/train_mem.py \
+  --model_name_or_path /workspace/llama-7b \
+  --data_path /workspace/as_conversations.json \
+  --bf16 True \
+  --output_dir /workspace/airoboros-uncensored-7b \
+  --num_train_epochs 3 \
+  --per_device_train_batch_size 24 \
+  --per_device_eval_batch_size 24 \
+  --gradient_accumulation_steps 2 \
+  --evaluation_strategy "steps" \
+  --eval_steps 1000 \
+  --save_strategy "steps" \
+  --save_steps 1000 \
+  --save_total_limit 10 \
+  --learning_rate 2e-5 \
+  --weight_decay 0. \
+  --warmup_ratio 0.04 \
+  --lr_scheduler_type "cosine" \
+  --logging_steps 1 \
+  --fsdp "full_shard auto_wrap" \
+  --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
+  --tf32 True \
+  --model_max_length 2048 \
+  --gradient_checkpointing True \
+  --lazy_preprocess True
+```
+
+This ran on 8x NVIDIA 80GB A100s for about 17 hours.
+
 ### License
 The model weights are subject to the original LLaMA license, and the dataset is subject to OpenAI's terms of use because it was generated with ChatGPT. Everything else is free.
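As a sanity check on the training flags in the diff above: 8 GPUs with a per-device batch size of 24 and 2 gradient-accumulation steps give an effective global batch of 384 examples per optimizer step. Assuming roughly 100k training examples (an approximation of the "nearly 100k synthetic instructions" figure, not an exact count), 3 epochs works out to a few hundred optimizer steps. A minimal sketch of that arithmetic:

```python
# Sanity-check the effective batch size and step count implied by the
# torchrun flags above. The 100k example count is an assumption taken
# from the "nearly 100k synthetic instructions" figure.

def effective_batch_size(per_device: int, num_gpus: int, grad_accum: int) -> int:
    """Global examples consumed per optimizer step."""
    return per_device * num_gpus * grad_accum

def optimizer_steps(num_examples: int, global_batch: int, epochs: int) -> int:
    """Approximate optimizer steps for the full run (ceiling per epoch)."""
    steps_per_epoch = -(-num_examples // global_batch)  # ceiling division
    return steps_per_epoch * epochs

global_batch = effective_batch_size(per_device=24, num_gpus=8, grad_accum=2)
steps = optimizer_steps(num_examples=100_000, global_batch=global_batch, epochs=3)

print(global_batch)  # 384
print(steps)         # 783
```

At 8 GPUs for ~17 hours, the run consumed on the order of 136 A100 GPU-hours.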
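The `--data_path` file (`as_conversations.json`) is consumed by FastChat's training script, which expects a JSON list of multi-turn conversation records. The sketch below shows one such record, assuming the standard Vicuna-style schema used in FastChat's training examples; the `id` value and the instruction/response text are purely illustrative:

```python
import json

# One training record in the conversation format FastChat's train scripts
# expect: a list of turns alternating "human" and "gpt". The content here
# is illustrative, not taken from the actual airoboros dataset.
record = {
    "id": "airoboros-000001",
    "conversations": [
        {"from": "human", "value": "Explain what gradient checkpointing does."},
        {"from": "gpt", "value": "It trades compute for memory by recomputing "
                                 "activations during the backward pass."},
    ],
}

# The full training file is a JSON list of such records.
training_file_contents = json.dumps([record], indent=2)
```

With `--lazy_preprocess True`, FastChat tokenizes these conversations on the fly rather than preprocessing the entire file up front.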