Update README.md
README.md CHANGED
@@ -14,4 +14,56 @@ Write a story using this writing prompt: As a prank a witch detached your cock a

Apparently RP has also become a bit less sloppy by coincidence.

We are looking into opening the datasets up. I'm a bit tired at the moment, but you can also just grab this torrent of the entire Reddit, select only the subreddits you want, and DIY it: [https://academictorrents.com/details/56aa49f9653ba545f48df2e33679f014d2829c10]
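
The dumps in that torrent are, as far as I know, zstd-compressed ndjson: one Reddit submission or comment per line, each with a `subreddit` field. Here's a rough sketch of the DIY filtering step (the file names and the choice of subreddit are placeholders, not what we actually used):

```
import json
import zstandard as zstd

WANTED = {"WritingPrompts"}  # placeholder: whatever subreddits you want

def filter_dump(dump_path, out_path):
    """Stream one .zst dump and keep only lines from WANTED subreddits."""
    with open(dump_path, "rb") as fh, open(out_path, "w", encoding="utf-8") as out:
        # The reddit dumps are compressed with a very large zstd window,
        # so the decompressor needs max_window_size raised.
        reader = zstd.ZstdDecompressor(max_window_size=2**31).stream_reader(fh)
        tail = ""
        while chunk := reader.read(2**24):
            # errors="ignore" is a shortcut: a multi-byte char split across
            # chunk boundaries gets dropped, which is fine for a rough pass.
            lines = (tail + chunk.decode("utf-8", errors="ignore")).split("\n")
            tail = lines.pop()  # keep the trailing partial line for next round
            for line in lines:
                try:
                    obj = json.loads(line)
                except json.JSONDecodeError:
                    continue
                if obj.get("subreddit") in WANTED:
                    out.write(line + "\n")

filter_dump("RC_2023-12.zst", "filtered.ndjson")  # placeholder file names
```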

(For context: this model was a test run on a small dataset. It will be scaled up later.)

## Training Config:

Thanks a lot to llamafactory; this was the easiest training run I've ever done so far.

```
llamafactory-cli train \
    --stage kto \
    --do_train True \
    --model_name_or_path cognitivecomputations/dolphin-2.9.1-llama-3-8b \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --quantization_bit 8 \
    --template chatml \
    --flash_attn auto \
    --use_unsloth True \
    --dataset_dir /workspace/kto \
    --dataset kto_dataset \
    --cutoff_len 2048 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 500 \
    --warmup_steps 50 \
    --optim adamw_torch \
    --packing False \
    --report_to all \
    --output_dir saves/LLaMA3-8B/lora/train_2024-06-15-15-18-25 \
    --bf16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --lora_rank 32 \
    --lora_alpha 32 \
    --lora_dropout 0 \
    --lora_target all \
    --pref_beta 0.1 \
    --pref_ftx 0 \
    --pref_loss sigmoid \
    --val_size 0.05 \
    --eval_strategy steps \
    --eval_steps 50 \
    --per_device_eval_batch_size 2
```
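
(With `--per_device_train_batch_size 2` and `--gradient_accumulation_steps 8`, the effective batch size works out to 16 per device.)

For anyone reproducing this: `--dataset kto_dataset` gets resolved through a `dataset_info.json` inside `--dataset_dir`. Here's a minimal sketch of that wiring, modeled on the `kto_en_demo` example bundled with llamafactory; the field names and the two toy examples are assumptions, not our actual data:

```
import json

# Hypothetical layout for --dataset_dir /workspace/kto, modeled on the
# kto_en_demo example from the llamafactory repo. Everything below
# (field names, contents) is an assumption, not the real dataset.
dataset_info = {
    "kto_dataset": {
        "file_name": "kto_dataset.json",
        "formatting": "sharegpt",
        "columns": {"messages": "messages", "kto_tag": "label"},
    }
}

# KTO examples are single conversations with a boolean desirability tag:
# label=True for completions to reinforce, label=False for ones to suppress.
examples = [
    {
        "messages": [
            {"role": "user", "content": "Write a story using this writing prompt: ..."},
            {"role": "assistant", "content": "A made-up desirable completion."},
        ],
        "label": True,
    },
    {
        "messages": [
            {"role": "user", "content": "Write a story using this writing prompt: ..."},
            {"role": "assistant", "content": "A made-up sloppy completion."},
        ],
        "label": False,
    },
]

with open("/workspace/kto/dataset_info.json", "w", encoding="utf-8") as f:
    json.dump(dataset_info, f, indent=2)
with open("/workspace/kto/kto_dataset.json", "w", encoding="utf-8") as f:
    json.dump(examples, f, indent=2)
```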