```
ollama pull Tohur/natsumura-storytelling-rp-llama-3.1
```
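Once pulled, the model can be started locally with `ollama run` (standard Ollama usage; shown here as a sketch):

```shell
# Start an interactive chat session with the pulled model
ollama run Tohur/natsumura-storytelling-rp-llama-3.1
```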
- tdh87/Just-stories
- tdh87/Just-stories-2
The following parameters were used in [Llama Factory](https://github.com/hiyouga/LLaMA-Factory) during training:

- per_device_train_batch_size=2
- gradient_accumulation_steps=4
- lr_scheduler_type="cosine"
- logging_steps=10
- warmup_ratio=0.1
- save_steps=1000
- learning_rate=2e-5
- num_train_epochs=3.0
- max_samples=500
- max_grad_norm=1.0
- quantization_bit=4
- loraplus_lr_ratio=16.0
- fp16=True
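These parameters correspond to LLaMA Factory's training arguments. As a hedged sketch of how such a run might be launched, the command below assembles them into a single CLI invocation; the `llamafactory-cli` entry point, model name, dataset name, and LoRA/output settings are assumptions for illustration, not taken from this README:

```shell
# Sketch only: model, dataset, and output paths are placeholders.
llamafactory-cli train \
  --stage sft \
  --do_train \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --dataset your_dataset \
  --finetuning_type lora \
  --output_dir ./saves/storytelling-lora \
  --per_device_train_batch_size 2 \
  --gradient_accumulation_steps 4 \
  --lr_scheduler_type cosine \
  --logging_steps 10 \
  --warmup_ratio 0.1 \
  --save_steps 1000 \
  --learning_rate 2e-5 \
  --num_train_epochs 3.0 \
  --max_samples 500 \
  --max_grad_norm 1.0 \
  --quantization_bit 4 \
  --loraplus_lr_ratio 16.0 \
  --fp16
```

Note that with gradient accumulation, the effective batch size per optimizer step is 2 × 4 = 8 samples (per device).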
## Inference
I use the following settings for inference: