davidhornshaw committed
Commit 953bad0 · Parent(s): ca3d4dd
Added more detail to training hyperparams

README.md CHANGED
@@ -69,11 +69,20 @@ It is a dataset designed for ORPO or DPO training. See Fine-tune Llama 3 with

### Training Procedure

- We used the trl [ORPO trainer](https://huggingface.co/docs/trl/main/en/orpo_trainer) for finetuning
+ We used the trl [ORPO trainer](https://huggingface.co/docs/trl/main/en/orpo_trainer) for finetuning over four epochs with a batch size of two.
+ Moreover, we used [LoRA](https://arxiv.org/abs/2106.09685) for parameter-efficient training by targeting only particular parts of the base model architecture.

### Training Hyperparameters

- **Training regime:** fp16 non-mixed precision
+ - **Max length:** 4096
+ - **Max prompt length:** 4096
+ - **Batch size:** 2
+ - **Epochs trained:** 4
+ - **Modules targeted:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ - **Bias:** None
+
+ All remaining hyperparameters were kept standard.

# Evaluation

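Taken together, these settings map onto the trl ORPO trainer and a peft LoRA configuration. The sketch below shows one way to wire them up; it is an illustrative assumption rather than the repository's actual training script. The base model identifier, dataset name, and output directory are placeholders, and reading "fp16 non-mixed precision" as loading the weights directly in float16 is likewise an assumption.

```python
# Minimal, hypothetical sketch of the setup described above: ORPO training via trl
# with LoRA restricted to the listed projection modules. Model and dataset names
# are placeholders, not taken from this repository.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "your-base-model-id"            # placeholder
preference_data = "your/preference-dataset"  # placeholder; needs prompt/chosen/rejected columns

# One reading of "fp16 non-mixed precision": load the weights directly in float16.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA applied only to the attention and MLP projections named above; rank, alpha,
# and dropout are left at their peft defaults ("remaining hyperparameters standard").
peft_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# ORPO arguments mirroring the listed hyperparameters; everything else stays at default.
args = ORPOConfig(
    output_dir="orpo-finetune",              # placeholder
    max_length=4096,
    max_prompt_length=4096,
    per_device_train_batch_size=2,
    num_train_epochs=4,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=load_dataset(preference_data, split="train"),
    tokenizer=tokenizer,                     # renamed to processing_class= in newer trl releases
    peft_config=peft_config,
)
trainer.train()
```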