Text Generation
Safetensors
English
qwen2
davidhornshaw commited on
Commit
953bad0
1 Parent(s): ca3d4dd

Added more detail to training hyperparams

Browse files
Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -69,11 +69,20 @@ It is a dataset designed for ORPO or DPO training. See Fine-tune Llama 3 with
69
 
70
  ### Training Procedure
71
 
72
- We used the trl [ORPO trainer](https://huggingface.co/docs/trl/main/en/orpo_trainer) for finetuning, together with [LoRa](https://arxiv.org/abs/2106.09685) for speed-up.
 
73
 
74
  ### Training Hyperparameters
75
 
76
  - **Training regime:** fp16 non-mixed precision
77
 
78
  # Evaluation
79
 
 
69
 
70
  ### Training Procedure
71
 
72
+ We used the trl [ORPO trainer](https://huggingface.co/docs/trl/main/en/orpo_trainer) for fine-tuning over four epochs with a batch size of two.
73
+ We also used [LoRA](https://arxiv.org/abs/2106.09685) for parameter-efficient training, targeting only selected modules of the base model architecture.
74
 
75
  ### Training Hyperparameters
76
 
77
  - **Training regime:** fp16 non-mixed precision
78
+ - **Max length:** 4096
79
+ - **Max prompt length:** 4096
80
+ - **Batch size:** 2
81
+ - **Epochs trained:** 4
82
+ - **Modules targeted:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
83
+ - **Bias:** None
84
+
85
+ All other hyperparameters were left at their default values.
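As a rough sketch of how the hyperparameters above map onto a trl/peft setup (the output path is a placeholder, and only the settings listed above are assumed; everything else stays at its default):

```python
from peft import LoraConfig
from trl import ORPOConfig

# LoRA adapter targeting the attention and MLP projections listed above;
# bias terms are left untrained ("none").
peft_config = LoraConfig(
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
)

# ORPO training arguments matching the hyperparameters above.
training_args = ORPOConfig(
    output_dir="orpo-output",       # placeholder path
    max_length=4096,
    max_prompt_length=4096,
    per_device_train_batch_size=2,
    num_train_epochs=4,
)
```

These two configs would then be passed to `ORPOTrainer` together with the base model and the preference dataset. Note that for non-mixed fp16 the base model itself would be loaded in `torch.float16`, rather than relying on the trainer's mixed-precision flag.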
86
 
87
  # Evaluation
88