Kleo committed
Commit 24ce2de · verified · 1 Parent(s): 0c0c772

Update README.md

Files changed (1)
  1. README.md +16 -13
README.md CHANGED
@@ -145,7 +145,7 @@ Machine translated train set of [ArgKP_2021_GR](https://huggingface.co/datasets/
  |LoRA r | 20 |
  |LoRA alpha | 9 |
  |LoRA dropout |0.0 |
- |LoRA bias |‘none' |
+ |LoRA bias |'none' |
  |target_modules |q_proj, v_proj |
  |task_type |"SEQ_CLS" |
  |Loss |BCE |
@@ -153,20 +153,23 @@ Machine translated train set of [ArgKP_2021_GR](https://huggingface.co/datasets/
 
  ### Training Procedure
  The following hyperparameters were used during training:
- learning_rate: 1e-4
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- num_devices: 1
- gradient_accumulation_steps: 2
- optimizer: paged Adam optimizer
- lr_scheduler_type: linear
- Weight Decay: 0.01
- M. G. Norm: 0.3
- max_seq_length: 512
- num_epochs: 1
 
 
+ |Hyperparameter | Value |
+ |----------------------------|-------------------------------------|
+ |l_r | 1e-4 |
+ |lr_scheduler_type |linear |
+ |train_batch_size | 16 |
+ |eval_batch_size |16 |
+ |seed |42 |
+ |num_devices |1 |
+ |gradient_accumulation_steps |2 |
+ |optimizer |paged Adam |
+ |Weight Decay | 0.01 |
+ |max grad norm | 0.3 |
+ |max_seq_length |512 |
+ |num_epochs |1 |
+
 
  #### Training hyperparameters
 
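
The values in the tables above imply two derived quantities that are not stated explicitly. The sketch below computes them under two assumptions that the README does not confirm: that `train_batch_size` is a per-device value, and that the LoRA update is scaled by the conventional factor `alpha / r`. The class and method names (`TrainSetup`, `lora_scaling`, `effective_batch_size`) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSetup:
    """Hyperparameters copied from the README tables (hypothetical container)."""

    lora_r: int = 20
    lora_alpha: int = 9
    train_batch_size: int = 16  # assumed to be per device
    num_devices: int = 1
    gradient_accumulation_steps: int = 2

    def lora_scaling(self) -> float:
        # Conventional LoRA scaling factor alpha / r (an assumption here).
        return self.lora_alpha / self.lora_r

    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer step.
        return (
            self.train_batch_size
            * self.num_devices
            * self.gradient_accumulation_steps
        )


setup = TrainSetup()
print(f"LoRA scaling: {setup.lora_scaling():.2f}")
print(f"Effective batch size: {setup.effective_batch_size()}")
```

If those assumptions hold, each optimizer step sees 16 × 1 × 2 = 32 samples, and the adapter output is scaled by 9 / 20 = 0.45.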