Commit 334af1d, committed by Myashka
Parent: 7006e61

End of training

Files changed (1)
  1. README.md +30 -0
README.md CHANGED
@@ -13,6 +13,16 @@ should probably proofread and complete it, then remove this comment. -->
  # gpt-imdb-ipo_annealing

  This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 125.6974
+ - Rewards/chosen: -0.0343
+ - Rewards/rejected: -0.1277
+ - Rewards/accuracies: 0.875
+ - Rewards/margins: 0.0934
+ - Logps/rejected: -267.1282
+ - Logps/chosen: -236.1897
+ - Logits/rejected: -31.3501
+ - Logits/chosen: -31.5916

  ## Model description

@@ -40,6 +50,26 @@ The following hyperparameters were used during training:
  - lr_scheduler_warmup_steps: 150
  - training_steps: 7197

+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 16.3187 | 0.21 | 500 | 34.0876 | 0.1161 | -0.1126 | 0.5292 | 0.2287 | -263.8062 | -235.1407 | -33.1877 | -33.4371 |
+ | 5.5155 | 0.42 | 1000 | 13.0423 | -0.1485 | -0.3812 | 0.5042 | 0.2327 | -264.1273 | -235.4375 | -35.2608 | -35.4541 |
+ | 10.2532 | 0.63 | 1500 | 18.5157 | -0.4407 | -0.5471 | 0.5458 | 0.1064 | -264.3746 | -235.8205 | -34.2230 | -34.4246 |
+ | 6.755 | 0.83 | 2000 | 28.1593 | -0.7791 | -0.8052 | 0.5917 | 0.0261 | -264.7961 | -236.3400 | -33.6119 | -33.8069 |
+ | 9.4126 | 1.04 | 2500 | 9.2406 | -0.8733 | -1.2564 | 0.6229 | 0.3831 | -265.6003 | -236.5962 | -31.9471 | -32.0700 |
+ | 8.5908 | 1.25 | 3000 | 12.4967 | -0.6700 | -1.0163 | 0.6167 | 0.3462 | -265.4156 | -236.4061 | -31.6914 | -31.8443 |
+ | 19.5217 | 1.46 | 3500 | 6.8889 | -0.0720 | -0.4689 | 0.6854 | 0.3969 | -264.5895 | -235.4041 | -32.1300 | -32.2692 |
+ | 6.9195 | 1.67 | 4000 | 4.2435 | -0.5324 | -0.9335 | 0.7021 | 0.4012 | -265.7609 | -236.4489 | -31.8342 | -31.9606 |
+ | 4.6993 | 1.88 | 4500 | 5.0987 | -0.2002 | -0.6179 | 0.7521 | 0.4177 | -265.3070 | -235.7907 | -31.6301 | -31.7617 |
+ | 2.7896 | 2.08 | 5000 | 2.7344 | -0.2390 | -0.5589 | 0.7500 | 0.3199 | -265.4754 | -236.0307 | -31.9650 | -32.1009 |
+ | 3.2262 | 2.29 | 5500 | 3.0584 | -0.1936 | -0.5168 | 0.8083 | 0.3231 | -265.8080 | -236.0606 | -31.6585 | -31.8243 |
+ | 4.1965 | 2.5 | 6000 | 4.2350 | -0.1555 | -0.4440 | 0.8417 | 0.2884 | -266.2272 | -236.1557 | -31.6484 | -31.8344 |
+ | 15.1482 | 2.71 | 6500 | 10.8174 | -0.0932 | -0.3244 | 0.8667 | 0.2312 | -266.7491 | -236.1454 | -31.4600 | -31.6800 |
+ | 145.9251 | 2.92 | 7000 | 125.6974 | -0.0343 | -0.1277 | 0.875 | 0.0934 | -267.1282 | -236.1897 | -31.3501 | -31.5916 |
+
+
  ### Framework versions

  - Transformers 4.35.2
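
A note on reading the reward columns added in this commit: the logged values look consistent with `Rewards/margins = Rewards/chosen - Rewards/rejected`. For the final evaluation, -0.0343 - (-0.1277) = 0.0934. The model card does not state this relationship, so treat it as an assumption. The sketch below checks it against every row of the training results table. The names `rows` and `margin_gap` are illustrative only, and the numbers are copied from the table.

```python
# (step, Rewards/chosen, Rewards/rejected, Rewards/margins), copied from the table.
rows = [
    (500, 0.1161, -0.1126, 0.2287),
    (1000, -0.1485, -0.3812, 0.2327),
    (1500, -0.4407, -0.5471, 0.1064),
    (2000, -0.7791, -0.8052, 0.0261),
    (2500, -0.8733, -1.2564, 0.3831),
    (3000, -0.6700, -1.0163, 0.3462),
    (3500, -0.0720, -0.4689, 0.3969),
    (4000, -0.5324, -0.9335, 0.4012),
    (4500, -0.2002, -0.6179, 0.4177),
    (5000, -0.2390, -0.5589, 0.3199),
    (5500, -0.1936, -0.5168, 0.3231),
    (6000, -0.1555, -0.4440, 0.2884),
    (6500, -0.0932, -0.3244, 0.2312),
    (7000, -0.0343, -0.1277, 0.0934),
]


def margin_gap(chosen, rejected, margin):
    """Absolute difference between the logged margin and chosen - rejected."""
    return abs((chosen - rejected) - margin)


# Assumption: margin = chosen - rejected (not stated in the model card).
gaps = {step: margin_gap(c, r, m) for step, c, r, m in rows}
max_gap = max(gaps.values())

# The table rounds to four decimals, so allow a small tolerance.
consistent = max_gap < 2e-4
print(f"largest gap: {max_gap:.4f}, consistent: {consistent}")
```

The tolerance is there because each column is rounded independently, so a logged margin can differ from the recomputed one in the last decimal place.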