aruca committed on
Commit 764e7ff
1 Parent(s): 89618ab

End of training
README.md CHANGED
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google/pegasus-x-base](https://huggingface.co/google/pegasus-x-base) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.8689
+ - Loss: 1.2446

  ## Model description

@@ -33,133 +33,76 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 5e-05
+ - learning_rate: 0.0001
  - train_batch_size: 1
  - eval_batch_size: 1
  - seed: 42
  - gradient_accumulation_steps: 16
  - total_train_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.2
- - num_epochs: 35
+ - lr_scheduler_type: reduce_lr_on_plateau
+ - num_epochs: 3

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 5.5063 | 0.31 | 10 | 4.8592 |
- | 5.2167 | 0.61 | 20 | 4.4484 |
- | 4.7319 | 0.92 | 30 | 4.2347 |
- | 4.548 | 1.23 | 40 | 4.1020 |
- | 4.4557 | 1.54 | 50 | 3.8821 |
- | 4.2263 | 1.84 | 60 | 3.6951 |
- | 4.027 | 2.15 | 70 | 3.4991 |
- | 3.7787 | 2.46 | 80 | 3.3147 |
- | 3.7009 | 2.76 | 90 | 3.1642 |
- | 3.4021 | 3.07 | 100 | 3.0492 |
- | 3.3387 | 3.38 | 110 | 2.9474 |
- | 3.2382 | 3.69 | 120 | 2.8471 |
- | 3.1799 | 3.99 | 130 | 2.7752 |
- | 3.0276 | 4.3 | 140 | 2.6837 |
- | 2.8994 | 4.61 | 150 | 2.5888 |
- | 2.7174 | 4.91 | 160 | 2.5244 |
- | 2.7552 | 5.22 | 170 | 2.4612 |
- | 2.5894 | 5.53 | 180 | 2.4003 |
- | 2.5998 | 5.83 | 190 | 2.3504 |
- | 2.3702 | 6.14 | 200 | 2.3181 |
- | 2.4051 | 6.45 | 210 | 2.2775 |
- | 2.3176 | 6.76 | 220 | 2.2385 |
- | 2.2954 | 7.06 | 230 | 2.2059 |
- | 2.2362 | 7.37 | 240 | 2.1733 |
- | 2.1686 | 7.68 | 250 | 2.1486 |
- | 2.1477 | 7.98 | 260 | 2.1336 |
- | 2.0672 | 8.29 | 270 | 2.1180 |
- | 2.0703 | 8.6 | 280 | 2.0947 |
- | 2.0453 | 8.91 | 290 | 2.0771 |
- | 1.9572 | 9.21 | 300 | 2.0595 |
- | 2.017 | 9.52 | 310 | 2.0392 |
- | 1.9211 | 9.83 | 320 | 2.0319 |
- | 1.9451 | 10.13 | 330 | 2.0261 |
- | 1.8032 | 10.44 | 340 | 2.0047 |
- | 1.87 | 10.75 | 350 | 1.9914 |
- | 1.8463 | 11.06 | 360 | 1.9857 |
- | 1.7886 | 11.36 | 370 | 1.9775 |
- | 1.795 | 11.67 | 380 | 1.9719 |
- | 1.7806 | 11.98 | 390 | 1.9593 |
- | 1.7128 | 12.28 | 400 | 1.9562 |
- | 1.7972 | 12.59 | 410 | 1.9469 |
- | 1.7145 | 12.9 | 420 | 1.9444 |
- | 1.6469 | 13.21 | 430 | 1.9379 |
- | 1.6306 | 13.51 | 440 | 1.9423 |
- | 1.7196 | 13.82 | 450 | 1.9258 |
- | 1.6259 | 14.13 | 460 | 1.9311 |
- | 1.6751 | 14.43 | 470 | 1.9217 |
- | 1.5793 | 14.74 | 480 | 1.9248 |
- | 1.6403 | 15.05 | 490 | 1.9135 |
- | 1.5633 | 15.36 | 500 | 1.9136 |
- | 1.5209 | 15.66 | 510 | 1.9165 |
- | 1.6686 | 15.97 | 520 | 1.9035 |
- | 1.5077 | 16.28 | 530 | 1.9085 |
- | 1.5747 | 16.58 | 540 | 1.9034 |
- | 1.5467 | 16.89 | 550 | 1.8961 |
- | 1.4689 | 17.2 | 560 | 1.9046 |
- | 1.5466 | 17.5 | 570 | 1.8901 |
- | 1.4839 | 17.81 | 580 | 1.8954 |
- | 1.4831 | 18.12 | 590 | 1.8928 |
- | 1.461 | 18.43 | 600 | 1.8942 |
- | 1.5342 | 18.73 | 610 | 1.8844 |
- | 1.4588 | 19.04 | 620 | 1.8866 |
- | 1.5096 | 19.35 | 630 | 1.8856 |
- | 1.4514 | 19.65 | 640 | 1.8853 |
- | 1.3916 | 19.96 | 650 | 1.8857 |
- | 1.4348 | 20.27 | 660 | 1.8841 |
- | 1.4175 | 20.58 | 670 | 1.8826 |
- | 1.4346 | 20.88 | 680 | 1.8786 |
- | 1.4478 | 21.19 | 690 | 1.8785 |
- | 1.3651 | 21.5 | 700 | 1.8772 |
- | 1.4411 | 21.8 | 710 | 1.8781 |
- | 1.4106 | 22.11 | 720 | 1.8759 |
- | 1.3201 | 22.42 | 730 | 1.8727 |
- | 1.4129 | 22.73 | 740 | 1.8710 |
- | 1.4052 | 23.03 | 750 | 1.8767 |
- | 1.3455 | 23.34 | 760 | 1.8753 |
- | 1.3503 | 23.65 | 770 | 1.8793 |
- | 1.3773 | 23.95 | 780 | 1.8723 |
- | 1.3728 | 24.26 | 790 | 1.8731 |
- | 1.3508 | 24.57 | 800 | 1.8718 |
- | 1.3377 | 24.88 | 810 | 1.8710 |
- | 1.3541 | 25.18 | 820 | 1.8755 |
- | 1.3484 | 25.49 | 830 | 1.8741 |
- | 1.3207 | 25.8 | 840 | 1.8721 |
- | 1.293 | 26.1 | 850 | 1.8745 |
- | 1.3141 | 26.41 | 860 | 1.8747 |
- | 1.3211 | 26.72 | 870 | 1.8720 |
- | 1.34 | 27.02 | 880 | 1.8689 |
- | 1.354 | 27.33 | 890 | 1.8739 |
- | 1.2974 | 27.64 | 900 | 1.8699 |
- | 1.2708 | 27.95 | 910 | 1.8725 |
- | 1.267 | 28.25 | 920 | 1.8727 |
- | 1.3288 | 28.56 | 930 | 1.8725 |
- | 1.3287 | 28.87 | 940 | 1.8703 |
- | 1.2591 | 29.17 | 950 | 1.8736 |
- | 1.2843 | 29.48 | 960 | 1.8765 |
- | 1.3456 | 29.79 | 970 | 1.8684 |
- | 1.3054 | 30.1 | 980 | 1.8651 |
- | 1.1838 | 30.4 | 990 | 1.8703 |
- | 1.2867 | 30.71 | 1000 | 1.8729 |
- | 1.3736 | 31.02 | 1010 | 1.8684 |
- | 1.3065 | 31.32 | 1020 | 1.8667 |
- | 1.2771 | 31.63 | 1030 | 1.8713 |
- | 1.2562 | 31.94 | 1040 | 1.8730 |
- | 1.3123 | 32.25 | 1050 | 1.8712 |
- | 1.2836 | 32.55 | 1060 | 1.8692 |
- | 1.271 | 32.86 | 1070 | 1.8687 |
- | 1.269 | 33.17 | 1080 | 1.8685 |
- | 1.2472 | 33.47 | 1090 | 1.8681 |
- | 1.2518 | 33.78 | 1100 | 1.8687 |
- | 1.3026 | 34.09 | 1110 | 1.8689 |
- | 1.3021 | 34.4 | 1120 | 1.8689 |
+ | 4.0461 | 0.05 | 10 | 2.6611 |
+ | 2.8656 | 0.11 | 20 | 2.3313 |
+ | 2.5122 | 0.16 | 30 | 2.1502 |
+ | 2.3941 | 0.21 | 40 | 2.0071 |
+ | 2.2445 | 0.27 | 50 | 1.9110 |
+ | 2.205 | 0.32 | 60 | 1.8388 |
+ | 2.1341 | 0.37 | 70 | 1.7728 |
+ | 1.9793 | 0.43 | 80 | 1.7464 |
+ | 1.8616 | 0.48 | 90 | 1.6930 |
+ | 1.8848 | 0.53 | 100 | 1.6589 |
+ | 1.8432 | 0.59 | 110 | 1.6232 |
+ | 1.7926 | 0.64 | 120 | 1.5996 |
+ | 1.7956 | 0.69 | 130 | 1.5898 |
+ | 1.7327 | 0.75 | 140 | 1.5683 |
+ | 1.8604 | 0.8 | 150 | 1.5416 |
+ | 1.7607 | 0.85 | 160 | 1.5211 |
+ | 1.7807 | 0.91 | 170 | 1.5025 |
+ | 1.6985 | 0.96 | 180 | 1.4906 |
+ | 1.6284 | 1.01 | 190 | 1.4781 |
+ | 1.5689 | 1.07 | 200 | 1.4680 |
+ | 1.4443 | 1.12 | 210 | 1.4602 |
+ | 1.564 | 1.17 | 220 | 1.4439 |
+ | 1.4824 | 1.23 | 230 | 1.4327 |
+ | 1.4463 | 1.28 | 240 | 1.4247 |
+ | 1.5279 | 1.33 | 250 | 1.4195 |
+ | 1.4522 | 1.39 | 260 | 1.3928 |
+ | 1.5307 | 1.44 | 270 | 1.3943 |
+ | 1.4977 | 1.49 | 280 | 1.3779 |
+ | 1.5163 | 1.55 | 290 | 1.3756 |
+ | 1.4912 | 1.6 | 300 | 1.3558 |
+ | 1.5212 | 1.65 | 310 | 1.3539 |
+ | 1.4575 | 1.71 | 320 | 1.3424 |
+ | 1.3196 | 1.76 | 330 | 1.3386 |
+ | 1.3492 | 1.81 | 340 | 1.3257 |
+ | 1.4383 | 1.87 | 350 | 1.3248 |
+ | 1.4726 | 1.92 | 360 | 1.3168 |
+ | 1.3496 | 1.97 | 370 | 1.3117 |
+ | 1.2985 | 2.03 | 380 | 1.3181 |
+ | 1.1527 | 2.08 | 390 | 1.3094 |
+ | 1.2825 | 2.13 | 400 | 1.3112 |
+ | 1.2893 | 2.19 | 410 | 1.2984 |
+ | 1.2076 | 2.24 | 420 | 1.2880 |
+ | 1.3257 | 2.29 | 430 | 1.2904 |
+ | 1.3425 | 2.35 | 440 | 1.2756 |
+ | 1.2814 | 2.4 | 450 | 1.2806 |
+ | 1.3054 | 2.45 | 460 | 1.2782 |
+ | 1.1984 | 2.51 | 470 | 1.2767 |
+ | 1.2381 | 2.56 | 480 | 1.2653 |
+ | 1.1786 | 2.61 | 490 | 1.2712 |
+ | 1.1959 | 2.67 | 500 | 1.2534 |
+ | 1.2749 | 2.72 | 510 | 1.2548 |
+ | 1.2894 | 2.77 | 520 | 1.2531 |
+ | 1.2131 | 2.83 | 530 | 1.2561 |
+ | 1.226 | 2.88 | 540 | 1.2459 |
+ | 1.1534 | 2.93 | 550 | 1.2466 |
+ | 1.2492 | 2.99 | 560 | 1.2446 |


  ### Framework versions
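The key change in this README hunk is the scheduler: `linear` with warmup is replaced by `reduce_lr_on_plateau`, with the learning rate raised to 0.0001 and the run shortened to 3 epochs. As a rough illustration of what a reduce-on-plateau schedule does (a minimal sketch, not the transformers implementation — the `factor` and `patience` values below are illustrative defaults, not taken from this commit):

```python
class ReduceLROnPlateauSketch:
    """Minimal sketch: cut the learning rate by `factor` after more than
    `patience` consecutive evaluations with no improvement in val loss."""

    def __init__(self, lr=1e-4, factor=0.1, patience=2):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_evals = 0

    def step(self, val_loss):
        if val_loss < self.best:          # new best: reset the counter
            self.best = val_loss
            self.bad_evals = 0
        else:                             # no improvement this eval
            self.bad_evals += 1
            if self.bad_evals > self.patience:
                self.lr *= self.factor    # plateau detected: decay LR
                self.bad_evals = 0
        return self.lr


# Improving losses keep the LR at 1e-4; a plateau triggers one decay.
sched = ReduceLROnPlateauSketch(lr=1e-4)
for loss in [2.66, 2.33, 2.15, 2.16, 2.17, 2.18, 2.19]:
    lr = sched.step(loss)
```

Unlike the previous linear schedule, this only changes the learning rate in response to the validation metric, which pairs naturally with the per-10-step evaluations logged in the table above.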
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a99f4bb99c472cda1b3df9a39cb4464397c0ae5836d66e01d99c9c14e659f589
+ oid sha256:dc8e447884be60e77149a73101e5f9b3088f3ed87e01898b51acc35cc8bbc921
  size 1089213696
runs/Feb20_18-52-00_7d984f560b48/events.out.tfevents.1708455121.7d984f560b48.508.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3d587cb21b341f1b839d4f1244d3ee08b921a7049961c46a5def8e12e0cae208
+ size 29477
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8bc135d6b39a7b6c4883d89fa372bfc43b3121ccc961cceea0490abab8d80844
+ oid sha256:677a69c908eb2c08c8f550291b2d60f3196db02f79af837933ae8b09bbb85adc
  size 4728
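The binary files in this commit are tracked with Git LFS, so the diffs above show pointer files rather than the weights themselves: three `key value` lines giving the spec version, the SHA-256 of the real content, and its size in bytes. A minimal sketch of parsing such a pointer (illustrative helper, not part of the git-lfs client):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS v1 pointer file into a dict.

    A pointer consists of `key value` lines: version, oid, size.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),  # bare hex digest
        "size": int(fields["size"]),                   # bytes of real blob
    }


# The new training_args.bin pointer from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:677a69c908eb2c08c8f550291b2d60f3196db02f79af837933ae8b09bbb85adc
size 4728
"""
info = parse_lfs_pointer(pointer)
```

This is why `model.safetensors` shows only a one-line `oid` change for a ~1 GB file: Git stores the tiny pointer, and the LFS server stores the blob addressed by that hash.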