End of training
README.md
CHANGED
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/pegasus-x-base](https://huggingface.co/google/pegasus-x-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.2446
 
 ## Model description
 
@@ -33,133 +33,76 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
+- learning_rate: 0.0001
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 16
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type:
--
-- num_epochs: 35
+- lr_scheduler_type: reduce_lr_on_plateau
+- num_epochs: 3
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.5466 | 17.5 | 570 | 1.8901 |
-| 1.4839 | 17.81 | 580 | 1.8954 |
-| 1.4831 | 18.12 | 590 | 1.8928 |
-| 1.461 | 18.43 | 600 | 1.8942 |
-| 1.5342 | 18.73 | 610 | 1.8844 |
-| 1.4588 | 19.04 | 620 | 1.8866 |
-| 1.5096 | 19.35 | 630 | 1.8856 |
-| 1.4514 | 19.65 | 640 | 1.8853 |
-| 1.3916 | 19.96 | 650 | 1.8857 |
-| 1.4348 | 20.27 | 660 | 1.8841 |
-| 1.4175 | 20.58 | 670 | 1.8826 |
-| 1.4346 | 20.88 | 680 | 1.8786 |
-| 1.4478 | 21.19 | 690 | 1.8785 |
-| 1.3651 | 21.5 | 700 | 1.8772 |
-| 1.4411 | 21.8 | 710 | 1.8781 |
-| 1.4106 | 22.11 | 720 | 1.8759 |
-| 1.3201 | 22.42 | 730 | 1.8727 |
-| 1.4129 | 22.73 | 740 | 1.8710 |
-| 1.4052 | 23.03 | 750 | 1.8767 |
-| 1.3455 | 23.34 | 760 | 1.8753 |
-| 1.3503 | 23.65 | 770 | 1.8793 |
-| 1.3773 | 23.95 | 780 | 1.8723 |
-| 1.3728 | 24.26 | 790 | 1.8731 |
-| 1.3508 | 24.57 | 800 | 1.8718 |
-| 1.3377 | 24.88 | 810 | 1.8710 |
-| 1.3541 | 25.18 | 820 | 1.8755 |
-| 1.3484 | 25.49 | 830 | 1.8741 |
-| 1.3207 | 25.8 | 840 | 1.8721 |
-| 1.293 | 26.1 | 850 | 1.8745 |
-| 1.3141 | 26.41 | 860 | 1.8747 |
-| 1.3211 | 26.72 | 870 | 1.8720 |
-| 1.34 | 27.02 | 880 | 1.8689 |
-| 1.354 | 27.33 | 890 | 1.8739 |
-| 1.2974 | 27.64 | 900 | 1.8699 |
-| 1.2708 | 27.95 | 910 | 1.8725 |
-| 1.267 | 28.25 | 920 | 1.8727 |
-| 1.3288 | 28.56 | 930 | 1.8725 |
-| 1.3287 | 28.87 | 940 | 1.8703 |
-| 1.2591 | 29.17 | 950 | 1.8736 |
-| 1.2843 | 29.48 | 960 | 1.8765 |
-| 1.3456 | 29.79 | 970 | 1.8684 |
-| 1.3054 | 30.1 | 980 | 1.8651 |
-| 1.1838 | 30.4 | 990 | 1.8703 |
-| 1.2867 | 30.71 | 1000 | 1.8729 |
-| 1.3736 | 31.02 | 1010 | 1.8684 |
-| 1.3065 | 31.32 | 1020 | 1.8667 |
-| 1.2771 | 31.63 | 1030 | 1.8713 |
-| 1.2562 | 31.94 | 1040 | 1.8730 |
-| 1.3123 | 32.25 | 1050 | 1.8712 |
-| 1.2836 | 32.55 | 1060 | 1.8692 |
-| 1.271 | 32.86 | 1070 | 1.8687 |
-| 1.269 | 33.17 | 1080 | 1.8685 |
-| 1.2472 | 33.47 | 1090 | 1.8681 |
-| 1.2518 | 33.78 | 1100 | 1.8687 |
-| 1.3026 | 34.09 | 1110 | 1.8689 |
-| 1.3021 | 34.4 | 1120 | 1.8689 |
+| 4.0461 | 0.05 | 10 | 2.6611 |
+| 2.8656 | 0.11 | 20 | 2.3313 |
+| 2.5122 | 0.16 | 30 | 2.1502 |
+| 2.3941 | 0.21 | 40 | 2.0071 |
+| 2.2445 | 0.27 | 50 | 1.9110 |
+| 2.205 | 0.32 | 60 | 1.8388 |
+| 2.1341 | 0.37 | 70 | 1.7728 |
+| 1.9793 | 0.43 | 80 | 1.7464 |
+| 1.8616 | 0.48 | 90 | 1.6930 |
+| 1.8848 | 0.53 | 100 | 1.6589 |
+| 1.8432 | 0.59 | 110 | 1.6232 |
+| 1.7926 | 0.64 | 120 | 1.5996 |
+| 1.7956 | 0.69 | 130 | 1.5898 |
+| 1.7327 | 0.75 | 140 | 1.5683 |
+| 1.8604 | 0.8 | 150 | 1.5416 |
+| 1.7607 | 0.85 | 160 | 1.5211 |
+| 1.7807 | 0.91 | 170 | 1.5025 |
+| 1.6985 | 0.96 | 180 | 1.4906 |
+| 1.6284 | 1.01 | 190 | 1.4781 |
+| 1.5689 | 1.07 | 200 | 1.4680 |
+| 1.4443 | 1.12 | 210 | 1.4602 |
+| 1.564 | 1.17 | 220 | 1.4439 |
+| 1.4824 | 1.23 | 230 | 1.4327 |
+| 1.4463 | 1.28 | 240 | 1.4247 |
+| 1.5279 | 1.33 | 250 | 1.4195 |
+| 1.4522 | 1.39 | 260 | 1.3928 |
+| 1.5307 | 1.44 | 270 | 1.3943 |
+| 1.4977 | 1.49 | 280 | 1.3779 |
+| 1.5163 | 1.55 | 290 | 1.3756 |
+| 1.4912 | 1.6 | 300 | 1.3558 |
+| 1.5212 | 1.65 | 310 | 1.3539 |
+| 1.4575 | 1.71 | 320 | 1.3424 |
+| 1.3196 | 1.76 | 330 | 1.3386 |
+| 1.3492 | 1.81 | 340 | 1.3257 |
+| 1.4383 | 1.87 | 350 | 1.3248 |
+| 1.4726 | 1.92 | 360 | 1.3168 |
+| 1.3496 | 1.97 | 370 | 1.3117 |
+| 1.2985 | 2.03 | 380 | 1.3181 |
+| 1.1527 | 2.08 | 390 | 1.3094 |
+| 1.2825 | 2.13 | 400 | 1.3112 |
+| 1.2893 | 2.19 | 410 | 1.2984 |
+| 1.2076 | 2.24 | 420 | 1.2880 |
+| 1.3257 | 2.29 | 430 | 1.2904 |
+| 1.3425 | 2.35 | 440 | 1.2756 |
+| 1.2814 | 2.4 | 450 | 1.2806 |
+| 1.3054 | 2.45 | 460 | 1.2782 |
+| 1.1984 | 2.51 | 470 | 1.2767 |
+| 1.2381 | 2.56 | 480 | 1.2653 |
+| 1.1786 | 2.61 | 490 | 1.2712 |
+| 1.1959 | 2.67 | 500 | 1.2534 |
+| 1.2749 | 2.72 | 510 | 1.2548 |
+| 1.2894 | 2.77 | 520 | 1.2531 |
+| 1.2131 | 2.83 | 530 | 1.2561 |
+| 1.226 | 2.88 | 540 | 1.2459 |
+| 1.1534 | 2.93 | 550 | 1.2466 |
+| 1.2492 | 2.99 | 560 | 1.2446 |
 
 
 ### Framework versions
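The hyperparameters and the final row of the new results table in README.md are enough to sanity-check the size of the run. The sketch below is only an estimate: it assumes a single training device (the model card does not state the device count), and the Epoch column is rounded to two decimals, so the derived figures are approximate.

```python
# Values copied from the hyperparameter list in the new README.
train_batch_size = 1
gradient_accumulation_steps = 16
num_devices = 1  # assumption: the model card does not state the device count

# One optimizer step consumes this many training examples.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Final row of the new results table: epoch 2.99 reached at step 560.
final_epoch, final_step = 2.99, 560

# The Epoch column is rounded, so these are estimates, not exact counts.
steps_per_epoch = final_step / final_epoch
approx_train_examples = steps_per_epoch * total_train_batch_size

print(f"effective batch size: {total_train_batch_size}")
print(f"~{steps_per_epoch:.0f} optimizer steps per epoch")
print(f"~{approx_train_examples:.0f} training examples per epoch")
```

The computed effective batch size agrees with the `total_train_batch_size: 16` entry in the card. The per-epoch example count comes out at roughly 3,000, which should be read as an order-of-magnitude check rather than the actual dataset size.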
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:dc8e447884be60e77149a73101e5f9b3088f3ed87e01898b51acc35cc8bbc921
 size 1089213696
runs/Feb20_18-52-00_7d984f560b48/events.out.tfevents.1708455121.7d984f560b48.508.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d587cb21b341f1b839d4f1244d3ee08b921a7049961c46a5def8e12e0cae208
+size 29477
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:677a69c908eb2c08c8f550291b2d60f3196db02f79af837933ae8b09bbb85adc
 size 4728
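The three-line files changed in this commit (model.safetensors, the TensorBoard events file, training_args.bin) are Git LFS pointers, not the binaries themselves: each line is a key, a space, and a value. A minimal sketch of reading one follows; `parse_lfs_pointer` is a hypothetical helper name used for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Parse a Git LFS pointer file into a key/value mapping.

    Hypothetical helper for illustration; each non-empty line is
    split at the first space into a key and its value.
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# New training_args.bin pointer, copied from this commit's diff.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:677a69c908eb2c08c8f550291b2d60f3196db02f79af837933ae8b09bbb85adc
size 4728
"""

pointer = parse_lfs_pointer(pointer_text)
hash_algo, _, digest = pointer["oid"].partition(":")  # "sha256", hex digest
size_bytes = int(pointer["size"])                     # size of the real file

print(hash_algo, digest[:12], size_bytes)
```

Because `size` is unchanged between the old and new pointers for model.safetensors and training_args.bin, only the `oid` line differs in those hunks: the files were rewritten with different contents of the same byte length.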