RMWeerasinghe committed
Commit
196921a
1 Parent(s): e215eb7

Training complete

Files changed (4):
  1. README.md +48 -34
  2. model.safetensors +1 -1
  3. tokenizer.json +11 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -23,8 +23,7 @@ model-index:
     metrics:
     - name: Rouge1
       type: rouge
-      value: 0.2339
-pipeline_tag: summarization
+      value: 0.2389
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -34,11 +33,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [Falconsai/text_summarization](https://huggingface.co/Falconsai/text_summarization) on the cnn_dailymail dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.5462
-- Rouge1: 0.2339
-- Rouge2: 0.1071
-- Rougel: 0.1909
-- Rougelsum: 0.2199
+- Loss: 1.8119
+- Rouge1: 0.2389
+- Rouge2: 0.1112
+- Rougel: 0.1946
+- Rougelsum: 0.2237
 
 ## Model description
 
@@ -65,37 +64,52 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 25
+- num_epochs: 40
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
-| 14.8371       | 0.99  | 31   | 10.4178         | 0.2031 | 0.0864 | 0.1631 | 0.1907    |
-| 11.0708       | 1.98  | 62   | 8.1794          | 0.2049 | 0.0873 | 0.1642 | 0.1909    |
-| 9.5037        | 2.98  | 93   | 5.3342          | 0.1989 | 0.0804 | 0.1559 | 0.1845    |
-| 6.2278        | 4.0   | 125  | 4.4009          | 0.201  | 0.0855 | 0.1571 | 0.1882    |
-| 5.152         | 4.99  | 156  | 3.4913          | 0.2094 | 0.0883 | 0.1668 | 0.1959    |
-| 3.9293        | 5.98  | 187  | 3.0893          | 0.2221 | 0.0957 | 0.1785 | 0.2083    |
-| 3.6608        | 6.98  | 218  | 2.9988          | 0.2174 | 0.0948 | 0.1775 | 0.2045    |
-| 3.3943        | 8.0   | 250  | 2.9427          | 0.2195 | 0.0959 | 0.179  | 0.2064    |
-| 3.2549        | 8.99  | 281  | 2.9013          | 0.2255 | 0.1002 | 0.1832 | 0.2124    |
-| 3.2028        | 9.98  | 312  | 2.8655          | 0.2298 | 0.1053 | 0.1865 | 0.2165    |
-| 3.1611        | 10.98 | 343  | 2.8306          | 0.2302 | 0.1069 | 0.1878 | 0.218     |
-| 3.1206        | 12.0  | 375  | 2.7931          | 0.2265 | 0.1044 | 0.1847 | 0.2142    |
-| 3.0716        | 12.99 | 406  | 2.7572          | 0.2301 | 0.1077 | 0.1883 | 0.2173    |
-| 3.0376        | 13.98 | 437  | 2.7239          | 0.231  | 0.1057 | 0.1883 | 0.2177    |
-| 3.0154        | 14.98 | 468  | 2.6894          | 0.2319 | 0.1062 | 0.1891 | 0.2177    |
-| 2.9518        | 16.0  | 500  | 2.6593          | 0.233  | 0.1071 | 0.1904 | 0.2192    |
-| 2.9359        | 16.99 | 531  | 2.6332          | 0.2338 | 0.108  | 0.1919 | 0.2208    |
-| 2.8874        | 17.98 | 562  | 2.6124          | 0.2322 | 0.1057 | 0.1896 | 0.2181    |
-| 2.8786        | 18.98 | 593  | 2.5941          | 0.2335 | 0.1066 | 0.1909 | 0.2196    |
-| 2.8584        | 20.0  | 625  | 2.5782          | 0.232  | 0.1056 | 0.1895 | 0.2178    |
-| 2.8517        | 20.99 | 656  | 2.5671          | 0.2327 | 0.1061 | 0.1901 | 0.2188    |
-| 2.8392        | 21.98 | 687  | 2.5562          | 0.2339 | 0.1067 | 0.1908 | 0.2198    |
-| 2.8478        | 22.98 | 718  | 2.5509          | 0.2339 | 0.1071 | 0.1909 | 0.2199    |
-| 2.8161        | 24.0  | 750  | 2.5469          | 0.2339 | 0.1071 | 0.1909 | 0.2199    |
-| 2.8385        | 24.8  | 775  | 2.5462          | 0.2339 | 0.1071 | 0.1909 | 0.2199    |
+| 10.7536       | 1.0   | 78   | 6.6776          | 0.203  | 0.0868 | 0.1627 | 0.1909    |
+| 5.0057        | 1.99  | 156  | 3.2391          | 0.2128 | 0.0909 | 0.1707 | 0.2003    |
+| 3.3921        | 2.99  | 234  | 2.9233          | 0.2263 | 0.102  | 0.1849 | 0.213     |
+| 3.1013        | 4.0   | 313  | 2.7724          | 0.2265 | 0.1043 | 0.1864 | 0.2128    |
+| 2.9643        | 5.0   | 391  | 2.5935          | 0.2305 | 0.1075 | 0.1893 | 0.2166    |
+| 2.7594        | 5.99  | 469  | 2.4411          | 0.2311 | 0.1075 | 0.1888 | 0.2171    |
+| 2.6579        | 6.99  | 547  | 2.3273          | 0.2327 | 0.1084 | 0.1908 | 0.2185    |
+| 2.5729        | 8.0   | 626  | 2.2452          | 0.2326 | 0.1083 | 0.1905 | 0.2185    |
+| 2.4879        | 9.0   | 704  | 2.1828          | 0.2313 | 0.1063 | 0.1893 | 0.2176    |
+| 2.401         | 9.99  | 782  | 2.1365          | 0.2336 | 0.1071 | 0.1907 | 0.2193    |
+| 2.346         | 10.99 | 860  | 2.0937          | 0.2332 | 0.1065 | 0.1905 | 0.2192    |
+| 2.3086        | 12.0  | 939  | 2.0606          | 0.2334 | 0.107  | 0.1905 | 0.2191    |
+| 2.2648        | 13.0  | 1017 | 2.0315          | 0.2351 | 0.1085 | 0.1925 | 0.2211    |
+| 2.2452        | 13.99 | 1095 | 2.0058          | 0.2354 | 0.1079 | 0.1922 | 0.221     |
+| 2.204         | 14.99 | 1173 | 1.9853          | 0.2364 | 0.1093 | 0.1932 | 0.2222    |
+| 2.1723        | 16.0  | 1252 | 1.9665          | 0.236  | 0.109  | 0.1931 | 0.2218    |
+| 2.1601        | 17.0  | 1330 | 1.9479          | 0.2356 | 0.109  | 0.1923 | 0.2212    |
+| 2.143         | 17.99 | 1408 | 1.9337          | 0.2356 | 0.1093 | 0.1926 | 0.2215    |
+| 2.093         | 18.99 | 1486 | 1.9201          | 0.2366 | 0.1101 | 0.193  | 0.2223    |
+| 2.0987        | 20.0  | 1565 | 1.9077          | 0.2371 | 0.111  | 0.1938 | 0.2228    |
+| 2.0663        | 21.0  | 1643 | 1.8956          | 0.2368 | 0.1104 | 0.1937 | 0.2219    |
+| 2.0629        | 21.99 | 1721 | 1.8858          | 0.2375 | 0.1109 | 0.1935 | 0.2221    |
+| 2.0449        | 22.99 | 1799 | 1.8765          | 0.2395 | 0.1128 | 0.1959 | 0.2244    |
+| 2.0342        | 24.0  | 1878 | 1.8684          | 0.2384 | 0.1115 | 0.1943 | 0.2233    |
+| 2.0021        | 25.0  | 1956 | 1.8620          | 0.2373 | 0.1101 | 0.1932 | 0.222     |
+| 2.0152        | 25.99 | 2034 | 1.8537          | 0.2387 | 0.1116 | 0.1949 | 0.2236    |
+| 2.0058        | 26.99 | 2112 | 1.8477          | 0.239  | 0.1118 | 0.195  | 0.224     |
+| 1.981         | 28.0  | 2191 | 1.8418          | 0.2377 | 0.1108 | 0.194  | 0.2227    |
+| 1.9493        | 29.0  | 2269 | 1.8358          | 0.2388 | 0.111  | 0.1947 | 0.2234    |
+| 1.9626        | 29.99 | 2347 | 1.8314          | 0.2385 | 0.1109 | 0.1945 | 0.223     |
+| 1.9735        | 30.99 | 2425 | 1.8279          | 0.239  | 0.1109 | 0.1944 | 0.2232    |
+| 1.9421        | 32.0  | 2504 | 1.8240          | 0.2393 | 0.1109 | 0.1946 | 0.2234    |
+| 1.9371        | 33.0  | 2582 | 1.8212          | 0.2396 | 0.1114 | 0.1951 | 0.2239    |
+| 1.9252        | 33.99 | 2660 | 1.8184          | 0.2392 | 0.1111 | 0.1947 | 0.2238    |
+| 1.9556        | 34.99 | 2738 | 1.8163          | 0.2392 | 0.1111 | 0.1946 | 0.2238    |
+| 1.9436        | 36.0  | 2817 | 1.8147          | 0.2394 | 0.111  | 0.1945 | 0.224     |
+| 1.9444        | 37.0  | 2895 | 1.8132          | 0.239  | 0.1113 | 0.1946 | 0.2239    |
+| 1.9368        | 37.99 | 2973 | 1.8125          | 0.239  | 0.1112 | 0.1947 | 0.2239    |
+| 1.9467        | 38.99 | 3051 | 1.8120          | 0.2389 | 0.1112 | 0.1946 | 0.2237    |
+| 1.9335        | 39.87 | 3120 | 1.8119          | 0.2389 | 0.1112 | 0.1946 | 0.2237    |
 
 
 ### Framework versions
@@ -103,4 +117,4 @@ The following hyperparameters were used during training:
 - Transformers 4.38.0.dev0
 - Pytorch 2.2.0
 - Datasets 2.16.1
-- Tokenizers 0.15.1
+- Tokenizers 0.15.1
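The Rouge1/Rouge2/RougeL columns above are F-measures of n-gram overlap between generated and reference summaries. As a rough illustration only (the card's values come from the `rouge` metric implementation used by the Trainer, which also applies its own tokenization and stemming, omitted here), a minimal ROUGE-1 F1 over whitespace unigrams looks like this:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1, the quantity the Rouge1 column reports.

    Simplified sketch: lowercase + whitespace split stands in for the
    real tokenizer/stemmer used by the rouge implementation.
    """
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Identical 5-of-6 token overlap in both directions -> F1 = 5/6
print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))
```

Scores in the table around 0.23-0.24 for Rouge1 are typical of small summarization models on cnn_dailymail.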
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:16c8d5884f26c88b62d267e1816de4c89bc02c674ac2ddd1b8838ddba37804b4
+oid sha256:36b2104c92ddbaefec4132c4a63efcfc0be7bf6ae077179c122a0d7e7ccbf9f1
 size 242041896
tokenizer.json CHANGED
@@ -2,11 +2,20 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 256,
+    "max_length": 128,
     "strategy": "LongestFirst",
     "stride": 0
   },
-  "padding": null,
+  "padding": {
+    "strategy": {
+      "Fixed": 128
+    },
+    "direction": "Right",
+    "pad_to_multiple_of": null,
+    "pad_id": 32100,
+    "pad_type_id": 0,
+    "pad_token": "[PAD]"
+  },
   "added_tokens": [
     {
       "id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2b459a461e6830add1f693dd13bbdc999e9b82ab455e769a124c087a69f33fa6
+oid sha256:e34a9ca41afee90b86d11ab433e55c78a00c2edc71eea29846ce215185e55cee
 size 4856
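The model.safetensors and training_args.bin diffs above are Git LFS pointer files: three lines (spec version, `oid sha256:<hex>`, byte size) that stand in for the large blob, so the diff only shows the hash and size changing. A small sketch (helper names are illustrative, not part of any LFS tooling) of parsing such a pointer and checking a downloaded blob against it:

```python
import hashlib
import os

def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer (version / oid / size lines) into fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

def blob_matches(pointer: dict, path: str) -> bool:
    """Verify a local blob: cheap size check first, then sha256 digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]

# The new model.safetensors pointer from this commit:
ptr = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:36b2104c92ddbaefec4132c4a63efcfc0be7bf6ae077179c122a0d7e7ccbf9f1\n"
    "size 242041896\n")
print(ptr["algo"], ptr["size"])  # sha256 242041896
```

Note the weights' size is unchanged (242041896 bytes) while the oid differs, which is exactly what retraining the same architecture produces.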