RMWeerasinghe committed
Commit a50b006
1 Parent(s): 6a14270

Training complete

Files changed (3):
  1. README.md +105 -0
  2. generation_config.json +6 -0
  3. model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ license: apache-2.0
+ base_model: Falconsai/text_summarization
+ tags:
+ - summarization
+ - generated_from_trainer
+ datasets:
+ - cnn_dailymail
+ metrics:
+ - rouge
+ model-index:
+ - name: text_summarization-finetuned
+   results:
+   - task:
+       name: Sequence-to-sequence Language Modeling
+       type: text2text-generation
+     dataset:
+       name: cnn_dailymail
+       type: cnn_dailymail
+       config: 1.0.0
+       split: validation
+       args: 1.0.0
+     metrics:
+     - name: Rouge1
+       type: rouge
+       value: 0.2339
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # text_summarization-finetuned
+
+ This model is a fine-tuned version of [Falconsai/text_summarization](https://huggingface.co/Falconsai/text_summarization) on the cnn_dailymail dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.5462
+ - Rouge1: 0.2339
+ - Rouge2: 0.1071
+ - Rougel: 0.1909
+ - Rougelsum: 0.2199
+
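The ROUGE scores above are F-measures in [0, 1]. As a reminder of what they measure, here is a minimal, simplified ROUGE-1 F-score (whitespace tokenization, no stemming); the card's own numbers come from the trainer's ROUGE metric, which additionally applies stemming, so this sketch only illustrates the idea:

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure: clipped unigram overlap,
    whitespace tokens, no stemming."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(ref[w], cand[w]) for w in cand)  # clipped matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# 5 of 6 unigrams overlap -> precision = recall = 5/6
print(round(rouge1_f("the cat sat on the mat", "the cat lay on the mat"), 4))
```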
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 25
+
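The effective batch size in the list above follows from the per-device batch size and gradient accumulation. A quick arithmetic check (the training-set size at the end is an inference from the step counts in the results table below, not something the card states):

```python
# Values from the hyperparameter list above.
train_batch_size = 8
gradient_accumulation_steps = 4
num_epochs = 25

# One optimizer step consumes train_batch_size * gradient_accumulation_steps
# samples, which is the card's total_train_batch_size.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32

# The results table ends at global step 775 after 25 epochs, i.e. 31 optimizer
# steps per epoch, implying roughly 31 * 32 ~ 1000 training examples
# (assumption: inferred from the table, not stated in the card).
steps_per_epoch = 775 / num_epochs
print(steps_per_epoch)  # 31.0
```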
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
+ | 14.8371 | 0.99 | 31 | 10.4178 | 0.2031 | 0.0864 | 0.1631 | 0.1907 |
+ | 11.0708 | 1.98 | 62 | 8.1794 | 0.2049 | 0.0873 | 0.1642 | 0.1909 |
+ | 9.5037 | 2.98 | 93 | 5.3342 | 0.1989 | 0.0804 | 0.1559 | 0.1845 |
+ | 6.2278 | 4.0 | 125 | 4.4009 | 0.201 | 0.0855 | 0.1571 | 0.1882 |
+ | 5.152 | 4.99 | 156 | 3.4913 | 0.2094 | 0.0883 | 0.1668 | 0.1959 |
+ | 3.9293 | 5.98 | 187 | 3.0893 | 0.2221 | 0.0957 | 0.1785 | 0.2083 |
+ | 3.6608 | 6.98 | 218 | 2.9988 | 0.2174 | 0.0948 | 0.1775 | 0.2045 |
+ | 3.3943 | 8.0 | 250 | 2.9427 | 0.2195 | 0.0959 | 0.179 | 0.2064 |
+ | 3.2549 | 8.99 | 281 | 2.9013 | 0.2255 | 0.1002 | 0.1832 | 0.2124 |
+ | 3.2028 | 9.98 | 312 | 2.8655 | 0.2298 | 0.1053 | 0.1865 | 0.2165 |
+ | 3.1611 | 10.98 | 343 | 2.8306 | 0.2302 | 0.1069 | 0.1878 | 0.218 |
+ | 3.1206 | 12.0 | 375 | 2.7931 | 0.2265 | 0.1044 | 0.1847 | 0.2142 |
+ | 3.0716 | 12.99 | 406 | 2.7572 | 0.2301 | 0.1077 | 0.1883 | 0.2173 |
+ | 3.0376 | 13.98 | 437 | 2.7239 | 0.231 | 0.1057 | 0.1883 | 0.2177 |
+ | 3.0154 | 14.98 | 468 | 2.6894 | 0.2319 | 0.1062 | 0.1891 | 0.2177 |
+ | 2.9518 | 16.0 | 500 | 2.6593 | 0.233 | 0.1071 | 0.1904 | 0.2192 |
+ | 2.9359 | 16.99 | 531 | 2.6332 | 0.2338 | 0.108 | 0.1919 | 0.2208 |
+ | 2.8874 | 17.98 | 562 | 2.6124 | 0.2322 | 0.1057 | 0.1896 | 0.2181 |
+ | 2.8786 | 18.98 | 593 | 2.5941 | 0.2335 | 0.1066 | 0.1909 | 0.2196 |
+ | 2.8584 | 20.0 | 625 | 2.5782 | 0.232 | 0.1056 | 0.1895 | 0.2178 |
+ | 2.8517 | 20.99 | 656 | 2.5671 | 0.2327 | 0.1061 | 0.1901 | 0.2188 |
+ | 2.8392 | 21.98 | 687 | 2.5562 | 0.2339 | 0.1067 | 0.1908 | 0.2198 |
+ | 2.8478 | 22.98 | 718 | 2.5509 | 0.2339 | 0.1071 | 0.1909 | 0.2199 |
+ | 2.8161 | 24.0 | 750 | 2.5469 | 0.2339 | 0.1071 | 0.1909 | 0.2199 |
+ | 2.8385 | 24.8 | 775 | 2.5462 | 0.2339 | 0.1071 | 0.1909 | 0.2199 |
+
+
+ ### Framework versions
+
+ - Transformers 4.38.0.dev0
+ - Pytorch 2.2.0
+ - Datasets 2.16.1
+ - Tokenizers 0.15.1
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "decoder_start_token_id": 0,
+   "eos_token_id": 1,
+   "pad_token_id": 0,
+   "transformers_version": "4.38.0.dev0"
+ }
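The three token ids above drive `generate` for this T5-family model: decoding is seeded with `decoder_start_token_id` (0, the same id as the pad token), stops when `eos_token_id` (1) is emitted, and finished sequences in a batch are filled with `pad_token_id`. A toy greedy loop with a stub model, purely to illustrate those roles (the stub and its token ids are hypothetical; the real decoding is done by `transformers`):

```python
# Ids from generation_config.json above.
DECODER_START_TOKEN_ID = 0  # same id as pad, as in T5-family models
EOS_TOKEN_ID = 1

def toy_generate(step_fn, max_new_tokens: int = 10) -> list[int]:
    """Greedy decode with a stub `step_fn(prefix) -> next_token_id`."""
    tokens = [DECODER_START_TOKEN_ID]  # decoder is seeded with the start id
    for _ in range(max_new_tokens):
        nxt = step_fn(tokens)
        tokens.append(nxt)
        if nxt == EOS_TOKEN_ID:  # stop once the model emits EOS
            break
    return tokens

# Stub "model" that emits tokens 5, 6, then EOS (hypothetical values).
script = iter([5, 6, EOS_TOKEN_ID])
print(toy_generate(lambda prefix: next(script)))  # [0, 5, 6, 1]
```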
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d720e0618e7738df1f53c971844bf6c4be4ab01006bc067de8d3dd297a9132bd
+ oid sha256:16c8d5884f26c88b62d267e1816de4c89bc02c674ac2ddd1b8838ddba37804b4
  size 242041896