farleyknight committed
Commit
3625b87
1 Parent(s): a21b010

update model card README.md

Files changed (1)
  1. README.md +137 -0
README.md ADDED
@@ -0,0 +1,137 @@
+ ---
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ datasets:
+ - arxiv-summarization
+ metrics:
+ - rouge
+ model-index:
+ - name: arxiv-summarization-t5-base-2022-09-21
+   results:
+   - task:
+       name: Sequence-to-sequence Language Modeling
+       type: text2text-generation
+     dataset:
+       name: arxiv-summarization
+       type: arxiv-summarization
+       config: section
+       split: train
+       args: section
+     metrics:
+     - name: Rouge1
+       type: rouge
+       value: 19.2884
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # arxiv-summarization-t5-base-2022-09-21
+
+ This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on the arxiv-summarization dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.8655
+ - Rouge1: 19.2884
+ - Rouge2: 7.8087
+ - Rougel: 15.4025
+ - Rougelsum: 17.5856
+ - Gen Len: 19.0
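
Editorial note: the card does not say how the ROUGE scores were computed. As a rough illustration only, the sketch below computes a simplified ROUGE-1 F1 (clipped unigram overlap) on lowercased, whitespace-split text. The function name `rouge1_f1` and the example sentences are hypothetical, and real ROUGE implementations typically add tokenization rules and optional stemming, so this will not reproduce the numbers above exactly. The card's values appear to be on a 0 to 100 scale, which is why the sketch multiplies by 100 when printing.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased, whitespace-split text."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection clips each token's count to the smaller of the two sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the dataset.
score = rouge1_f1(
    "the model summarizes papers",
    "the model summarizes arxiv papers well",
)
print(f"ROUGE-1 F1 on a 0-100 scale: {100 * score:.2f}")
```

A short prediction that only reuses reference words gets perfect precision but reduced recall, which is one way a generation length near 19 tokens can hold ROUGE scores down when reference abstracts are much longer.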
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3.0
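
Editorial note: to make `lr_scheduler_type: linear` concrete, the sketch below computes the learning rate at a given step under a linear decay to zero, starting from the listed `learning_rate` of 5e-05. Several things here are assumptions rather than facts from the card: the helper name `linear_lr` is hypothetical, the card lists no warmup setting so `warmup_steps` defaults to 0, and the total step count is only estimated from the last row of the training results table (step 600000 at epoch 2.96, where the epoch value is rounded).

```python
BASE_LR = 5e-05   # learning_rate from the card
NUM_EPOCHS = 3.0  # num_epochs from the card

# ASSUMPTION: rough estimate of total optimizer steps, derived from the last
# logged row (step 600000 at epoch 2.96). The epoch value is rounded, so this
# is approximate and not a figure stated in the card.
EST_TOTAL_STEPS = round(NUM_EPOCHS * 600_000 / 2.96)


def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Learning rate with optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr (at the end of warmup) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 200_000, 400_000, 600_000):
    print(step, f"{linear_lr(step, EST_TOTAL_STEPS):.2e}")
```

With `train_batch_size: 1`, each step processes a single example, so the step count per epoch would roughly equal the number of training examples if no gradient accumulation was used (the card does not list one).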
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
+ |:-------------:|:-----:|:------:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
+ | 2.3291 | 0.05 | 10000 | 2.1906 | 18.6571 | 7.1341 | 14.8347 | 16.9545 | 19.0 |
+ | 2.2454 | 0.1 | 20000 | 2.1549 | 18.5037 | 7.1908 | 14.7141 | 16.8233 | 18.9997 |
+ | 2.2107 | 0.15 | 30000 | 2.1013 | 18.7638 | 7.326 | 14.9437 | 17.072 | 19.0 |
+ | 2.1486 | 0.2 | 40000 | 2.0845 | 18.6879 | 7.2441 | 14.8835 | 16.983 | 19.0 |
+ | 2.158 | 0.25 | 50000 | 2.0699 | 18.8314 | 7.3712 | 15.0166 | 17.1215 | 19.0 |
+ | 2.1476 | 0.3 | 60000 | 2.0424 | 18.9783 | 7.4138 | 15.1121 | 17.2778 | 18.9981 |
+ | 2.1164 | 0.34 | 70000 | 2.0349 | 18.9257 | 7.4649 | 15.0335 | 17.1819 | 19.0 |
+ | 2.079 | 0.39 | 80000 | 2.0208 | 18.643 | 7.4096 | 14.8927 | 16.9786 | 18.9994 |
+ | 2.101 | 0.44 | 90000 | 2.0113 | 19.3881 | 7.7012 | 15.3981 | 17.6516 | 19.0 |
+ | 2.0576 | 0.49 | 100000 | 2.0022 | 18.9985 | 7.542 | 15.1157 | 17.2972 | 18.9992 |
+ | 2.0983 | 0.54 | 110000 | 1.9941 | 18.7691 | 7.4625 | 15.0256 | 17.1146 | 19.0 |
+ | 2.053 | 0.59 | 120000 | 1.9855 | 19.002 | 7.5602 | 15.1497 | 17.2963 | 19.0 |
+ | 2.0434 | 0.64 | 130000 | 1.9786 | 19.2385 | 7.6533 | 15.3094 | 17.5439 | 18.9994 |
+ | 2.0354 | 0.69 | 140000 | 1.9746 | 19.184 | 7.7307 | 15.2897 | 17.491 | 18.9992 |
+ | 2.0347 | 0.74 | 150000 | 1.9639 | 19.2408 | 7.693 | 15.3357 | 17.5297 | 19.0 |
+ | 2.0236 | 0.79 | 160000 | 1.9590 | 19.0781 | 7.6256 | 15.1932 | 17.3486 | 18.9998 |
+ | 2.0187 | 0.84 | 170000 | 1.9532 | 19.0343 | 7.6792 | 15.1884 | 17.3519 | 19.0 |
+ | 1.9939 | 0.89 | 180000 | 1.9485 | 18.8247 | 7.5005 | 15.0246 | 17.1485 | 18.9998 |
+ | 1.9961 | 0.94 | 190000 | 1.9504 | 19.0695 | 7.6559 | 15.2139 | 17.3814 | 19.0 |
+ | 2.0197 | 0.99 | 200000 | 1.9399 | 19.2821 | 7.6685 | 15.3029 | 17.5374 | 18.9988 |
+ | 1.9457 | 1.03 | 210000 | 1.9350 | 19.053 | 7.6502 | 15.2123 | 17.3793 | 19.0 |
+ | 1.9552 | 1.08 | 220000 | 1.9317 | 19.1878 | 7.7235 | 15.3272 | 17.5252 | 18.9998 |
+ | 1.9772 | 1.13 | 230000 | 1.9305 | 19.0855 | 7.6303 | 15.1943 | 17.3942 | 18.9997 |
+ | 1.9171 | 1.18 | 240000 | 1.9291 | 19.0711 | 7.6437 | 15.2175 | 17.3893 | 18.9995 |
+ | 1.9393 | 1.23 | 250000 | 1.9230 | 19.276 | 7.725 | 15.3826 | 17.586 | 18.9995 |
+ | 1.9295 | 1.28 | 260000 | 1.9197 | 19.2999 | 7.7958 | 15.3961 | 17.6056 | 18.9975 |
+ | 1.9725 | 1.33 | 270000 | 1.9173 | 19.2958 | 7.7121 | 15.3659 | 17.584 | 19.0 |
+ | 1.9668 | 1.38 | 280000 | 1.9129 | 19.089 | 7.6846 | 15.2395 | 17.3879 | 18.9998 |
+ | 1.941 | 1.43 | 290000 | 1.9132 | 19.2127 | 7.7336 | 15.311 | 17.4742 | 18.9995 |
+ | 1.9427 | 1.48 | 300000 | 1.9108 | 19.217 | 7.7591 | 15.334 | 17.53 | 18.9998 |
+ | 1.9521 | 1.53 | 310000 | 1.9041 | 19.1285 | 7.6736 | 15.2625 | 17.458 | 19.0 |
+ | 1.9352 | 1.58 | 320000 | 1.9041 | 19.1656 | 7.723 | 15.3035 | 17.4818 | 18.9991 |
+ | 1.9342 | 1.63 | 330000 | 1.9004 | 19.2573 | 7.7766 | 15.3558 | 17.5382 | 19.0 |
+ | 1.9631 | 1.68 | 340000 | 1.8978 | 19.236 | 7.7584 | 15.3408 | 17.4993 | 18.9998 |
+ | 1.8987 | 1.72 | 350000 | 1.8968 | 19.1716 | 7.7231 | 15.2836 | 17.4655 | 18.9997 |
+ | 1.9433 | 1.77 | 360000 | 1.8924 | 19.2644 | 7.8294 | 15.4018 | 17.5808 | 18.9998 |
+ | 1.9159 | 1.82 | 370000 | 1.8912 | 19.1833 | 7.8267 | 15.3175 | 17.4918 | 18.9995 |
+ | 1.9516 | 1.87 | 380000 | 1.8856 | 19.3077 | 7.7432 | 15.3723 | 17.6115 | 19.0 |
+ | 1.9218 | 1.92 | 390000 | 1.8880 | 19.2668 | 7.8231 | 15.3834 | 17.5701 | 18.9994 |
+ | 1.9159 | 1.97 | 400000 | 1.8860 | 19.2224 | 7.7903 | 15.3488 | 17.4992 | 18.9997 |
+ | 1.8741 | 2.02 | 410000 | 1.8854 | 19.2572 | 7.741 | 15.3405 | 17.5351 | 19.0 |
+ | 1.8668 | 2.07 | 420000 | 1.8854 | 19.3658 | 7.8593 | 15.4418 | 17.656 | 18.9995 |
+ | 1.8638 | 2.12 | 430000 | 1.8831 | 19.305 | 7.8218 | 15.3843 | 17.5861 | 18.9997 |
+ | 1.8334 | 2.17 | 440000 | 1.8817 | 19.3269 | 7.8249 | 15.4231 | 17.5958 | 18.9994 |
+ | 1.8893 | 2.22 | 450000 | 1.8803 | 19.2949 | 7.7885 | 15.3947 | 17.585 | 18.9997 |
+ | 1.8929 | 2.27 | 460000 | 1.8783 | 19.291 | 7.8346 | 15.428 | 17.5797 | 18.9997 |
+ | 1.861 | 2.32 | 470000 | 1.8766 | 19.4284 | 7.8832 | 15.4746 | 17.6946 | 18.9997 |
+ | 1.8719 | 2.37 | 480000 | 1.8751 | 19.1525 | 7.7641 | 15.3348 | 17.47 | 18.9998 |
+ | 1.8889 | 2.41 | 490000 | 1.8742 | 19.1743 | 7.768 | 15.3292 | 17.4665 | 18.9998 |
+ | 1.8834 | 2.46 | 500000 | 1.8723 | 19.3069 | 7.7935 | 15.3987 | 17.5913 | 18.9998 |
+ | 1.8564 | 2.51 | 510000 | 1.8695 | 19.3217 | 7.8292 | 15.4063 | 17.6081 | 19.0 |
+ | 1.8706 | 2.56 | 520000 | 1.8697 | 19.294 | 7.8217 | 15.3964 | 17.581 | 18.9998 |
+ | 1.883 | 2.61 | 530000 | 1.8703 | 19.2784 | 7.8634 | 15.404 | 17.5942 | 18.9995 |
+ | 1.8622 | 2.66 | 540000 | 1.8677 | 19.3165 | 7.8378 | 15.4259 | 17.6064 | 18.9988 |
+ | 1.8781 | 2.71 | 550000 | 1.8676 | 19.3237 | 7.7954 | 15.3995 | 17.6008 | 19.0 |
+ | 1.8793 | 2.76 | 560000 | 1.8685 | 19.2141 | 7.7605 | 15.3345 | 17.5268 | 18.9997 |
+ | 1.8795 | 2.81 | 570000 | 1.8675 | 19.2694 | 7.8082 | 15.3996 | 17.5831 | 19.0 |
+ | 1.8425 | 2.86 | 580000 | 1.8659 | 19.2886 | 7.7987 | 15.4005 | 17.5859 | 18.9997 |
+ | 1.8605 | 2.91 | 590000 | 1.8650 | 19.2778 | 7.7934 | 15.3931 | 17.5809 | 18.9997 |
+ | 1.8448 | 2.96 | 600000 | 1.8655 | 19.2884 | 7.8087 | 15.4025 | 17.5856 | 19.0 |
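
Editorial note: the table is easy to scan programmatically. The sketch below parses three rows copied verbatim from the table above and picks the checkpoint with the lowest validation loss and the one with the highest Rouge1. The column keys in `COLUMNS` and the helper `parse_rows` are hypothetical names chosen for illustration; to analyse the whole run, paste the remaining rows into `ROWS`.

```python
# Three rows copied from the training results table; add the rest as needed.
ROWS = """
| 1.861 | 2.32 | 470000 | 1.8766 | 19.4284 | 7.8832 | 15.4746 | 17.6946 | 18.9997 |
| 1.8605 | 2.91 | 590000 | 1.8650 | 19.2778 | 7.7934 | 15.3931 | 17.5809 | 18.9997 |
| 1.8448 | 2.96 | 600000 | 1.8655 | 19.2884 | 7.8087 | 15.4025 | 17.5856 | 19.0 |
"""

# Hypothetical short keys for the nine table columns, in order.
COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougel", "rougelsum", "gen_len"]


def parse_rows(text: str) -> list[dict[str, float]]:
    """Turn pipe-delimited table rows into dictionaries keyed by COLUMNS."""
    records = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        records.append(dict(zip(COLUMNS, map(float, cells))))
    return records


records = parse_rows(ROWS)
best_loss = min(records, key=lambda r: r["val_loss"])
best_rouge1 = max(records, key=lambda r: r["rouge1"])
print("lowest validation loss at step", int(best_loss["step"]))
print("highest Rouge1 at step", int(best_rouge1["step"]))
```

In the full table these two rows also hold the lowest validation loss (1.8650) and the highest Rouge1 (19.4284), so the final logged checkpoint at step 600000 is close to, but not exactly, the best on either measure.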
+
+
+ ### Framework versions
+
+ - Transformers 4.23.0.dev0
+ - Pytorch 1.12.0
+ - Datasets 2.5.1
+ - Tokenizers 0.13.0