RMWeerasinghe committed on
Commit
933dad0
1 Parent(s): 0c8ea99

Training complete

Files changed (2)
  1. README.md +107 -0
  2. adapter_model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,107 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - Summarization
+ - generated_from_trainer
+ datasets:
+ - cnn_dailymail
+ metrics:
+ - rouge
+ base_model: google/flan-t5-base
+ model-index:
+ - name: flan-t5-base-prompt_tuning-cnn-dailymail
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # flan-t5-base-prompt_tuning-cnn-dailymail
+
+ This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the cnn_dailymail dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 19.3074
+ - Rouge1: 0.0787
+ - Rouge2: 0.0088
+ - Rougel: 0.0609
+ - Rougelsum: 0.0733
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.03
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 40
+
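Reviewer note: the hyperparameters above, together with the 188 steps logged for epoch 1.0 in the results table, pin down the total step count and the rough shape of the learning-rate curve. The sketch below works through that arithmetic. It assumes zero warmup steps, a single device, no gradient accumulation, and a kept final partial batch; the card states none of these, so treat the function names and defaults as illustrative rather than as the settings actually used.

```python
# Values copied from the hyperparameter list and the results table.
LEARNING_RATE = 0.03
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 188  # step count logged at epoch 1.0

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 188 * 40


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


def train_set_size_bounds() -> tuple:
    """Training-set sizes consistent with 188 steps per epoch at batch size 8.

    Assumes one device, no gradient accumulation, and that the last
    partial batch is kept, so steps = ceil(examples / batch_size).
    """
    smallest = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
    largest = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
    return smallest, largest


if __name__ == "__main__":
    print("total steps:", TOTAL_STEPS)
    print("lr halfway through training:", linear_lr(TOTAL_STEPS // 2))
    print("plausible training-set sizes:", train_set_size_bounds())
```

Under those assumptions the run used roughly 1,500 training examples per epoch, which would be a small subset of cnn_dailymail; the card itself does not say how the subset was chosen.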
55
+ ### Training results
56
+
57
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
58
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
59
+ | 20.9307 | 1.0 | 188 | 19.4344 | 0.1471 | 0.0433 | 0.1103 | 0.1337 |
60
+ | 20.4274 | 2.0 | 376 | 20.1245 | 0.1199 | 0.0299 | 0.0953 | 0.1135 |
61
+ | 20.1641 | 3.0 | 564 | 19.5964 | 0.1178 | 0.024 | 0.0909 | 0.1072 |
62
+ | 20.5294 | 4.0 | 752 | 19.2955 | 0.1164 | 0.0213 | 0.0882 | 0.1055 |
63
+ | 20.6452 | 5.0 | 940 | 19.4288 | 0.1179 | 0.0239 | 0.0895 | 0.1072 |
64
+ | 20.6916 | 6.0 | 1128 | 19.1208 | 0.0997 | 0.0186 | 0.0795 | 0.093 |
65
+ | 20.8065 | 7.0 | 1316 | 18.9300 | 0.0865 | 0.0116 | 0.0688 | 0.08 |
66
+ | 20.1431 | 8.0 | 1504 | 19.7751 | 0.1118 | 0.0247 | 0.0869 | 0.1023 |
67
+ | 20.5281 | 9.0 | 1692 | 20.0590 | 0.1216 | 0.0278 | 0.0923 | 0.1118 |
68
+ | 20.1805 | 10.0 | 1880 | 19.3949 | 0.1025 | 0.0145 | 0.0818 | 0.0948 |
69
+ | 20.4289 | 11.0 | 2068 | 19.1645 | 0.0844 | 0.0086 | 0.0656 | 0.0753 |
70
+ | 20.1469 | 12.0 | 2256 | 19.4850 | 0.0905 | 0.0062 | 0.0697 | 0.0831 |
71
+ | 20.9285 | 13.0 | 2444 | 19.3351 | 0.0853 | 0.0077 | 0.067 | 0.0785 |
72
+ | 20.1419 | 14.0 | 2632 | 19.1241 | 0.0886 | 0.0097 | 0.0684 | 0.0822 |
73
+ | 20.5547 | 15.0 | 2820 | 19.1532 | 0.0897 | 0.0077 | 0.0704 | 0.0804 |
74
+ | 19.5719 | 16.0 | 3008 | 19.2346 | 0.0885 | 0.0107 | 0.0659 | 0.0794 |
75
+ | 20.3043 | 17.0 | 3196 | 19.3873 | 0.105 | 0.0188 | 0.0829 | 0.0949 |
76
+ | 20.5935 | 18.0 | 3384 | 19.3345 | 0.1132 | 0.0203 | 0.0874 | 0.1025 |
77
+ | 20.413 | 19.0 | 3572 | 18.8964 | 0.0751 | 0.0065 | 0.0593 | 0.0686 |
78
+ | 19.9286 | 20.0 | 3760 | 18.8474 | 0.0813 | 0.0082 | 0.0648 | 0.0725 |
79
+ | 19.9246 | 21.0 | 3948 | 19.3425 | 0.0844 | 0.0096 | 0.0694 | 0.0765 |
80
+ | 20.4844 | 22.0 | 4136 | 19.4680 | 0.1012 | 0.0143 | 0.0782 | 0.0923 |
81
+ | 20.1571 | 23.0 | 4324 | 19.5483 | 0.0808 | 0.0093 | 0.0665 | 0.0762 |
82
+ | 20.0099 | 24.0 | 4512 | 18.5052 | 0.056 | 0.0029 | 0.0479 | 0.0516 |
83
+ | 19.6279 | 25.0 | 4700 | 18.7629 | 0.0735 | 0.0082 | 0.0603 | 0.0649 |
84
+ | 19.303 | 26.0 | 4888 | 19.3608 | 0.1015 | 0.0124 | 0.0766 | 0.0885 |
85
+ | 20.8774 | 27.0 | 5076 | 19.3038 | 0.1008 | 0.013 | 0.0807 | 0.0932 |
86
+ | 20.1431 | 28.0 | 5264 | 19.3426 | 0.0991 | 0.0156 | 0.078 | 0.0918 |
87
+ | 20.4304 | 29.0 | 5452 | 19.3918 | 0.0905 | 0.0102 | 0.0734 | 0.0812 |
88
+ | 19.6689 | 30.0 | 5640 | 19.3527 | 0.088 | 0.0105 | 0.0669 | 0.0785 |
89
+ | 20.661 | 31.0 | 5828 | 19.4042 | 0.0996 | 0.0149 | 0.0767 | 0.0887 |
90
+ | 20.2962 | 32.0 | 6016 | 19.3871 | 0.0758 | 0.0101 | 0.0617 | 0.0702 |
91
+ | 20.5865 | 33.0 | 6204 | 19.3255 | 0.0786 | 0.0106 | 0.064 | 0.0733 |
92
+ | 21.4763 | 34.0 | 6392 | 19.3113 | 0.0755 | 0.0087 | 0.0623 | 0.0688 |
93
+ | 21.3826 | 35.0 | 6580 | 19.3089 | 0.075 | 0.0076 | 0.0609 | 0.0689 |
94
+ | 20.8869 | 36.0 | 6768 | 19.3614 | 0.0906 | 0.0143 | 0.0692 | 0.0812 |
95
+ | 20.527 | 37.0 | 6956 | 19.3784 | 0.0874 | 0.0099 | 0.0686 | 0.0797 |
96
+ | 19.5026 | 38.0 | 7144 | 19.4145 | 0.0888 | 0.0111 | 0.068 | 0.0823 |
97
+ | 19.3852 | 39.0 | 7332 | 19.3794 | 0.0815 | 0.0093 | 0.0616 | 0.0742 |
98
+ | 20.5347 | 40.0 | 7520 | 19.3074 | 0.0787 | 0.0088 | 0.0609 | 0.0733 |
99
+
100
+
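Reviewer note: reading the table, Rouge1 peaks at epoch 1 (0.1471) and ends at 0.0787, while the lowest validation loss (18.5052, epoch 24) coincides with the lowest Rouge1 (0.056). A quick way to check this kind of checkpoint selection is sketched below. The `ROWS` list holds only a handful of rows copied from the table, and the helper name `best_by` is hypothetical, not part of the training code.

```python
# (epoch, validation_loss, rouge1) for a few rows copied from the results table.
ROWS = [
    (1, 19.4344, 0.1471),
    (7, 18.9300, 0.0865),
    (20, 18.8474, 0.0813),
    (24, 18.5052, 0.0560),
    (40, 19.3074, 0.0787),
]


def best_by(rows, column, largest):
    """Return the row with the largest (or smallest) value in `column`."""
    pick = max if largest else min
    return pick(rows, key=lambda row: row[column])


best_rouge = best_by(ROWS, column=2, largest=True)    # highest Rouge1
lowest_loss = best_by(ROWS, column=1, largest=False)  # lowest validation loss

if __name__ == "__main__":
    print("best Rouge1 at epoch", best_rouge[0], "with", best_rouge[2])
    print("lowest validation loss at epoch", lowest_loss[0], "with", lowest_loss[1])
    print("final-epoch Rouge1 minus best Rouge1:", round(ROWS[-1][2] - best_rouge[2], 4))
```

Because the two criteria pick different epochs here, it may be worth stating in the card which metric, if any, was used to choose the uploaded adapter weights.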
+ ### Framework versions
+
+ - PEFT 0.8.2
+ - Transformers 4.37.0
+ - Pytorch 2.1.2
+ - Datasets 2.1.0
+ - Tokenizers 0.15.1
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6c01392b3366979b0489325f9c37b8b11fd9e68cc387c69568ae3daa6a867a7d
+ oid sha256:65f037f4222d27de723890d6247d069fdb2d758eda50df29c2a3f7747c95d835
  size 61560
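Reviewer note: the `adapter_model.safetensors` change is a Git LFS pointer update, so only the `oid` line differs while the size stays at 61560 bytes. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` is hypothetical and the parser handles only the three keys that appear in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents after this commit, copied from the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:65f037f4222d27de723890d6247d069fdb2d758eda50df29c2a3f7747c95d835
size 61560
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(NEW_POINTER)
    print(info["algorithm"], info["digest"][:12], info["size"])
```

An unchanged size with a new digest is what one would expect when the adapter tensors keep their shapes and only their values change.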