RMWeerasinghe/flan-t5-base-prompt_tuning-cnn-dailymail

PEFT

Safetensors

Summarization

Generated from Trainer


RMWeerasinghe committed on Feb 22

Commit 933dad0 (1 parent: 0c8ea99)

Training complete

Files changed (2)

README.md +107 -0
adapter_model.safetensors +1 -1

README.md ADDED

@@ -0,0 +1,107 @@

+---
+license: apache-2.0
+library_name: peft
+tags:
+- Summarization
+- generated_from_trainer
+datasets:
+- cnn_dailymail
+metrics:
+- rouge
+base_model: google/flan-t5-base
+model-index:
+- name: flan-t5-base-prompt_tuning-cnn-dailymail
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# flan-t5-base-prompt_tuning-cnn-dailymail
+This model is a prompt-tuning adapter (trained with PEFT) for [google/flan-t5-base](https://huggingface.co/google/flan-t5-base), trained on the cnn_dailymail dataset.
+It achieves the following results on the evaluation set:
+- Loss: 19.3074
+- Rouge1: 0.0787
+- Rouge2: 0.0088
+- Rougel: 0.0609
+- Rougelsum: 0.0733
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.03
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 40
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
+| 20.9307       | 1.0   | 188  | 19.4344         | 0.1471 | 0.0433 | 0.1103 | 0.1337    |
+| 20.4274       | 2.0   | 376  | 20.1245         | 0.1199 | 0.0299 | 0.0953 | 0.1135    |
+| 20.1641       | 3.0   | 564  | 19.5964         | 0.1178 | 0.024  | 0.0909 | 0.1072    |
+| 20.5294       | 4.0   | 752  | 19.2955         | 0.1164 | 0.0213 | 0.0882 | 0.1055    |
+| 20.6452       | 5.0   | 940  | 19.4288         | 0.1179 | 0.0239 | 0.0895 | 0.1072    |
+| 20.6916       | 6.0   | 1128 | 19.1208         | 0.0997 | 0.0186 | 0.0795 | 0.093     |
+| 20.8065       | 7.0   | 1316 | 18.9300         | 0.0865 | 0.0116 | 0.0688 | 0.08      |
+| 20.1431       | 8.0   | 1504 | 19.7751         | 0.1118 | 0.0247 | 0.0869 | 0.1023    |
+| 20.5281       | 9.0   | 1692 | 20.0590         | 0.1216 | 0.0278 | 0.0923 | 0.1118    |
+| 20.1805       | 10.0  | 1880 | 19.3949         | 0.1025 | 0.0145 | 0.0818 | 0.0948    |
+| 20.4289       | 11.0  | 2068 | 19.1645         | 0.0844 | 0.0086 | 0.0656 | 0.0753    |
+| 20.1469       | 12.0  | 2256 | 19.4850         | 0.0905 | 0.0062 | 0.0697 | 0.0831    |
+| 20.9285       | 13.0  | 2444 | 19.3351         | 0.0853 | 0.0077 | 0.067  | 0.0785    |
+| 20.1419       | 14.0  | 2632 | 19.1241         | 0.0886 | 0.0097 | 0.0684 | 0.0822    |
+| 20.5547       | 15.0  | 2820 | 19.1532         | 0.0897 | 0.0077 | 0.0704 | 0.0804    |
+| 19.5719       | 16.0  | 3008 | 19.2346         | 0.0885 | 0.0107 | 0.0659 | 0.0794    |
+| 20.3043       | 17.0  | 3196 | 19.3873         | 0.105  | 0.0188 | 0.0829 | 0.0949    |
+| 20.5935       | 18.0  | 3384 | 19.3345         | 0.1132 | 0.0203 | 0.0874 | 0.1025    |
+| 20.413        | 19.0  | 3572 | 18.8964         | 0.0751 | 0.0065 | 0.0593 | 0.0686    |
+| 19.9286       | 20.0  | 3760 | 18.8474         | 0.0813 | 0.0082 | 0.0648 | 0.0725    |
+| 19.9246       | 21.0  | 3948 | 19.3425         | 0.0844 | 0.0096 | 0.0694 | 0.0765    |
+| 20.4844       | 22.0  | 4136 | 19.4680         | 0.1012 | 0.0143 | 0.0782 | 0.0923    |
+| 20.1571       | 23.0  | 4324 | 19.5483         | 0.0808 | 0.0093 | 0.0665 | 0.0762    |
+| 20.0099       | 24.0  | 4512 | 18.5052         | 0.056  | 0.0029 | 0.0479 | 0.0516    |
+| 19.6279       | 25.0  | 4700 | 18.7629         | 0.0735 | 0.0082 | 0.0603 | 0.0649    |
+| 19.303        | 26.0  | 4888 | 19.3608         | 0.1015 | 0.0124 | 0.0766 | 0.0885    |
+| 20.8774       | 27.0  | 5076 | 19.3038         | 0.1008 | 0.013  | 0.0807 | 0.0932    |
+| 20.1431       | 28.0  | 5264 | 19.3426         | 0.0991 | 0.0156 | 0.078  | 0.0918    |
+| 20.4304       | 29.0  | 5452 | 19.3918         | 0.0905 | 0.0102 | 0.0734 | 0.0812    |
+| 19.6689       | 30.0  | 5640 | 19.3527         | 0.088  | 0.0105 | 0.0669 | 0.0785    |
+| 20.661        | 31.0  | 5828 | 19.4042         | 0.0996 | 0.0149 | 0.0767 | 0.0887    |
+| 20.2962       | 32.0  | 6016 | 19.3871         | 0.0758 | 0.0101 | 0.0617 | 0.0702    |
+| 20.5865       | 33.0  | 6204 | 19.3255         | 0.0786 | 0.0106 | 0.064  | 0.0733    |
+| 21.4763       | 34.0  | 6392 | 19.3113         | 0.0755 | 0.0087 | 0.0623 | 0.0688    |
+| 21.3826       | 35.0  | 6580 | 19.3089         | 0.075  | 0.0076 | 0.0609 | 0.0689    |
+| 20.8869       | 36.0  | 6768 | 19.3614         | 0.0906 | 0.0143 | 0.0692 | 0.0812    |
+| 20.527        | 37.0  | 6956 | 19.3784         | 0.0874 | 0.0099 | 0.0686 | 0.0797    |
+| 19.5026       | 38.0  | 7144 | 19.4145         | 0.0888 | 0.0111 | 0.068  | 0.0823    |
+| 19.3852       | 39.0  | 7332 | 19.3794         | 0.0815 | 0.0093 | 0.0616 | 0.0742    |
+| 20.5347       | 40.0  | 7520 | 19.3074         | 0.0787 | 0.0088 | 0.0609 | 0.0733    |
+### Framework versions
+- PEFT 0.8.2
+- Transformers 4.37.0
+- Pytorch 2.1.2
+- Datasets 2.1.0
+- Tokenizers 0.15.1
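
Note on the figures in the README above: the card reports 188 steps per epoch at `train_batch_size: 8`, 40 epochs, and a linear scheduler starting from `learning_rate: 0.03`. The sketch below works through what those numbers imply. It assumes a single device, no gradient accumulation, no warmup, and that the final partial batch is kept; the card states none of these, and the helper names are illustrative only.

```python
# Values copied from the README diff above.
STEPS_PER_EPOCH = 188
BATCH_SIZE = 8
NUM_EPOCHS = 40
PEAK_LR = 0.03


def implied_train_examples(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Return the (min, max) dataset sizes that produce `steps_per_epoch` batches.

    Assumption: one device, no gradient accumulation, last partial batch kept.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


def linear_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Linear decay from `peak_lr` at step 0 to 0 at `total_steps`.

    Assumption: no warmup phase (the card does not list one).
    """
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / total_steps


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
low, high = implied_train_examples(STEPS_PER_EPOCH, BATCH_SIZE)

print(f"total optimizer steps: {total_steps}")
print(f"implied training examples per epoch: {low} to {high}")
print(f"learning rate at the midpoint: {linear_lr(total_steps // 2, total_steps, PEAK_LR)}")
```

The total step count agrees with the final row of the training table (step 7520 at epoch 40). If the assumptions hold, each epoch covered roughly 1,500 examples, far fewer than the full cnn_dailymail training split, which may be worth keeping in mind when reading the ROUGE scores.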

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6c01392b3366979b0489325f9c37b8b11fd9e68cc387c69568ae3daa6a867a7d
+oid sha256:65f037f4222d27de723890d6247d069fdb2d758eda50df29c2a3f7747c95d835
 size 61560
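
The change to `adapter_model.safetensors` is a change to its Git LFS pointer, a small text file of `key value` lines, not to the binary shown inline. As a minimal sketch (not the official git-lfs implementation; `parse_lfs_pointer` is a hypothetical helper), the two pointer versions from this commit can be compared like this:

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Parse a Git LFS pointer file into a key -> value mapping.

    Each non-empty line has the form "<key> <value>".
    """
    fields: dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the diff above (before and after the commit).
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6c01392b3366979b0489325f9c37b8b11fd9e68cc387c69568ae3daa6a867a7d
size 61560
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:65f037f4222d27de723890d6247d069fdb2d758eda50df29c2a3f7747c95d835
size 61560
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

# Keys whose values differ between the two pointer versions.
changed = sorted(key for key in old if old[key] != new.get(key))

print("changed fields:", changed)
print("adapter size in bytes:", int(new["size"]))
```

Only the `oid` hash differs; the size stays at 61560 bytes, which is consistent with the same adapter tensors being overwritten with newly trained values.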