Zekunli/flan-t5-large-extraction-cnndm_20000-all


	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: flan-t5-large-extraction-cnndm_20000-all
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# flan-t5-large-extraction-cnndm_20000-all

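	This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large). The Trainer did not record the training dataset (the generated card lists it as `None`); the model name suggests a CNN/DailyMail subset, but the card does not confirm this.
	It achieves the following results on the evaluation set. These values match the step-2400 row (epoch 0.96) of the training results table, which has the lowest validation loss: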
	- Loss: 1.6652
	- Rouge1: 35.487
	- Rouge2: 15.6713
	- Rougel: 29.9519
	- Rougelsum: 29.9368
	- Gen Len: 19.0
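The card does not document how to run the model, so the following is only a minimal sketch of loading it as a standard seq2seq checkpoint with the `transformers` Auto classes. The repository id is taken from this page; the helper name `generate_text`, the absence of any task prompt, and the `max_length=20` cap (guessed from the reported Gen Len of about 19) are assumptions, not documented behavior.

```python
# Repository id as shown on the model page.
MODEL_ID = "Zekunli/flan-t5-large-extraction-cnndm_20000-all"


def generate_text(text, model_id=MODEL_ID, max_length=20):
    """Generate output for one input string (hypothetical helper).

    The expected input format is not documented in the card, so the raw
    text is passed through unchanged. max_length=20 is an assumption based
    on the reported Gen Len of roughly 19 tokens.
    """
    # Imported lazily so that defining the helper does not require
    # transformers or a model download.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long articles to the tokenizer's maximum input length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("some news article text")` downloads the checkpoint on first use. Raising `max_length` may be needed if the outputs look truncated.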

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 24
	- seed: 1799
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
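For readers who want to reproduce a similar run, the hyperparameters listed above could be mapped onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's training script: it assumes the listed batch sizes are per-device values, the output directory name is hypothetical, and `predict_with_generate=True` is assumed because ROUGE has to be computed on generated text.

```python
# Hyperparameters copied from the list above, keyed by the corresponding
# TrainingArguments field names.
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,   # assumes a single device
    "per_device_eval_batch_size": 24,   # assumes a single device
    "seed": 1799,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def build_training_args(output_dir="flan-t5-large-extraction"):
    """Build training arguments from HPARAMS (output_dir is hypothetical)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed for ROUGE evaluation
        **HPARAMS,
    )
```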

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 2.1295        | 0.08  | 200  | 1.8266          | 34.0465 | 14.7511 | 29.3395 | 29.3437   | 19.0    |
	| 1.9354        | 0.16  | 400  | 1.7732          | 34.7923 | 15.3094 | 29.8484 | 29.8757   | 18.99   |
	| 1.854         | 0.24  | 600  | 1.7367          | 34.8358 | 15.1969 | 29.9971 | 30.0064   | 18.986  |
	| 1.833         | 0.32  | 800  | 1.7120          | 34.7854 | 15.5144 | 29.8141 | 29.7863   | 18.982  |
	| 1.8217        | 0.4   | 1000 | 1.7256          | 34.7274 | 15.2763 | 30.0298 | 30.0871   | 19.0    |
	| 1.8309        | 0.48  | 1200 | 1.7089          | 35.4328 | 15.7724 | 30.0655 | 30.0199   | 19.0    |
	| 1.825         | 0.56  | 1400 | 1.6947          | 35.4116 | 15.6911 | 30.1438 | 30.1764   | 19.0    |
	| 1.7914        | 0.64  | 1600 | 1.7119          | 35.5918 | 16.3762 | 30.3234 | 30.2807   | 19.0    |
	| 1.7889        | 0.72  | 1800 | 1.6810          | 35.6413 | 15.8936 | 30.2848 | 30.2291   | 19.0    |
	| 1.7576        | 0.8   | 2000 | 1.6826          | 35.9424 | 15.6803 | 30.5998 | 30.5571   | 19.0    |
	| 1.7763        | 0.88  | 2200 | 1.6748          | 35.7543 | 15.984  | 30.7197 | 30.721    | 18.998  |
	| 1.7604        | 0.96  | 2400 | 1.6652          | 35.487  | 15.6713 | 29.9519 | 29.9368   | 19.0    |
	| 1.7138        | 1.04  | 2600 | 1.6860          | 36.0333 | 16.4065 | 30.7249 | 30.7168   | 19.0    |
	| 1.6951        | 1.12  | 2800 | 1.6792          | 35.3149 | 15.7178 | 30.1555 | 30.1517   | 18.998  |
	| 1.6752        | 1.2   | 3000 | 1.6832          | 34.7566 | 15.4179 | 29.7687 | 29.8259   | 19.0    |
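The table can be read programmatically to see which row was reported at the top of the card and roughly how large the training set was. The sketch below copies the step, epoch, validation loss, and Rouge1 columns from the table. The training-set estimate assumes a single device and no gradient accumulation, neither of which the card states, and the epoch column is rounded, so the result is approximate.

```python
# (step, epoch, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (200, 0.08, 1.8266, 34.0465),
    (400, 0.16, 1.7732, 34.7923),
    (600, 0.24, 1.7367, 34.8358),
    (800, 0.32, 1.7120, 34.7854),
    (1000, 0.40, 1.7256, 34.7274),
    (1200, 0.48, 1.7089, 35.4328),
    (1400, 0.56, 1.6947, 35.4116),
    (1600, 0.64, 1.7119, 35.5918),
    (1800, 0.72, 1.6810, 35.6413),
    (2000, 0.80, 1.6826, 35.9424),
    (2200, 0.88, 1.6748, 35.7543),
    (2400, 0.96, 1.6652, 35.4870),
    (2600, 1.04, 1.6860, 36.0333),
    (2800, 1.12, 1.6792, 35.3149),
    (3000, 1.20, 1.6832, 34.7566),
]
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list

# Row with the lowest validation loss and row with the highest Rouge1.
best_loss_row = min(ROWS, key=lambda row: row[2])
best_rouge1_row = max(ROWS, key=lambda row: row[3])

# Optimizer steps per epoch, estimated from the last logged row.
last_step, last_epoch = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = round(last_step / last_epoch)

# Approximate number of training examples (single device, no accumulation).
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print("lowest validation loss at step", best_loss_row[0])
print("highest Rouge1 at step", best_rouge1_row[0])
print("steps per epoch:", steps_per_epoch)
print("approximate training examples:", approx_train_examples)
```

Under these assumptions the estimate comes out to 20,000 examples, consistent with the `20000` in the model name. The lowest validation loss (step 2400) and the highest Rouge1 (step 2600) fall on different rows, so which checkpoint counts as "best" depends on the metric chosen.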


	### Framework versions

	- Transformers 4.18.0
	- PyTorch 1.10.0+cu111
	- Datasets 2.5.1
	- Tokenizers 0.12.1
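To check whether a local environment matches the versions listed above, a small comparison along these lines may help. It is a sketch: the helper name `compare_versions` is hypothetical, the PyPI distribution names (`torch` for PyTorch in particular) are assumptions, and newer releases may well work even though only the listed versions were used for training.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.18.0",
    "torch": "1.10.0+cu111",
    "datasets": "2.5.1",
    "tokenizers": "0.12.1",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, matches) in compare_versions().items():
    status = "match" if matches else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```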