gsarti/it5-efficient-small-el32-question-answering


	---
	license: mit
	tags:
	- generated_from_trainer
	datasets:
	- it5/datasets
	metrics:
	- rouge
	model-index:
	- name: it5-efficient-small-el32-qa-0.0003
	  results:
	  - task:
	      name: Summarization
	      type: summarization
	    dataset:
	      name: it5/datasets qa
	      type: it5/datasets
	      args: qa
	    metrics:
	    - name: Rouge1
	      type: rouge
	      value: 74.2234
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# it5-efficient-small-el32-qa-0.0003

	This model is a fine-tuned version of [stefan-it/it5-efficient-small-el32](https://huggingface.co/stefan-it/it5-efficient-small-el32) on the `qa` configuration of the `it5/datasets` dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.8225
	- Rouge1: 74.2234
	- Rouge2: 40.5909
	- Rougel: 74.1327
	- Rougelsum: 74.2081
	- Gen Len: 4.7055
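
The card does not document how to query the model, so the following is only a minimal sketch. It assumes the checkpoint is published under the repository id shown in the page title, and that the input is a passage followed by `Domanda:` and the question. That input layout is an assumption, not something this card states, and the helper names `build_input` and `answer` are hypothetical.

```python
# Assumption: repo id taken from the page title; not verified here.
MODEL_ID = "gsarti/it5-efficient-small-el32-question-answering"


def build_input(context: str, question: str) -> str:
    """Join a passage and a question into one source string.

    The "<context> Domanda: <question>" layout is an assumed format;
    this model card does not document the expected input.
    """
    return f"{context.strip()} Domanda: {question.strip()}"


def answer(context: str, question: str, max_length: int = 32) -> str:
    """Generate a short answer with the fine-tuned seq2seq checkpoint."""
    # Imported lazily so build_input stays usable without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(
        build_input(context, question), return_tensors="pt", truncation=True
    )
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# answer("Roma è la capitale d'Italia.", "Qual è la capitale d'Italia?")
```

The small `max_length` default reflects the reported average generation length of under five tokens; longer answers may need a larger value.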

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 7.0
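
As a rough illustration, the values above could be passed to `Seq2SeqTrainingArguments` along these lines. This is a sketch under assumptions: it treats the reported batch sizes as per-device values on a single device, and `output_dir` and `predict_with_generate` are hypothetical choices that the card does not report.

```python
# Values copied from the hyperparameter list; the mapping of
# train_batch_size/eval_batch_size to per-device sizes is an assumption.
HPARAMS = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7.0,
}


def make_training_args(output_dir: str = "it5-efficient-small-el32-qa"):
    """Build Seq2SeqTrainingArguments from the reported values.

    output_dir and predict_with_generate are hypothetical; they are not
    taken from the model card.
    """
    # Imported lazily so HPARAMS can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # needed to score generated text with ROUGE
        **HPARAMS,
    )
```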

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 1.1164        | 0.8   | 5000  | 0.8244          | 66.4678 | 35.3554 | 66.4543 | 66.4522   | 4.541   |
	| 0.9097        | 1.59  | 10000 | 0.7299          | 70.0574 | 37.5535 | 69.9512 | 70.0084   | 4.5548  |
	| 0.6637        | 2.39  | 15000 | 0.7314          | 72.0767 | 39.2263 | 72.0257 | 72.0473   | 4.703   |
	| 0.5015        | 3.19  | 20000 | 0.7147          | 73.0185 | 39.9998 | 72.9347 | 72.9576   | 4.75    |
	| 0.5101        | 3.99  | 25000 | 0.7055          | 73.7898 | 40.5481 | 73.7235 | 73.7901   | 4.8728  |
	| 0.3903        | 4.78  | 30000 | 0.7442          | 74.0845 | 39.9841 | 74.0172 | 74.0635   | 4.5938  |
	| 0.2993        | 5.58  | 35000 | 0.8184          | 73.8405 | 40.2569 | 73.7756 | 73.7972   | 4.7412  |
	| 0.2227        | 6.38  | 40000 | 0.8278          | 74.0159 | 40.6403 | 73.9412 | 73.9722   | 4.742   |
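
To make the trend in the table easier to read, here is a small sketch that finds the logged step with the best value for a few columns. The rows are copied from the table; the helper `best_step` is hypothetical and only inspects these eight logged evaluations, not the final checkpoint.

```python
# Rows copied from the training results table, in column order:
# (train_loss, epoch, step, val_loss, rouge1, rouge2, rougeL, rougeLsum, gen_len)
ROWS = [
    (1.1164, 0.80, 5000, 0.8244, 66.4678, 35.3554, 66.4543, 66.4522, 4.5410),
    (0.9097, 1.59, 10000, 0.7299, 70.0574, 37.5535, 69.9512, 70.0084, 4.5548),
    (0.6637, 2.39, 15000, 0.7314, 72.0767, 39.2263, 72.0257, 72.0473, 4.7030),
    (0.5015, 3.19, 20000, 0.7147, 73.0185, 39.9998, 72.9347, 72.9576, 4.7500),
    (0.5101, 3.99, 25000, 0.7055, 73.7898, 40.5481, 73.7235, 73.7901, 4.8728),
    (0.3903, 4.78, 30000, 0.7442, 74.0845, 39.9841, 74.0172, 74.0635, 4.5938),
    (0.2993, 5.58, 35000, 0.8184, 73.8405, 40.2569, 73.7756, 73.7972, 4.7412),
    (0.2227, 6.38, 40000, 0.8278, 74.0159, 40.6403, 73.9412, 73.9722, 4.7420),
]
STEP, VAL_LOSS, ROUGE1, ROUGE2 = 2, 3, 4, 5  # column indices into each row


def best_step(rows, column, higher_is_better=True):
    """Return the training step whose logged value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[STEP]


lowest_val_loss_step = best_step(ROWS, VAL_LOSS, higher_is_better=False)
best_rouge1_step = best_step(ROWS, ROUGE1)
best_rouge2_step = best_step(ROWS, ROUGE2)

# Rough steps per epoch, estimated from the last logged row (epochs are rounded).
steps_per_epoch = ROWS[-1][STEP] / ROWS[-1][1]
```

On these rows, validation loss is lowest at step 25000, Rouge1 peaks at step 30000, and Rouge2 peaks at step 40000. Training loss keeps falling while validation loss rises after step 25000, which may indicate overfitting in the later epochs even though the ROUGE scores stay roughly flat. The estimate of about 6,270 steps per epoch would correspond to roughly 50,000 training examples, assuming a batch size of 8 on a single device with no gradient accumulation.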


	### Framework versions

	- Transformers 4.15.0
	- PyTorch 1.10.0+cu102
	- Datasets 1.17.0
	- Tokenizers 0.10.3