jerome1519
/

flan-t5-large-finetuned-coding_instructions_2023_08_18__12_06

Text2Text Generation



Jérôme Bau

update model card README.md

ebd5cf8 over 1 year ago


	---
	license: apache-2.0
	base_model: google/flan-t5-large
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: flan-t5-large-finetuned-coding_instructions_2023_08_18__12_06
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# flan-t5-large-finetuned-coding_instructions_2023_08_18__12_06

	This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
	It achieves the following results on the evaluation set:
	- Loss: 0.6230
	- Rouge1: 47.0864
	- Rouge2: 31.2968
	- Rougel: 45.9675
	- Rougelsum: 46.0612
	- Gen Len: 19.0
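
The ROUGE values above appear to be on a 0 to 100 scale. As a rough illustration of what Rouge1 measures, the sketch below computes a unigram-overlap F-measure between a prediction and a reference. It is a simplification that assumes lowercase whitespace tokenization and no stemming. The name `rouge1_f` is hypothetical, and this is not the implementation behind the numbers reported here.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Unigram-overlap F-measure on a 0-100 scale (simplified ROUGE-1)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Four of five unigrams match in each direction.
    print(rouge1_f("sort the list in place", "sort a list in place"))
```

Rouge2 applies the same idea to bigrams, and RougeL is based on the longest common subsequence rather than n-gram counts.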

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
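
The card does not document usage. As a minimal sketch, assuming the checkpoint loads like any other FLAN-T5 sequence-to-sequence model and is published under the repository id shown in the page header, inference could look like the following. The helper name `generate` and the `max_new_tokens` default are assumptions, and the expected prompt format is unknown.

```python
# Repository id taken from the page header (owner / model name).
MODEL_ID = "jerome1519/flan-t5-large-finetuned-coding_instructions_2023_08_18__12_06"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return one generated string for `prompt`."""
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    # max_new_tokens is an assumed value; the card does not state generation settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("Write a Python function that reverses a string.")` uses a hypothetical prompt. The constant Gen Len of 19.0 in the evaluation results may indicate that evaluation outputs were truncated by a short default generation length, so setting `max_new_tokens` explicitly is probably worthwhile.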

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
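
To make `lr_scheduler_type: linear` concrete, the sketch below computes the learning rate that a linear-decay schedule would give at a chosen optimizer step. It assumes no warmup (the card lists none) and 100 total steps, which matches the final step in the training results. `linear_lr` is a hypothetical helper written for illustration, not a Transformers function.

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 100,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates under linear warmup then decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup (unused when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 10, 50, 100):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by the end of epoch 5, and reaches zero at step 100.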

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| No log | 1.0 | 10 | 0.9891 | 18.047 | 9.6197 | 18.1466 | 18.2622 | 16.9538 |
	| No log | 2.0 | 20 | 0.7803 | 21.724 | 12.8839 | 21.4666 | 21.6773 | 17.7385 |
	| No log | 3.0 | 30 | 0.6827 | 42.1883 | 27.0064 | 41.5285 | 41.6611 | 18.9077 |
	| No log | 4.0 | 40 | 0.6526 | 44.8257 | 28.8931 | 43.8323 | 43.7858 | 18.9846 |
	| No log | 5.0 | 50 | 0.6407 | 44.6781 | 29.5477 | 43.9053 | 43.8475 | 19.0 |
	| No log | 6.0 | 60 | 0.6334 | 46.039 | 31.3315 | 45.3508 | 45.3701 | 19.0 |
	| No log | 7.0 | 70 | 0.6281 | 46.8592 | 31.2186 | 46.1283 | 46.1169 | 19.0 |
	| No log | 8.0 | 80 | 0.6250 | 46.5201 | 30.8844 | 45.5541 | 45.6876 | 19.0 |
	| No log | 9.0 | 90 | 0.6236 | 47.074 | 31.2968 | 46.1336 | 46.258 | 19.0 |
	| No log | 10.0 | 100 | 0.6230 | 47.0864 | 31.2968 | 45.9675 | 46.0612 | 19.0 |
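
The Step column advances by 10 per epoch. With `train_batch_size: 8`, and assuming a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped, this implies a training set of roughly 73 to 80 examples. The arithmetic can be checked with a short sketch; `train_size_bounds` is a hypothetical helper.

```python
import math


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest n such that ceil(n / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    # Sanity-check both ends of the range against the ceiling formula.
    assert math.ceil(low / batch_size) == steps_per_epoch
    assert math.ceil(high / batch_size) == steps_per_epoch
    return low, high


low, high = train_size_bounds(steps_per_epoch=10, batch_size=8)
print(low, high)  # prints: 73 80
```

If gradient accumulation or multiple devices were used, the effective batch size would be larger and these bounds would scale accordingly.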


	### Framework versions

	- Transformers 4.31.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.4
	- Tokenizers 0.13.3
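
To compare a local environment against the versions listed above, a small check along the following lines may help. It is a sketch that assumes the usual PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`); `check_versions` is a hypothetical helper, and an exact match is not necessarily required to load the model.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on the card; keys are the assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def check_versions(expected: dict[str, str] = EXPECTED) -> dict[str, str]:
    """Return a status per package: 'ok', 'not installed', or a mismatch note."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if found == wanted else f"found {found}, card lists {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_versions().items():
        print(f"{name}: {status}")
```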