
oop-de-qg-flan-t5-base-v3

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how such metrics can be computed follows the list):

  • Loss: nan
  • Rouge1: 8.0858
  • Rouge2: 3.0935
  • RougeL: 7.2494
  • RougeLsum: 7.3009
  • Gen Len: 58.0151
  • Bleu: 0.0107
  • Precisions: [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211]
  • Brevity Penalty: 1.0
  • Length Ratio: 4.2235
  • Translation Length: 15323
  • Reference Length: 3628
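
The metric names and fields above match the output format of the Hugging Face `evaluate` library (ROUGE scores, plus BLEU with per-n-gram precisions, brevity penalty, and length statistics). A minimal sketch of reproducing such numbers; the `preds` and `refs` lists are placeholders, not the actual evaluation data, which is not published:

```python
import evaluate

# Placeholder German QG examples; the real evaluation set is unknown.
preds = ["Was ist Vererbung in der objektorientierten Programmierung?"]
refs = ["Was versteht man unter Vererbung?"]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

# Returns rouge1, rouge2, rougeL, rougeLsum -- the ROUGE fields above.
print(rouge.compute(predictions=preds, references=refs))

# Returns bleu, precisions (1- to 4-gram), brevity_penalty, length_ratio,
# translation_length, reference_length -- the remaining fields above.
print(bleu.compute(predictions=preds, references=[[r] for r in refs]))
```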

Model description

More information needed

Intended uses & limitations

More information needed
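
In the absence of author-provided documentation, a minimal inference sketch, assuming the model is a FLAN-T5-style seq2seq checkpoint for German question generation (as the name `oop-de-qg` suggests). The model id must be replaced with the full Hub path including the owner namespace, and the input text is a hypothetical example, since the prompt format used in training is unknown:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Replace with the full Hub id, e.g. "<owner>/oop-de-qg-flan-t5-base-v3".
model_id = "oop-de-qg-flan-t5-base-v3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical OOP passage to generate a question from.
text = ("Vererbung erlaubt es einer Klasse, Attribute und Methoden "
        "einer anderen Klasse zu übernehmen.")
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the reported validation loss is NaN, so outputs from this checkpoint may be degenerate.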

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
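
For reference, a sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; the model, tokenizer, and dataset wiring are omitted, and `output_dir` is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="oop-de-qg-flan-t5-base-v3",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,  # "Native AMP" mixed-precision training
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer
    # default optimizer configuration in this Transformers version.
)
```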

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 291 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 2.0 | 582 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 3.0 | 873 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 4.0 | 1164 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 5.0 | 1455 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 6.0 | 1746 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 7.0 | 2037 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 8.0 | 2328 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 9.0 | 2619 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 10.0 | 2910 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1