long-t5-local-base-ARv1

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset (the auto-generated card records the dataset name as None). It achieves the following results on the evaluation set after the final training epoch:

Loss: 2.9303
Exact Match: 18.0
Gen Len: 3.38
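
The card does not state how Exact Match and Gen Len were computed. The following is a minimal sketch under two assumptions: Exact Match is the percentage of predictions that equal their reference after stripping surrounding whitespace, and Gen Len is the mean number of non-padding tokens per generated sequence. The function names and the padding id of 0 (the usual T5 convention) are illustrative, not taken from the training script.

```python
def exact_match(predictions, references):
    """Percentage of predictions identical to their reference (assumed definition)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


def mean_gen_len(token_id_sequences, pad_token_id=0):
    """Average number of non-padding tokens per generated sequence (assumed definition)."""
    lengths = [sum(1 for t in seq if t != pad_token_id) for seq in token_id_sequences]
    return sum(lengths) / len(lengths) if lengths else 0.0


# Toy data only, not drawn from the model's evaluation set.
print(exact_match(["yes", "no"], ["yes", "maybe"]))
print(mean_gen_len([[5, 6, 0], [7, 0, 0]]))
```

Every Exact Match value reported on this card is a multiple of 2 and every Gen Len value is a multiple of 0.02, which is consistent with an evaluation set of about 50 examples, though the card does not confirm the size.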

Model description

More information needed

Intended uses & limitations

More information needed
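
Until usage is documented, the checkpoint can presumably be loaded like any other LongT5 sequence-to-sequence model. The sketch below assumes the repository id shown on the model page (G-R-A-V-I-T-Y/long-t5-local-base-ARv1), that the repository ships tokenizer files, and that plain text is an acceptable input; the expected prompt format is not documented, and `generate_text` is a hypothetical helper name.

```python
MODEL_ID = "G-R-A-V-I-T-Y/long-t5-local-base-ARv1"  # repo id as shown on the model page


def generate_text(text, max_new_tokens=16):
    """Run one generation pass; the input format the model expects is an assumption."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The small default for `max_new_tokens` reflects the short average generation length reported on this card (about 3 to 4 tokens); raise it if the outputs look truncated.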

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60
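
To make the schedule concrete, here is a sketch of the linear decay these settings imply. It assumes no warmup steps (none are listed) and uses the 420 total optimizer steps that appear in the training results table. `linear_lr` is a hypothetical helper that mirrors the usual linear-with-warmup rule; it is not a library function.

```python
BASE_LR = 5e-05
TOTAL_STEPS = 420  # 60 epochs x 7 steps per epoch, per the results table
WARMUP_STEPS = 0   # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 210, 420):
    print(step, linear_lr(step))
```

Seven optimizer steps per epoch at a batch size of 8 implies between 49 and 56 training examples, assuming a single device and no gradient accumulation (neither is stated on the card).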

Training results

Training Loss	Epoch	Step	Validation Loss	Exact Match	Gen Len
No log	1.0	7	3.4004	14.0	3.86
2.7206	2.0	14	3.1925	8.0	3.66
2.6501	3.0	21	2.9867	8.0	3.7
2.6501	4.0	28	2.8576	12.0	4.58
1.9849	5.0	35	2.9078	12.0	4.52
2.0193	6.0	42	2.8173	8.0	3.84
2.0193	7.0	49	2.7735	16.0	3.42
1.6108	8.0	56	2.5993	12.0	3.82
1.8323	9.0	63	2.5879	12.0	3.92
1.4861	10.0	70	2.7203	16.0	3.4
1.4861	11.0	77	2.9902	24.0	3.1
1.425	12.0	84	2.7667	14.0	3.36
1.0387	13.0	91	2.6547	18.0	3.42
1.0387	14.0	98	2.7072	18.0	3.34
1.0793	15.0	105	2.8158	12.0	3.58
1.1969	16.0	112	2.9404	14.0	3.32
1.1969	17.0	119	2.8512	14.0	3.3
1.15	18.0	126	2.7513	18.0	3.68
1.2024	19.0	133	2.7124	16.0	3.48
1.3331	20.0	140	2.7484	16.0	3.4
1.3331	21.0	147	2.8289	18.0	3.44
1.1469	22.0	154	2.9873	14.0	3.36
1.5639	23.0	161	3.0321	18.0	3.4
1.5639	24.0	168	3.0117	14.0	3.3
0.8542	25.0	175	2.8331	16.0	3.34
0.9789	26.0	182	2.7876	20.0	3.36
0.9789	27.0	189	2.7820	20.0	3.36
0.8853	28.0	196	2.8082	18.0	3.38
0.9126	29.0	203	2.8316	16.0	3.36
1.0543	30.0	210	2.8449	18.0	3.64
1.0543	31.0	217	2.8034	8.0	3.62
1.0683	32.0	224	2.8115	14.0	3.46
0.951	33.0	231	2.9019	18.0	3.34
0.951	34.0	238	3.0115	18.0	3.24
0.8315	35.0	245	3.0392	18.0	3.24
1.1548	36.0	252	3.0643	18.0	3.36
1.1548	37.0	259	3.0031	16.0	3.42
0.7813	38.0	266	2.9801	18.0	3.48
0.671	39.0	273	2.9622	18.0	3.48
1.1771	40.0	280	2.9049	18.0	3.46
1.1771	41.0	287	2.9042	20.0	3.56
0.5959	42.0	294	2.9598	18.0	3.48
1.1583	43.0	301	2.9936	18.0	3.44
1.1583	44.0	308	3.0072	18.0	3.44
0.5728	45.0	315	3.0003	18.0	3.44
0.7237	46.0	322	3.0093	16.0	3.4
0.7237	47.0	329	2.9688	18.0	3.42
0.7295	48.0	336	2.9533	18.0	3.38
0.5627	49.0	343	2.9357	18.0	3.36
0.6489	50.0	350	2.9317	18.0	3.4
0.6489	51.0	357	2.9339	18.0	3.4
1.0427	52.0	364	2.9256	18.0	3.4
0.9156	53.0	371	2.9220	18.0	3.4
0.9156	54.0	378	2.9091	18.0	3.38
0.4748	55.0	385	2.9036	18.0	3.36
0.5616	56.0	392	2.8998	18.0	3.36
0.5616	57.0	399	2.9128	18.0	3.36
0.4836	58.0	406	2.9205	18.0	3.36
0.6498	59.0	413	2.9282	18.0	3.36
0.615	60.0	420	2.9303	18.0	3.38
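
The final epoch is not the best one by either metric. The sketch below scans a handful of rows copied from the table above (epoch, validation loss, exact match), including the table's lowest validation loss and highest exact match, to locate those epochs. Choosing a checkpoint this way is an illustration only; the card does not say which selection rule, if any, was applied.

```python
# (epoch, validation_loss, exact_match) rows copied from the results table.
rows = [
    (1, 3.4004, 14.0),
    (9, 2.5879, 12.0),
    (11, 2.9902, 24.0),
    (26, 2.7876, 20.0),
    (60, 2.9303, 18.0),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_em = max(rows, key=lambda r: r[2])    # highest exact match

print("lowest validation loss: epoch", best_by_loss[0], "loss", best_by_loss[1])
print("highest exact match: epoch", best_by_em[0], "exact match", best_by_em[2])
```

Validation loss bottoms out at epoch 9 while training loss keeps falling through epoch 60, a pattern that usually indicates overfitting on a small dataset.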

Framework versions

Transformers 4.41.0
PyTorch 2.2.1
Datasets 2.19.1
Tokenizers 0.19.1
