# t5-mt-en-ca
This model is a fine-tuned version of t5-small for English-to-Catalan machine translation, trained on the opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.2444
- Bleu: 1.9924
- Gen Len: 17.2964
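A minimal inference sketch with the transformers library. The checkpoint id below is a placeholder (shown with the t5-small base so the snippet runs as-is; substitute the path or Hub id of this fine-tuned model), and the task prefix is an assumption that should match however the fine-tuning data was preprocessed:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder: replace with the local path or Hub id of this checkpoint.
checkpoint = "t5-small"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# T5 models are prompted with a task prefix; this exact prefix is an
# assumption about how the training pairs were formatted.
text = "translate English to Catalan: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With the base t5-small checkpoint the output will not be Catalan; the fine-tuned weights are what supply the en-ca behavior.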
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
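The `linear` scheduler listed above decays the learning rate from 2e-05 to zero over the full run (231 optimizer steps per epoch for 100 epochs, i.e. 23,100 steps). A plain-Python sketch of that schedule, assuming zero warmup steps (no warmup is listed among the hyperparameters):

```python
def linear_lr(step, base_lr=2e-05, total_steps=23_100, warmup_steps=0):
    """Linear decay with optional warmup, mirroring the `linear`
    lr_scheduler_type used by the Hugging Face Trainer."""
    if step < warmup_steps:
        # Ramp up linearly during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to zero at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of training:
print(linear_lr(0))        # 2e-05
print(linear_lr(11_550))   # 1e-05 (halfway)
print(linear_lr(23_100))   # 0.0
```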
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|:---|:---|:---|:---|:---|:---|
No log | 1.0 | 231 | 3.9148 | 0.1683 | 17.2649 |
No log | 2.0 | 462 | 3.6731 | 0.1568 | 17.6819 |
4.1865 | 3.0 | 693 | 3.5163 | 0.2006 | 17.7144 |
4.1865 | 4.0 | 924 | 3.3951 | 0.2983 | 17.5233 |
3.7413 | 5.0 | 1155 | 3.2961 | 0.3487 | 17.4517 |
3.7413 | 6.0 | 1386 | 3.2153 | 0.3698 | 17.4213 |
3.5136 | 7.0 | 1617 | 3.1464 | 0.4649 | 17.367 |
3.5136 | 8.0 | 1848 | 3.0885 | 0.528 | 17.3181 |
3.3438 | 9.0 | 2079 | 3.0353 | 0.5732 | 17.2638 |
3.3438 | 10.0 | 2310 | 2.9903 | 0.6168 | 17.24 |
3.226 | 11.0 | 2541 | 2.9470 | 0.6037 | 17.2476 |
3.226 | 12.0 | 2772 | 2.9100 | 0.6071 | 17.2856 |
3.1273 | 13.0 | 3003 | 2.8735 | 0.7135 | 17.2562 |
3.1273 | 14.0 | 3234 | 2.8400 | 0.7844 | 17.291 |
3.1273 | 15.0 | 3465 | 2.8125 | 0.7642 | 17.2649 |
3.0446 | 16.0 | 3696 | 2.7848 | 0.7874 | 17.2552 |
3.0446 | 17.0 | 3927 | 2.7594 | 0.7701 | 17.266 |
2.9717 | 18.0 | 4158 | 2.7335 | 0.8199 | 17.317 |
2.9717 | 19.0 | 4389 | 2.7096 | 0.8848 | 17.2812 |
2.9026 | 20.0 | 4620 | 2.6913 | 0.9185 | 17.2942 |
2.9026 | 21.0 | 4851 | 2.6728 | 0.9304 | 17.2997 |
2.8527 | 22.0 | 5082 | 2.6529 | 0.9424 | 17.2758 |
2.8527 | 23.0 | 5313 | 2.6350 | 0.9681 | 17.2801 |
2.8026 | 24.0 | 5544 | 2.6209 | 1.065 | 17.2856 |
2.8026 | 25.0 | 5775 | 2.6031 | 1.0636 | 17.2443 |
2.7559 | 26.0 | 6006 | 2.5882 | 1.0406 | 17.2476 |
2.7559 | 27.0 | 6237 | 2.5722 | 1.0967 | 17.241 |
2.7559 | 28.0 | 6468 | 2.5621 | 1.1424 | 17.2486 |
2.7094 | 29.0 | 6699 | 2.5472 | 1.1675 | 17.2226 |
2.7094 | 30.0 | 6930 | 2.5356 | 1.1882 | 17.2454 |
2.6703 | 31.0 | 7161 | 2.5226 | 1.1994 | 17.2747 |
2.6703 | 32.0 | 7392 | 2.5116 | 1.2601 | 17.266 |
2.6343 | 33.0 | 7623 | 2.5017 | 1.2126 | 17.2389 |
2.6343 | 34.0 | 7854 | 2.4905 | 1.2105 | 17.2432 |
2.6114 | 35.0 | 8085 | 2.4795 | 1.2356 | 17.2215 |
2.6114 | 36.0 | 8316 | 2.4713 | 1.2904 | 17.2497 |
2.5778 | 37.0 | 8547 | 2.4599 | 1.291 | 17.2193 |
2.5778 | 38.0 | 8778 | 2.4523 | 1.3017 | 17.2313 |
2.5475 | 39.0 | 9009 | 2.4413 | 1.3076 | 17.2389 |
2.5475 | 40.0 | 9240 | 2.4350 | 1.3536 | 17.2508 |
2.5475 | 41.0 | 9471 | 2.4277 | 1.3899 | 17.2182 |
2.5255 | 42.0 | 9702 | 2.4195 | 1.4112 | 17.2421 |
2.5255 | 43.0 | 9933 | 2.4117 | 1.4328 | 17.2562 |
2.4996 | 44.0 | 10164 | 2.4059 | 1.4373 | 17.2226 |
2.4996 | 45.0 | 10395 | 2.3974 | 1.4887 | 17.2204 |
2.4748 | 46.0 | 10626 | 2.3909 | 1.4829 | 17.2269 |
2.4748 | 47.0 | 10857 | 2.3863 | 1.5417 | 17.2682 |
2.4563 | 48.0 | 11088 | 2.3785 | 1.5502 | 17.2182 |
2.4563 | 49.0 | 11319 | 2.3717 | 1.609 | 17.2313 |
2.4363 | 50.0 | 11550 | 2.3661 | 1.576 | 17.2573 |
2.4363 | 51.0 | 11781 | 2.3628 | 1.61 | 17.2465 |
2.4182 | 52.0 | 12012 | 2.3568 | 1.6118 | 17.2476 |
2.4182 | 53.0 | 12243 | 2.3498 | 1.6268 | 17.2389 |
2.4182 | 54.0 | 12474 | 2.3430 | 1.5769 | 17.2519 |
2.4 | 55.0 | 12705 | 2.3404 | 1.6465 | 17.2432 |
2.4 | 56.0 | 12936 | 2.3363 | 1.6708 | 17.2508 |
2.3825 | 57.0 | 13167 | 2.3322 | 1.6851 | 17.2714 |
2.3825 | 58.0 | 13398 | 2.3273 | 1.6938 | 17.253 |
2.3689 | 59.0 | 13629 | 2.3229 | 1.729 | 17.2693 |
2.3689 | 60.0 | 13860 | 2.3187 | 1.7584 | 17.2519 |
2.3586 | 61.0 | 14091 | 2.3144 | 1.7604 | 17.2161 |
2.3586 | 62.0 | 14322 | 2.3101 | 1.7821 | 17.2204 |
2.3433 | 63.0 | 14553 | 2.3072 | 1.7585 | 17.2356 |
2.3433 | 64.0 | 14784 | 2.3027 | 1.7544 | 17.2269 |
2.3294 | 65.0 | 15015 | 2.3009 | 1.8058 | 17.2226 |
2.3294 | 66.0 | 15246 | 2.2964 | 1.7876 | 17.2182 |
2.3294 | 67.0 | 15477 | 2.2941 | 1.7765 | 17.2476 |
2.3129 | 68.0 | 15708 | 2.2898 | 1.747 | 17.2541 |
2.3129 | 69.0 | 15939 | 2.2878 | 1.7628 | 17.2486 |
2.3102 | 70.0 | 16170 | 2.2845 | 1.7721 | 17.2345 |
2.3102 | 71.0 | 16401 | 2.2829 | 1.803 | 17.2334 |
2.2949 | 72.0 | 16632 | 2.2786 | 1.7698 | 17.2161 |
2.2949 | 73.0 | 16863 | 2.2754 | 1.786 | 17.2302 |
2.2895 | 74.0 | 17094 | 2.2746 | 1.7973 | 17.2552 |
2.2895 | 75.0 | 17325 | 2.2710 | 1.7891 | 17.2747 |
2.2803 | 76.0 | 17556 | 2.2709 | 1.8304 | 17.2497 |
2.2803 | 77.0 | 17787 | 2.2682 | 1.822 | 17.2443 |
2.2697 | 78.0 | 18018 | 2.2653 | 1.819 | 17.2736 |
2.2697 | 79.0 | 18249 | 2.2634 | 1.8169 | 17.279 |
2.2697 | 80.0 | 18480 | 2.2619 | 1.8322 | 17.2747 |
2.2649 | 81.0 | 18711 | 2.2612 | 1.8546 | 17.2541 |
2.2649 | 82.0 | 18942 | 2.2582 | 1.868 | 17.2986 |
2.2582 | 83.0 | 19173 | 2.2575 | 1.9165 | 17.2856 |
2.2582 | 84.0 | 19404 | 2.2563 | 1.9389 | 17.2725 |
2.2556 | 85.0 | 19635 | 2.2543 | 1.9548 | 17.2834 |
2.2556 | 86.0 | 19866 | 2.2528 | 1.9543 | 17.2932 |
2.2516 | 87.0 | 20097 | 2.2512 | 1.9483 | 17.2856 |
2.2516 | 88.0 | 20328 | 2.2506 | 1.9439 | 17.2942 |
2.2475 | 89.0 | 20559 | 2.2499 | 1.9672 | 17.2801 |
2.2475 | 90.0 | 20790 | 2.2490 | 1.9569 | 17.2866 |
2.2373 | 91.0 | 21021 | 2.2479 | 1.9708 | 17.2671 |
2.2373 | 92.0 | 21252 | 2.2468 | 1.9655 | 17.2834 |
2.2373 | 93.0 | 21483 | 2.2461 | 1.9695 | 17.2845 |
2.2399 | 94.0 | 21714 | 2.2455 | 1.9703 | 17.2888 |
2.2399 | 95.0 | 21945 | 2.2453 | 1.9728 | 17.2877 |
2.2381 | 96.0 | 22176 | 2.2453 | 1.9734 | 17.2758 |
2.2381 | 97.0 | 22407 | 2.2447 | 1.9855 | 17.2921 |
2.237 | 98.0 | 22638 | 2.2444 | 1.9912 | 17.2975 |
2.237 | 99.0 | 22869 | 2.2445 | 1.9924 | 17.2964 |
2.2283 | 100.0 | 23100 | 2.2444 | 1.9924 | 17.2964 |
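The step counts in the table imply 231 optimizer steps per epoch; with a train batch size of 16 and no gradient accumulation, that pins the training split to somewhere between 3,681 and 3,696 examples (the exact size is not stated in this card). The arithmetic, as a quick check:

```python
import math

steps_per_epoch = 231
batch_size = 16

# ceil(n / 16) == 231 holds exactly for this range of dataset sizes.
smallest = (steps_per_epoch - 1) * batch_size + 1   # 3681
largest = steps_per_epoch * batch_size              # 3696
assert all(math.ceil(n / batch_size) == steps_per_epoch
           for n in range(smallest, largest + 1))

total_steps = steps_per_epoch * 100  # 100 epochs
print(total_steps)  # 23100, matching the final row of the table
```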
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3
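To reproduce this environment, the pinned versions above can be installed with pip; the `+cu118` PyTorch build assumes a CUDA 11.8 setup and uses the official PyTorch wheel index:

```shell
pip install transformers==4.28.1 datasets==2.12.0 tokenizers==0.13.3
pip install torch==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118
```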