metadata

license: apache-2.0
base_model: PRAli22/arat5-arabic-dialects-translation
tags:
  - generated_from_trainer
model-index:
  - name: t5-finetuned-ar-to-arsl_test
    results: []

t5-finetuned-ar-to-arsl_test

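This model is a fine-tuned version of PRAli22/arat5-arabic-dialects-translation. The training dataset was not recorded (the auto-generated card lists it as "None"). It achieves the following results on the evaluation set, taken from the final checkpoint (step 885):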

Loss: 0.4437
Bleu1: 0.9326
Bleu2: 0.8967
Bleu3: 0.7133
Bleu4: 0.5737

Model description

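More information needed. The model name suggests translation from Arabic (ar) to Arabic Sign Language (ArSL) text, but the card does not confirm this.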

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 15
mixed_precision_training: Native AMP
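
The batch-size and scheduler entries above are related by simple arithmetic. The following is a minimal sketch, assuming a single training device and no warmup steps (neither is stated in this card). `TOTAL_STEPS` is taken from the last row of the training results table, and `effective_batch_size` and `linear_lr` are hypothetical helpers written for illustration, not Transformers APIs.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 8
NUM_DEVICES = 1      # assumption: the card does not state the device count
WARMUP_STEPS = 0     # assumption: no warmup is listed in the card
TOTAL_STEPS = 885    # final optimizer step in the training results table


def effective_batch_size(per_device, accumulation, devices):
    """Number of examples consumed per optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


total_train_batch_size = effective_batch_size(
    TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
print(total_train_batch_size)      # 8 * 8 * 1
print(linear_lr(0))                # learning rate at the first step
print(linear_lr(TOTAL_STEPS))      # learning rate after the last step
```

Under these assumptions the product of the per-device batch size and the accumulation steps reproduces the reported total_train_batch_size of 64, and the learning rate decays from 0.0001 to zero over the run.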

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu1	Bleu2	Bleu3	Bleu4
No log	1.0	59	0.3274	0.9290	0.8943	0.7134	0.5740
No log	2.0	118	0.3396	0.9332	0.9022	0.7185	0.5775
No log	2.99	177	0.3654	0.9331	0.9001	0.7165	0.5754
No log	3.99	236	0.3809	0.9298	0.8951	0.7096	0.5690
No log	4.99	295	0.3918	0.9325	0.8984	0.7153	0.5747
No log	5.99	354	0.4003	0.9294	0.8926	0.7082	0.5691
No log	6.99	413	0.4018	0.9331	0.8984	0.7137	0.5738
No log	8.0	473	0.4154	0.9333	0.9007	0.7161	0.5776
0.0263	9.0	532	0.4394	0.9338	0.8985	0.7142	0.5745
0.0263	10.0	591	0.4421	0.9336	0.8994	0.7176	0.5781
0.0263	10.99	650	0.4417	0.9325	0.8971	0.7138	0.5753
0.0263	11.99	709	0.4526	0.9340	0.8992	0.7154	0.5740
0.0263	12.99	768	0.4487	0.9328	0.8971	0.7134	0.5734
0.0263	13.99	827	0.4483	0.9324	0.8970	0.7135	0.5740
0.0263	14.97	885	0.4437	0.9326	0.8967	0.7133	0.5737
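
The rows above can be scanned programmatically to compare checkpoints. The following is a minimal sketch that uses only the step, validation loss, and Bleu4 columns from the table; the `ROWS` list and variable names are illustrative and not part of the original training code.

```python
# (step, validation_loss, bleu4) copied from the training results table above.
ROWS = [
    (59, 0.3274, 0.5740),
    (118, 0.3396, 0.5775),
    (177, 0.3654, 0.5754),
    (236, 0.3809, 0.5690),
    (295, 0.3918, 0.5747),
    (354, 0.4003, 0.5691),
    (413, 0.4018, 0.5738),
    (473, 0.4154, 0.5776),
    (532, 0.4394, 0.5745),
    (591, 0.4421, 0.5781),
    (650, 0.4417, 0.5753),
    (709, 0.4526, 0.5740),
    (768, 0.4487, 0.5734),
    (827, 0.4483, 0.5740),
    (885, 0.4437, 0.5737),
]

# Checkpoint with the lowest validation loss.
best_loss = min(ROWS, key=lambda row: row[1])
# Checkpoint with the highest Bleu4 score.
best_bleu4 = max(ROWS, key=lambda row: row[2])
# Final checkpoint, which is the one reported at the top of this card.
final = ROWS[-1]

print("lowest validation loss:", best_loss)
print("highest Bleu4:", best_bleu4)
print("final checkpoint:", final)
```

On these numbers, validation loss is lowest after the first epoch (step 59) and trends upward afterward, while Bleu4 stays between 0.5690 and 0.5781. That pattern may indicate overfitting on the loss with little change in n-gram overlap, so an earlier checkpoint could be as useful as the final one.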

Framework versions

Transformers 4.39.3
Pytorch 2.1.2
Datasets 2.18.0
Tokenizers 0.15.2