cartesinus/iva_mt_wslot-m2m100_418M-en-es-plaintext_10e

iva_mt_wslot-m2m100_418M-en-es-plaintext_10e

This model is a fine-tuned version of facebook/m2m100_418M on the iva_mt_wslot dataset. After the final (10th) training epoch it achieves the following results on the evaluation set:

Loss: 0.0116
Bleu: 51.1501
Gen Len: 12.6861

Model description

More information needed

Intended uses & limitations

More information needed
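
The card does not include a usage snippet. The following is a minimal sketch of how a fine-tuned M2M100 checkpoint is typically loaded with the Transformers library, assuming the repository ships tokenizer files alongside the weights and that the sentencepiece package is installed. The helper name translate and the default of 64 new tokens are illustrative choices, not taken from the card; the language codes "en" and "es" are inferred from the "en-es" part of the model name.

```python
from typing import List

# Repository id as shown at the top of the card.
MODEL_ID = "cartesinus/iva_mt_wslot-m2m100_418M-en-es-plaintext_10e"
# Assumption: source/target codes inferred from "en-es" in the model name.
SRC_LANG = "en"
TGT_LANG = "es"


def translate(sentences: List[str], max_new_tokens: int = 64) -> List[str]:
    """Translate English sentences to Spanish (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(MODEL_ID)
    model = M2M100ForConditionalGeneration.from_pretrained(MODEL_ID)

    # M2M100 expects the source language to be set on the tokenizer ...
    tokenizer.src_lang = SRC_LANG
    batch = tokenizer(sentences, return_tensors="pt", padding=True)

    # ... and the target language to be forced as the first generated token.
    generated = model.generate(
        **batch,
        forced_bos_token_id=tokenizer.get_lang_id(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling translate(["set an alarm for seven in the morning"]) downloads the checkpoint on first use and returns a list with one Spanish string. The card does not document the expected input format, so whether slot annotations from iva_mt_wslot should appear in the input remains an open question.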

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
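
As a rough illustration, the values listed above can be mapped onto Transformers training arguments as sketched below. This is not the author's training script: the mapping of train_batch_size to per_device_train_batch_size assumes a single device without gradient accumulation, the output directory name is hypothetical, and zero warmup steps is an assumption because the card lists no warmup setting. The total of 21040 optimizer steps is taken from the last row of the training results table.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments names.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,  # assumes train_batch_size was per device
    "per_device_eval_batch_size": 4,   # assumes eval_batch_size was per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # "Native AMP"; requires a CUDA device at run time
}


def build_training_args(output_dir: str = "m2m100-en-es-finetune"):
    """Build Seq2SeqTrainingArguments from the card's values (hypothetical helper)."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


def linear_lr(step: int, total_steps: int = 21040, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate under a linear schedule: optional warmup, then decay to zero."""
    if step < warmup_steps:
        factor = step / max(1, warmup_steps)
    else:
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * factor
```

Under these assumptions the learning rate starts at 2e-05, is halved by step 10520 (the end of epoch 5), and reaches zero at step 21040.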

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
0.012	1.0	2104	0.0109	47.9124	12.7523
0.0079	2.0	4208	0.0101	49.9897	12.6763
0.0059	3.0	6312	0.0101	50.5286	12.6435
0.0045	4.0	8416	0.0101	49.6821	12.5472
0.0033	5.0	10520	0.0104	50.3856	12.6638
0.0024	6.0	12624	0.0107	50.359	12.7418
0.0019	7.0	14728	0.0111	50.8234	12.709
0.0014	8.0	16832	0.0111	50.872	12.6671
0.0011	9.0	18936	0.0114	51.3014	12.6291
0.001	10.0	21040	0.0116	51.1501	12.6861
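
The headline results correspond to the last row of this table rather than to the best one. A short sketch that reads the table makes this visible and also derives a rough training-set size; the size estimate assumes a single device, no gradient accumulation, and a final partial batch that is not dropped, none of which the card states.

```python
# (training_loss, epoch, step, validation_loss, bleu, gen_len), copied from the table.
ROWS = [
    (0.012, 1, 2104, 0.0109, 47.9124, 12.7523),
    (0.0079, 2, 4208, 0.0101, 49.9897, 12.6763),
    (0.0059, 3, 6312, 0.0101, 50.5286, 12.6435),
    (0.0045, 4, 8416, 0.0101, 49.6821, 12.5472),
    (0.0033, 5, 10520, 0.0104, 50.3856, 12.6638),
    (0.0024, 6, 12624, 0.0107, 50.359, 12.7418),
    (0.0019, 7, 14728, 0.0111, 50.8234, 12.709),
    (0.0014, 8, 16832, 0.0111, 50.872, 12.6671),
    (0.0011, 9, 18936, 0.0114, 51.3014, 12.6291),
    (0.001, 10, 21040, 0.0116, 51.1501, 12.6861),
]
TRAIN_BATCH_SIZE = 4  # train_batch_size from the hyperparameter list

# Optimizer steps per epoch, from the first row.
steps_per_epoch = ROWS[0][2] // ROWS[0][1]

# Upper bound on the number of training examples under the stated assumptions.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Epoch with the highest BLEU and epochs sharing the lowest validation loss.
best_bleu_row = max(ROWS, key=lambda r: r[4])
min_val_loss = min(r[3] for r in ROWS)
lowest_loss_epochs = [r[1] for r in ROWS if r[3] == min_val_loss]

print("steps per epoch:", steps_per_epoch)                  # 2104
print("approx. training examples:", approx_train_examples)  # 8416
print("best BLEU:", best_bleu_row[4], "at epoch", best_bleu_row[1])  # 51.3014 at epoch 9
print("lowest validation loss:", min_val_loss, "at epochs", lowest_loss_epochs)  # 0.0101 at [2, 3, 4]
```

BLEU peaks at epoch 9 (51.3014), while validation loss is lowest at epochs 2 to 4 and rises slowly afterwards as training loss keeps falling, which may indicate mild overfitting in the later epochs.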

Framework versions

Transformers 4.28.1
PyTorch 2.0.1+cu118
Datasets 2.14.4
Tokenizers 0.13.3

Evaluation results

Bleu on iva_mt_wslot (validation set, self-reported): 51.150