bart-base-en-to-de

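This model is a fine-tuned version of ahazeemi/bart-base-finetuned-en-to-de. The training dataset is not specified (the card records it as None). It achieves the following results on the evaluation set, taken from the final logged checkpoint (step 140000):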

  • Loss: 0.9665
  • Bleu: 4.7851
  • Gen Len: 19.453

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
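
The `linear` scheduler type listed above decays the learning rate from its initial value toward zero over the course of training. The sketch below shows that shape in plain Python, mirroring the formula used by `transformers.get_linear_schedule_with_warmup`. The card does not state a warmup length or the exact total number of optimizer steps, so `warmup_steps=0` and `ASSUMED_TOTAL_STEPS` are assumptions made only for illustration.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total step count: the card only shows step 140000 at epoch 0.99,
# so the true value is not known and this number is an assumption.
ASSUMED_TOTAL_STEPS = 141_000

if __name__ == "__main__":
    for step in (0, 35_000, 70_000, 140_000):
        print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the rate at the last logged step is already close to zero, which is the expected behavior of a linear schedule near the end of a single epoch.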

Training results

| Training Loss | Epoch | Step   | Validation Loss | Bleu   | Gen Len |
|---------------|-------|--------|-----------------|--------|---------|
| 1.319         | 0.04  | 5000   | 1.1247          | 4.4467 | 19.447  |
| 1.295         | 0.07  | 10000  | 1.1012          | 4.4235 | 19.458  |
| 1.2901        | 0.11  | 15000  | 1.0923          | 4.4386 | 19.4423 |
| 1.2678        | 0.14  | 20000  | 1.0803          | 4.5259 | 19.4557 |
| 1.267         | 0.18  | 25000  | 1.0724          | 4.5534 | 19.4653 |
| 1.2444        | 0.21  | 30000  | 1.0591          | 4.4944 | 19.4623 |
| 1.2365        | 0.25  | 35000  | 1.0509          | 4.5736 | 19.446  |
| 1.2137        | 0.28  | 40000  | 1.0400          | 4.5346 | 19.4553 |
| 1.214         | 0.32  | 45000  | 1.0340          | 4.5733 | 19.4543 |
| 1.218         | 0.35  | 50000  | 1.0283          | 4.6076 | 19.4693 |
| 1.2118        | 0.39  | 55000  | 1.0225          | 4.6192 | 19.454  |
| 1.1948        | 0.43  | 60000  | 1.0152          | 4.6082 | 19.4553 |
| 1.1932        | 0.46  | 65000  | 1.0128          | 4.665  | 19.449  |
| 1.1889        | 0.5   | 70000  | 1.0028          | 4.6929 | 19.4493 |
| 1.2154        | 0.53  | 75000  | 1.0004          | 4.7151 | 19.4477 |
| 1.194         | 0.57  | 80000  | 0.9950          | 4.6655 | 19.467  |
| 1.1847        | 0.6   | 85000  | 0.9966          | 4.708  | 19.451  |
| 1.1848        | 0.64  | 90000  | 0.9897          | 4.7794 | 19.458  |
| 1.1762        | 0.67  | 95000  | 0.9866          | 4.7204 | 19.4523 |
| 1.1818        | 0.71  | 100000 | 0.9803          | 4.7137 | 19.458  |
| 1.1613        | 0.75  | 105000 | 0.9788          | 4.7652 | 19.4573 |
| 1.1738        | 0.78  | 110000 | 0.9775          | 4.8088 | 19.453  |
| 1.1569        | 0.82  | 115000 | 0.9752          | 4.7522 | 19.4577 |
| 1.1631        | 0.85  | 120000 | 0.9713          | 4.7301 | 19.4513 |
| 1.1517        | 0.89  | 125000 | 0.9690          | 4.7935 | 19.456  |
| 1.1577        | 0.92  | 130000 | 0.9686          | 4.791  | 19.4543 |
| 1.1607        | 0.96  | 135000 | 0.9676          | 4.7529 | 19.4533 |
| 1.153         | 0.99  | 140000 | 0.9665          | 4.7851 | 19.453  |
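
The logged checkpoints do not all rank the same way under validation loss and BLEU. As a small sketch of how one might compare them, the snippet below takes an excerpt of the rows above (steps 100000 to 140000, copied from the table) and picks the best checkpoint under each metric. The tuple layout and variable names are illustrative choices, not part of the original training code.

```python
# Excerpt of the training log above: (step, validation_loss, bleu).
rows = [
    (100000, 0.9803, 4.7137),
    (105000, 0.9788, 4.7652),
    (110000, 0.9775, 4.8088),
    (115000, 0.9752, 4.7522),
    (120000, 0.9713, 4.7301),
    (125000, 0.9690, 4.7935),
    (130000, 0.9686, 4.7910),
    (135000, 0.9676, 4.7529),
    (140000, 0.9665, 4.7851),
]

# Highest BLEU and lowest validation loss can point at different checkpoints.
best_bleu = max(rows, key=lambda r: r[2])
best_loss = min(rows, key=lambda r: r[1])

print("best BLEU:", best_bleu)
print("lowest validation loss:", best_loss)
```

In this excerpt the highest BLEU (4.8088) occurs at step 110000, while the lowest validation loss (0.9665) occurs at the final logged step, 140000, which is the checkpoint reported at the top of the card.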

Framework versions

  • Transformers 4.22.2
  • PyTorch 1.12.0+cu116
  • Datasets 2.5.1
  • Tokenizers 0.12.1